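Penerapan Metode Convolution Neural Network (CNN) Pada Aplikasi Automatic Lip Reading [Application of the Convolutional Neural Network (CNN) Method in an Automatic Lip-Reading Application]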
Authors
Nimatul Mamuriyah, Jason Sumantri
DOI:
10.31289/jite.v6i1.7523
Published:
2022-07-23
Issue:
Vol. 6 No. 1 (2022): Issues July 2022
Keywords:
Lip-Reading, Image Processing, Convolutional Neural Network, Machine Learning
Abstract
An automatic lip-reading prototype is a device needed by people with hearing disabilities, including the Deaf. The prototype is intended to help people with hearing impairments move about independently without depending on others. Advances in nanotechnology have driven the development of computer vision and software that make automatic lip-reading prototypes possible. The image processing applied in this prototype focuses on the movement of the lips, the tongue, and the area around a person's mouth. The image recognition results are then compared with a provided database to produce specific words. The prototype design consists of a camera; Python with TensorFlow as the image processing programming environment, with the Convolutional Neural Network (CNN) method used for image recognition; and machine learning technology used for processing and decision-making. In this study, numbers and letters of the alphabet were used as test inputs for the automatic lip-reading system's predictions. Using the CNN and machine learning methods, the test results show that, in general, the designed system predicts numbers and letters with low accuracy, below 35%.
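To make the CNN step in the abstract more concrete, the following is a minimal sketch of the three operations a convolutional layer typically chains together: convolution, ReLU activation, and max pooling. It is written in dependency-free Python for illustration only and is not the authors' TensorFlow model. The names `conv2d`, `relu`, `max_pool2x2`, `mouth_crop`, and `edge_kernel` are hypothetical, and the 6x6 input is invented toy data standing in for a grayscale crop of the mouth region.

```python
# Illustrative sketch only: NOT the authors' TensorFlow implementation.
# All names and the toy input below are assumptions made for this example.

def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Keep the strongest response in each non-overlapping 2x2 window."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Hypothetical 6x6 grayscale crop: dark upper half, bright lower half,
# i.e. a single horizontal edge such as the boundary between two lips.
mouth_crop = [
    [0] * 6, [0] * 6, [0] * 6,
    [9] * 6, [9] * 6, [9] * 6,
]

# Hand-written horizontal edge detector. In a trained CNN these weights
# would be learned from data rather than chosen by hand.
edge_kernel = [
    [-1, -1, -1],
    [0, 0, 0],
    [1, 1, 1],
]

response = conv2d(mouth_crop, edge_kernel)   # 4x4 feature map
features = max_pool2x2(relu(response))       # 2x2 pooled feature map
print(features)
```

In a full system, many such learned kernels would be stacked in layers and followed by a classifier over the target classes (here, digits and letters); a framework such as TensorFlow would normally be used in place of these hand-rolled loops.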
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).