Speech Synthesis Based on EEG Signal for Speech Impaired Patients by Using bLSTM Recurrent Neural Network
The disability rate in Indonesia remains relatively high and is one of the country's main health problems, affecting 30.38 million people, or 14.2% of the Indonesian population. Speech impairment is one type of disability. It has several possible causes, including vocal disturbance, which arises from damage to the vocal cords caused by accidental injury or by conditions such as throat cancer, and which reduces the productivity of those affected. Sign language can be used to communicate, but it is of limited use with people who do not have a speech impairment. Speech synthesis using a brain-computer interface (BCI) based on electrocorticography (ECoG) has also been developed. However, this method is invasive and can produce substantial scar tissue, which may degrade the quality of the recorded brain biopotentials. Therefore, a non-invasive, EEG-based speech synthesis method is proposed. The method uses a bidirectional LSTM (bLSTM) as one component of the RNN model so that syllables can be assembled into words. The system consists of a dataset, a data filtering program, a data segmentation program, a feature extraction program, training programs for the ANN and RNN deep learning models, and a text-to-speech program. The ANN and RNN together form a two-level deep learning system. The ANN achieves a testing accuracy of 26.04% and an accuracy of 20.83%, while the RNN achieves an accuracy of 81.25%. To improve these results, future work can refine the data collection process, increase the amount of data, select more suitable feature extraction methods, and compare several machine learning architectures.
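As an illustration of the second level described in the abstract, the following is a minimal sketch of a bidirectional LSTM that reads a sequence of syllable labels and scores candidate words. It is not the authors' implementation. The class and function names, the vocabulary sizes, the hidden size, and the use of one-hot syllable inputs are all assumptions made for this example, and the weights are random and untrained, so the output only demonstrates the data flow. The sketch also assumes that the first level (the ANN) has already produced one syllable label per EEG segment.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Single LSTM cell; the four gates share one stacked weight matrix."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # Row blocks, in order: input gate, forget gate, candidate, output gate.
        self.W = rand_matrix(4 * n_hidden, n_in + n_hidden, rng)
        self.b = [0.0] * (4 * n_hidden)

    def step(self, x, h, c):
        z = [a + b for a, b in zip(matvec(self.W, x + h), self.b)]
        n = self.n_hidden
        i = [sigmoid(v) for v in z[0:n]]
        f = [sigmoid(v) for v in z[n:2 * n]]
        g = [math.tanh(v) for v in z[2 * n:3 * n]]
        o = [sigmoid(v) for v in z[3 * n:4 * n]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_lstm(cell, sequence):
    # Process the sequence in the given order and return the final hidden state.
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    for x in sequence:
        h, c = cell.step(x, h, c)
    return h


class BiLSTMWordClassifier:
    """Hypothetical second-level model: syllable sequence -> word probabilities."""

    def __init__(self, n_syllables, n_hidden, n_words, seed=0):
        rng = random.Random(seed)
        self.n_syllables = n_syllables
        self.fwd = LSTMCell(n_syllables, n_hidden, rng)  # reads left to right
        self.bwd = LSTMCell(n_syllables, n_hidden, rng)  # reads right to left
        self.W_out = rand_matrix(n_words, 2 * n_hidden, rng)

    def predict_proba(self, syllable_ids):
        seq = [one_hot(s, self.n_syllables) for s in syllable_ids]
        h_f = run_lstm(self.fwd, seq)
        h_b = run_lstm(self.bwd, seq[::-1])
        # Concatenate both directions before the linear output layer.
        return softmax(matvec(self.W_out, h_f + h_b))


# Assumed sizes: 6 syllable classes, 3 candidate words (illustrative only).
model = BiLSTMWordClassifier(n_syllables=6, n_hidden=4, n_words=3)
probs = model.predict_proba([2, 5])  # two syllable labels from level one
```

Reading the syllable sequence in both directions lets the word decision depend on the syllables that follow as well as those that precede each position, which is the usual motivation for a bidirectional layer. In practice the weights would be learned from labelled data, typically with a deep learning framework.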