Speech Synthesis Based on EEG Signal for Speech Impaired Patients by Using bLSTM Recurrent Neural Network

Keywords: Speech Impairment, EEG, bLSTM, Speech Synthesis, ANN, Healthcare

Authors

  • Abdufattah Yurianta
    abdufattah.yurianta-2018@fst.unair.ac.id
    Biomedical Engineering, Physics Department, Faculty of Sciences and Technology, Universitas Airlangga
  • Anaqi Syaddad Ihsan
    Biomedical Engineering, Physics Department, Faculty of Sciences and Technology, Universitas Airlangga
  • Arijal Ibnu Jati
    Biomedical Engineering, Physics Department, Faculty of Sciences and Technology, Universitas Airlangga
  • Osmalina Nur Rahma
    Biomedical Engineering, Physics Department, Faculty of Sciences and Technology, Universitas Airlangga
  • Aji Sapta Pramulen
    Multimedia Broadcasting, Creative Multimedia Engineering Department, Electronic Engineering Polytechnic Institute of Surabaya
October 31, 2022

Disability remains one of Indonesia's main health problems, affecting 30.38 million people, or 14.2% of the population. One type of disability is speech impairment. It has several possible causes, including vocal disturbance, in which the vocal cords are damaged by injuries from accidents or by conditions such as throat cancer, reducing the productivity of those affected. Sign language can be used to communicate, but it is of limited use with people who do not have the impairment. Speech synthesis using a brain-computer interface (BCI) based on electrocorticography (ECoG) has also been developed. However, that method is invasive and can produce substantial scar tissue, which may reduce the quality of the recorded brain biopotentials. This work therefore proposes a non-invasive, EEG-based speech synthesis method. The method uses a bLSTM as one component of the RNN model so that syllables can be assembled into words. The system consists of datasets, a data filtering program, a data segmentation program, a feature extraction program, training programs for the ANN and RNN deep learning models, and a text-to-speech program. The ANN and RNN together form a two-level deep learning model. The ANN achieves a testing accuracy of 26.04% and an accuracy of 20.83%, while the RNN achieves an accuracy of 81.25%. To improve these results, future work could improve the data collection process, increase the amount of data, select more appropriate feature extraction methods, and compare several machine learning architectures to obtain optimal accuracy.
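To illustrate how a bLSTM layer reads a sequence in both directions, the following is a minimal sketch in plain Python, not the authors' implementation. The layer sizes, the random untrained weights, and all names (init_lstm, lstm_step, run_lstm, blstm, syllable_features) are assumptions chosen for illustration; the abstract does not specify the network dimensions or the features used.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_lstm(input_size, hidden_size, rng):
    """Random (untrained) weights for the four LSTM gates: i, f, g, o."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        gate: {"W": matrix(hidden_size, input_size + hidden_size),
               "b": [0.0] * hidden_size}
        for gate in ("i", "f", "g", "o")
    }


def lstm_step(params, x, h, c):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = x + h  # list concatenation of the input and the previous hidden state

    def affine(gate):
        p = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(p["W"], p["b"])]

    i = [sigmoid(a) for a in affine("i")]    # input gate
    f = [sigmoid(a) for a in affine("f")]    # forget gate
    g = [math.tanh(a) for a in affine("g")]  # candidate cell values
    o = [sigmoid(a) for a in affine("o")]    # output gate
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, sequence, hidden_size):
    """Run an LSTM over a sequence and collect the hidden state at each step."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    outputs = []
    for x in sequence:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return outputs


def blstm(fwd_params, bwd_params, sequence, hidden_size):
    """Bidirectional LSTM: concatenate forward and backward states per step."""
    forward = run_lstm(fwd_params, sequence, hidden_size)
    # The backward pass reads the reversed sequence; its outputs are flipped
    # back so that index t lines up with the forward state for step t.
    backward = run_lstm(bwd_params, sequence[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(forward, backward)]


# Hypothetical sizes: 4 features per time step, 3 hidden units per direction.
INPUT_SIZE, HIDDEN_SIZE = 4, 3
rng = random.Random(0)
fwd_params = init_lstm(INPUT_SIZE, HIDDEN_SIZE, rng)
bwd_params = init_lstm(INPUT_SIZE, HIDDEN_SIZE, rng)

# Placeholder input: three time steps, e.g. one feature vector per syllable.
syllable_features = [[rng.random() for _ in range(INPUT_SIZE)] for _ in range(3)]
states = blstm(fwd_params, bwd_params, syllable_features, HIDDEN_SIZE)
print(len(states), len(states[0]))
```

Each output vector has twice the hidden size because it joins the forward state, which summarizes the steps up to that point, with the backward state, which summarizes the steps from that point to the end. In a trained model, a classification layer on top of these states would map the sequence to a word.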