Design of Malaysian English Large Vocabulary Continuous Speech Recognizer With Acoustic Model Adaptation

 




 

Yoong, Kah Chung (2022) Design of Malaysian English Large Vocabulary Continuous Speech Recognizer With Acoustic Model Adaptation. Final Year Project (Bachelor), Tunku Abdul Rahman University College.


Abstract

Speech recognition allows a user to control a computer without a keyboard, pointer, or other input device. It also lets users convert audio to text for easier reading or comparison, and it can be used in a wide variety of systems. Companies that develop English speech recognition systems include Apple (Siri) and Google. However, most speech recognition systems do not support Malaysian English, whose accents are shaped by Malaysian mother tongues. Popular speech recognition systems on the market, such as those on Apple and Android devices, support only US English, and most cannot accurately recognise Malaysian English with a US English model. As a result, Malaysian users cannot use English speech recognition effectively. To overcome this problem, a Malaysian English speech recognition system is proposed that enables Malaysians to use English speech recognition with high accuracy. This project studies a Malaysian English speech recognition system based on continuous speech recognition. The system identifies Malaysian English speech, converts it into text, and is adapted to the Malaysian English accent. Nevertheless, inappropriate Malaysian English spelling and noisy surroundings are two factors that significantly affect speech recognition performance. The continuous speech recognition system is therefore created by adapting a US English acoustic model with maximum a posteriori (MAP) adaptation. For feature extraction, the Mel-Frequency Cepstral Coefficients (MFCC) technique is used, and a Hidden Markov Model (HMM) is used as the acoustic model. In addition, the CMU Sphinx toolkit, which includes Pocketsphinx and Sphinxtrain as well as an acoustic model, was used to create the speech recognition system for Malaysian English.
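To illustrate what MAP adaptation of an acoustic model does, the following is a minimal sketch of the standard MAP update for a single Gaussian mean: the adapted mean is a weighted average of the prior mean (from the US English model) and the adaptation data (Malaysian English feature frames). This is not the CMU Sphinx implementation and not necessarily the procedure used in the project; the function name `map_adapt_mean` and the prior weight `tau` are assumptions made for illustration only.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of one Gaussian mean (illustrative sketch only).

    prior_mean : mean vector from the source model (e.g. US English)
    frames     : feature vectors (e.g. MFCC frames) from adaptation speech
    posteriors : occupation probability of this Gaussian for each frame
    tau        : prior weight (hypothetical value); larger values keep the
                 adapted mean closer to the prior mean
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)  # total soft count of frames for this Gaussian

    # Posterior-weighted sum of the adaptation frames, per dimension.
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the data, weighted by tau
    # and by how much adaptation data was assigned to this Gaussian.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With little adaptation data the result stays near the prior mean, and with more data it moves toward the mean of the adaptation frames, which is why MAP adaptation suits a small accented-speech corpus.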
Pocketsphinx is one of the packages used in this project to build the speech recognition decoder, and Sphinxtrain is the acoustic model training tool. Malaysian English speech samples collected from many speakers are transcribed to produce the training database required for acoustic model adaptation. The outcome of this research could increase the use of speech recognition in Malaysia by addressing the accent problem. The graphical user interface for the Malaysian English Speech Recognition system was created with PyCharm Community Edition and Python 3.9. When evaluated with a test script (speech that was not used for adaptation), the speech recognition system that underwent MAP adaptation performed best. It achieved an average word error rate of 32.84%, an average word recognition rate of 72.52%, and an average sentence error rate of 78.89%.
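For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute word error rate (WER) and sentence error rate (SER) from reference and recognised transcripts, using word-level edit distance. The function names and the pooling of errors over all sentences are assumptions for illustration; the record does not state how the averages in the abstract were computed, and the word recognition rate is not reproduced here.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_length) at the word level."""
    ref = reference.split()
    hyp = hypothesis.split()

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1], len(ref)


def corpus_scores(pairs):
    """Return (WER, SER) pooled over (reference, hypothesis) pairs."""
    total_errors = 0
    total_words = 0
    wrong_sentences = 0
    for reference, hypothesis in pairs:
        errors, n_words = word_errors(reference, hypothesis)
        total_errors += errors
        total_words += n_words
        if errors > 0:
            wrong_sentences += 1
    if total_words == 0 or not pairs:
        raise ValueError("need at least one non-empty reference sentence")
    return total_errors / total_words, wrong_sentences / len(pairs)
```

Because a sentence counts as wrong if it contains even one word error, SER is usually much higher than WER, which is consistent with the gap between the two figures reported above.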

Item Type: Final Year Project
Subjects: Technology > Mechanical engineering and machinery
Technology > Electrical engineering. Electronics engineering
Faculties: Faculty of Engineering and Technology > Bachelor of Mechatronics Engineering with Honours
Depositing User: Library Staff
Date Deposited: 04 Mar 2022 07:00
Last Modified: 04 Mar 2022 07:00
URI: https://eprints.tarc.edu.my/id/eprint/20381