Audio-visual Speech Recognition

Lau, Kai Xian (2019) Audio-visual Speech Recognition. Final Year Project (Bachelor), Tunku Abdul Rahman University College.

Abstract

Audio-visual speech recognition (AVSR) integrates visual and audio information to build a reliable speech recognition system. AVSR has many practical applications, especially in natural language processing systems such as speech-to-text, automatic translation, and sentiment analysis. Decades ago, the Hidden Markov Model (HMM) was frequently used in audio-visual speech recognition because of its strong results in both image and speech recognition. Although the HMM achieved high speech recognition rates, it requires an enormous training dataset because the linguistic coverage has to be sufficient. To overcome this deficiency, a recurrent neural network (RNN) based AVSR is proposed. The AVSR model consists of three components: an audio feature extraction mechanism, a visual feature extraction mechanism, and an audio-visual integration mechanism. Each feature extraction mechanism includes feature extraction and feature classification (a neural network), whereas the integration mechanism combines the outputs of both modalities. In this thesis, the audio feature mechanism is modelled by Mel-frequency cepstral coefficients (MFCC) and an RNN, whereas the visual feature mechanism is modelled by Haar cascade detection with OpenCV and an RNN. The outputs of both mechanisms are then integrated by an RNN. The speech recognition rate (accuracy) and the robustness of the proposed RNN-based AVSR were evaluated with clean speech and with speech superimposed with noise at signal-to-noise ratios (SNR) from 30 dB to -20 dB in 5 dB steps. The final result is an average speech recognition rate of 89% across the different SNR levels.
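
The noise-robustness evaluation described in the abstract depends on mixing noise into clean speech at a chosen SNR. The abstract does not say how the thesis did this or what kind of noise it used, so the following is only a minimal sketch of one common approach: scale the noise so that the ratio of speech power to noise power matches the target SNR in dB. The function names (`mix_at_snr`, `measured_snr_db`), the sine-wave stand-in for speech, and the Gaussian noise are all assumptions made for illustration.

```python
import math
import random


def power(samples):
    """Mean power (mean of squared samples) of a signal."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise so the mixture has the requested SNR in dB.

    Hypothetical helper, not taken from the thesis. Uses
    SNR_dB = 10 * log10(P_speech / P_noise) and solves for the noise gain.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise must not be silent")
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR in dB of a mixture, treating (mixture - speech) as the noise."""
    residual = [m - s for s, m in zip(speech, mixture)]
    return 10 * math.log10(power(speech) / power(residual))


# SNR grid from the abstract: 30 dB down to -20 dB in 5 dB steps.
SNR_LEVELS_DB = list(range(30, -25, -5))

if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed stand-ins: a 440 Hz tone for speech, white Gaussian noise.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for snr in SNR_LEVELS_DB:
        noisy = mix_at_snr(speech, noise, snr)
        print(snr, round(measured_snr_db(speech, noisy), 2))
```

In an evaluation of this kind, each noisy mixture would be passed through the recognizer and the recognition rate recorded per SNR level. The 89% figure reported above is the average over those levels.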

Item Type: Final Year Project
Subjects: Technology > Mechanical engineering and machinery
Faculties: Faculty of Engineering > Bachelor of Engineering (Honours) Mechatronic
Depositing User: Library Staff
Date Deposited: 07 Feb 2020 09:27
Last Modified: 16 Mar 2022 03:20
URI: https://eprints.tarc.edu.my/id/eprint/13171