Loh, Ai Lin (2023) Techniques for Improving Quality of NLU: Spelling Variation, Typographical Error and Abbreviation. Final Year Project (Bachelor), Tunku Abdul Rahman University of Management and Technology.
Abstract
Spelling mistakes account for the majority of errors in written text. Spell checkers are consequently a necessary component of many widely used applications, including messaging services, productivity and collaboration tools, and search engines. This project benchmarks five free off-the-shelf correctors (TextBlob, Pyspellchecker, Symspell, Happy Transformer and Fast Punctuation) and one deep learning-based model on naturally occurring misspellings from multiple sources. We find that many of these systems, including TextBlob, Pyspellchecker, and Symspell, do not make effective use of the context surrounding the misspelled token. To remedy this, we developed a neural network and trained it on a dataset drawn from the medical field. In our evaluation, the deep learning-based model performed robustly in correcting spelling errors.
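To illustrate what it means for a corrector to ignore context, the following is a minimal sketch of a word-by-word, edit-distance corrector. It is not the code of TextBlob, Pyspellchecker, or Symspell, and it is not the model developed in this project. The word counts, the function names (`edits1`, `correct_word`, `correct_sentence`), and the example sentences are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy vocabulary with made-up frequencies (illustration only).
WORD_COUNTS = Counter({
    "the": 100, "a": 80, "was": 60, "patient": 50,
    "patent": 30, "filed": 10, "discharged": 5,
})

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit (delete, transpose, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word):
    """Pick the most frequent known word within one edit; ignore neighbours."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word  # nothing close enough is known, so leave it unchanged
    return max(candidates, key=lambda w: WORD_COUNTS[w])


def correct_sentence(sentence):
    """Correct each token independently, so surrounding words have no effect."""
    return " ".join(correct_word(tok) for tok in sentence.lower().split())


print(correct_sentence("the patint was discharged"))
print(correct_sentence("filed a patint"))
```

Under these toy counts, both sentences receive "patient" for "patint", even though "patent" is the more plausible word after "filed a". A context-aware model would score candidates against the surrounding tokens instead of relying on word frequency alone.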
Item Type: | Final Year Project |
---|---|
Subjects: | Science > Computer Science > Computer software |
Faculties: | Faculty of Computing and Information Technology > Bachelor of Computer Science (Honours) in Software Engineering |
Depositing User: | Library Staff |
Date Deposited: | 22 Aug 2023 07:02 |
Last Modified: | 22 Aug 2023 07:02 |
URI: | https://eprints.tarc.edu.my/id/eprint/26092 |