Image Processing and Machine Learning Based Algorithms for Mammogram Analysis



Lim, Yu-Quan (2018) Image Processing and Machine Learning Based Algorithms for Mammogram Analysis. Final Year Project (Bachelor), Tunku Abdul Rahman University College.



Mammography is a medical imaging technique that uses low-dose X-rays to create visual representations of the interior of the human breast. A mammogram aids in the early detection and diagnosis of breast cancer. With the advancement of technology, computer-aided detection (CAD) systems play a supporting role in mammogram analysis. CAD systems search digitized mammographic images for abnormalities, such as masses and calcifications, that may indicate the presence of cancer, and notify radiologists to investigate the affected regions further. CAD systems are still a relatively new technology, and creating CAD systems with higher reliability is an ongoing effort. This research sets out to develop algorithms for a mammogram analysis system with three objectives: mammogram pre-processing, classification of breast tissue, and classification of normal and abnormal breast regions. The algorithms developed in this research are written in the Python programming language, and all are developed and tested on the mini-Mammographic Image Analysis Society digital mammogram database. The scope of mammogram pre-processing in this research covers noise removal, pectoral muscle removal, and determining whether the breast is left- or right-oriented. Morphological operations and other image processing techniques are used for noise removal. Two algorithms are proposed for pectoral muscle removal, one based on Yen’s thresholding algorithm and the other based on k-means clustering. The k-means clustering algorithm outperforms the thresholding-based algorithm, but there is still room for improvement. A Support Vector Machine, a machine learning model that works well with linearly separable data, is used to determine left- and right-oriented breasts in mammograms, and an accuracy of 100% is achieved.
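The k-means idea mentioned above can be illustrated on pixel intensities alone. The following is a minimal sketch and not the project's algorithm: the function names `kmeans_1d` and `brightest_cluster_mask` are hypothetical, the choice of three clusters is arbitrary, and treating the brightest cluster as the candidate pectoral muscle region is an assumption made only for illustration. It uses plain Python lists so that it runs without any image library.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values with Lloyd's algorithm; return (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly over the intensity range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda idx: abs(v - centroids[idx]))
                  for v in values]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for idx in range(k):
            members = [v for v, lab in zip(values, labels) if lab == idx]
            updated.append(sum(members) / len(members) if members else centroids[idx])
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels


def brightest_cluster_mask(image, k=3):
    """Return a 0/1 mask marking the pixels that fall in the brightest cluster.

    Assumption for illustration only: the region of interest is the
    brightest intensity cluster.
    """
    flat = [pixel for row in image for pixel in row]
    centroids, labels = kmeans_1d(flat, k)
    brightest = max(range(k), key=lambda idx: centroids[idx])
    width = len(image[0])
    return [[1 if labels[r * width + col] == brightest else 0
             for col in range(width)]
            for r in range(len(image))]


# Toy 3x4 "image": a bright patch in the top-left corner.
toy = [
    [200, 210, 90, 10],
    [205, 95, 100, 12],
    [100, 98, 92, 8],
]
mask = brightest_cluster_mask(toy, k=3)
# mask == [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
```

A real pipeline would probably also use spatial information, since intensity alone cannot separate the pectoral muscle from other bright structures.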
Two classifiers are used for the classification of breast tissue. The first classifies fatty and glandular tissue with an accuracy of 92.31% based on the image histograms of the mammograms; the second classifies fatty-glandular and dense-glandular tissue with an accuracy of 68.18% based on Grey Level Co-Occurrence Matrix (GLCM) features. Breast regions are cropped from the mammograms, and textural and statistical features are extracted from them to train a machine learning model that classifies normal and abnormal regions of a breast. The classifier achieves an accuracy of 90.77%. The results are compared with past works to justify the findings of this research. In future work, larger mammogram databases will be studied, and deep learning and advanced computer vision techniques will be explored. It is hoped that this research contributes to the fields of mammography, computer vision, machine learning, deep learning and engineering.
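For readers unfamiliar with Grey Level Co-Occurrence Matrix features, the sketch below shows how such a matrix and two common texture descriptors (contrast and homogeneity) can be computed. It is an illustration under stated assumptions rather than the project's implementation: the function names, the single horizontal offset, the symmetric counting and the choice of these two descriptors are all assumptions, and the record does not say which GLCM features or offsets were actually used. The input is assumed to be already quantised to integer grey levels in the range 0 to `levels - 1`.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Return the normalised co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the requested offset")
    # Normalise counts so that the matrix sums to 1.
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Weight each co-occurrence probability by the squared grey-level difference."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))


def homogeneity(p):
    """Weight each probability by closeness to the diagonal: 1 / (1 + (i - j)^2)."""
    return sum(p[i][j] / (1 + (i - j) ** 2)
               for i in range(len(p)) for j in range(len(p)))


# Toy 4x4 patch with four grey levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
features = [contrast(p), homogeneity(p)]  # approximately [0.583, 0.808]
```

A feature vector like `features`, computed per mammogram or per cropped region, is the kind of input a classifier could then be trained on.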

Item Type: Final Year Project
Subjects: Technology > Mechanical engineering and machinery
Technology > Electrical engineering. Electronics engineering
Faculties: Faculty of Engineering and Technology > Bachelor of Engineering (Honours) Mechatronic
Depositing User: Library Staff
Date Deposited: 10 Oct 2018 08:04
Last Modified: 12 Apr 2022 08:53