University of Bahrain
Scientific Journals

A Comparative Analysis of Machine Learning and Deep Learning Models for Gender Classification from Audio Speech

dc.contributor.author Islam, Mainul
dc.contributor.author Ali, Md Nawab Yousuf
dc.date.accessioned 2024-08-24T22:12:59Z
dc.date.available 2024-08-24T22:12:59Z
dc.date.issued 2024-08-25
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/5859
dc.description.abstract Recognizing gender from audio speech can improve human interaction with technology. Although the human ear can identify a person's gender from the sound of their voice, the task can be quite complicated for an artificial intelligence (AI) system. Effective classification of gender from audio speech depends not only on an effective representation of the audio signal but also on the implementation of robust algorithms. In this research, we used Mel Frequency Cepstral Coefficients (MFCC) to represent the audio samples because they are effective at capturing spectral characteristics, which highlight differences in vocal tract structure between males and females. MFCC is considered one of the most efficient representations of audio signals that mimic the human auditory system. This study aims to analyze the effectiveness of various machine learning (ML) and deep learning (DL) methods in classifying gender from audio speech using the MFCC feature representation. We experimented with several algorithms: SVM, KNN, a stacking ensemble method, and LSTM. Three audio speech datasets were used to assess the performance of these algorithms. The best accuracies achieved on these datasets are 93.889%, 99.371%, and 94.558%. Furthermore, based on the findings of the experiment, this study proposes a framework for effective gender classification from audio speech. en_US
dc.publisher University of Bahrain en_US
dc.subject Gender classification; SVM; KNN; LSTM; Ensemble method en_US
dc.title A Comparative Analysis of Machine Learning and Deep Learning Models for Gender Classification from Audio Speech en_US
dc.identifier.doi xxxxxx
dc.volume 16 en_US
dc.issue 1 en_US
dc.pagestart 1 en_US
dc.pageend 11 en_US
dc.contributor.authorcountry Bangladesh en_US
dc.contributor.authorcountry Bangladesh en_US
dc.contributor.authoraffiliation East West University en_US
dc.contributor.authoraffiliation East West University Dhaka en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US
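
The abstract lists KNN among the classifiers applied to MFCC features. As a rough illustration only, the sketch below shows how a k-nearest-neighbour vote works on fixed-length feature vectors. It assumes each utterance has already been reduced to one vector (for example, MFCCs averaged over time); the function name `knn_predict`, the toy vectors, the labels, and `k=3` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 3-dimensional vectors standing in for time-averaged MFCCs.
# Real MFCC vectors would be longer and come from an audio front end.
train_vectors = [
    (-1.0, 0.2, 0.1), (-0.9, 0.1, 0.3), (-1.1, 0.3, 0.0),
    (1.0, -0.2, 0.4), (0.9, -0.1, 0.2), (1.2, -0.3, 0.5),
]
train_labels = ["female", "female", "female", "male", "male", "male"]

prediction = knn_predict(train_vectors, train_labels, (0.8, -0.2, 0.3), k=3)
```

The sketch covers only the voting step. Feature extraction, the SVM, the stacking ensemble, and the LSTM described in the abstract are outside its scope.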

