Assamese Speech-based Vocabulary Identification System using Convolutional Neural Network

Dutta, Dipankar; Choudhury, Ridip Dev; Barman, Utpal

doi:https://dx.doi.org/10.12785/ijcds/120195

International Journal of Computing and Digital Systems, Volume 12, Issue 01

ISSN: 2210-142X

Date: 2022-10-31

Abstract:

Although machine learning techniques have been used in Assamese Language Automatic Speech Recognition (ALASR) systems over the last five years, applications of Convolutional Neural Networks (CNNs) in ALASR remain very limited. The present study introduces a CNN-enabled ALASR system built on 35 isolated words, collected in five primary emotions (Normal, Angry, Happy, Sad, and Fear) from five native male and five native female speakers. During the experiment, Mel Frequency Cepstral Coefficients (MFCCs), Spectral Centroid (SC), Zero-Crossing Rate (ZCR), Chroma Frequencies (CF), Spectral Roll-Off (SRO), and intensity are extracted and analyzed using a CNN with convolution and max-pooling layers. For comparison, a Feed Forward Artificial Neural Network (FFANN) is also applied to ALASR. In the evaluation, the CNN achieved an accuracy of 98.4%, outperforming the FFANN's accuracy of 86.4%.
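
As an illustration of three of the features named in the abstract (ZCR, SC, and SRO), the following is a minimal NumPy sketch for a single audio frame. The abstract does not state the toolkit, sample rate, frame length, window, or roll-off threshold used in the study, so the 16 kHz rate, 1024-sample frame, Hann window, 0.85 roll-off fraction, and all function names here are assumptions chosen for illustration only.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))


def magnitude_spectrum(frame, sample_rate):
    """Return (bin frequencies in Hz, magnitudes) of a Hann-windowed frame."""
    window = np.hanning(len(frame))
    mags = np.abs(np.fft.rfft(frame * window))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return freqs, mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    freqs, mags = magnitude_spectrum(frame, sample_rate)
    total = mags.sum()
    return float((freqs * mags).sum() / total) if total > 0 else 0.0


def spectral_rolloff(frame, sample_rate, fraction=0.85):
    """Lowest bin frequency below which `fraction` of the total magnitude lies.

    The 0.85 default is an assumed, commonly used value, not one from the paper.
    """
    freqs, mags = magnitude_spectrum(frame, sample_rate)
    cumulative = np.cumsum(mags)
    if cumulative[-1] == 0:
        return 0.0
    idx = np.searchsorted(cumulative, fraction * cumulative[-1])
    return float(freqs[idx])


# Synthetic stand-in for one frame of speech: a 1 kHz tone (assumed test signal).
sample_rate = 16000
t = np.arange(1024) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

# One feature vector per frame; stacking such vectors over time would give
# the kind of 2-D input a CNN can consume.
features = np.array([
    zero_crossing_rate(tone),
    spectral_centroid(tone, sample_rate),
    spectral_rolloff(tone, sample_rate),
])
print(features)
```

For a pure tone, the centroid and roll-off should both land close to the tone's frequency, which makes a convenient sanity check before applying the functions to recorded speech.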
