A Syllable-based Speech Recognition system by using Pitch detection on Time-Frequency domain Feature Extraction

Wiriyarattanakul, Sopon; Kaewfoongrungsi, Piroon; Sumonphan, Ekkalak

doi:http://dx.doi.org/10.12785/ijcds/140192



dc.contributor.author	Wiriyarattanakul, Sopon
dc.contributor.author	Kaewfoongrungsi, Piroon
dc.contributor.author	Sumonphan, Ekkalak
dc.date.accessioned	2023-07-20T08:17:46Z
dc.date.available	2023-07-20T08:17:46Z
dc.date.issued	2023-10-01
dc.identifier.issn	2210-142X
dc.identifier.uri	https://journal.uob.edu.bh:443/handle/123456789/5098
dc.description.abstract	This research presents the segmentation of single-syllable sounds for speech recognition using an artificial neural network that combines key features of the speech signal from the time and frequency domains. Speech signals are first divided into frames using the short-time energy waveform. Pitch markers are then extracted from the frames and used as reference points to split them into sections. Each section is analyzed by window searching to identify positions, amplitudes, local minimum and maximum values, and maximum slope values, which serve as the key features in the time domain. In the frequency domain, Mel-frequency cepstral coefficients (MFCCs) are used as additional key features. The two types of key features are combined for speech recognition using the artificial neural network. The study also compares recognition performance when the time-domain and frequency-domain features are fed into the network combined and separately. The results show that a network with two input layers (one for MFCCs and one for time-domain features) feeding the same hidden layers yields the highest recognition accuracy, 96.97%, and 88.43% in blind tests.	en_US
dc.language.iso	en	en_US
dc.publisher	University of Bahrain	en_US
dc.subject	Pitch detection	en_US
dc.subject	Time-Frequency domain	en_US
dc.subject	Feature extraction	en_US
dc.subject	Speech recognition	en_US
dc.subject	Syllable	en_US
dc.subject	Short-time energy waveform	en_US
dc.title	A Syllable-based Speech Recognition system by using Pitch detection on Time-Frequency domain Feature Extraction	en_US
dc.identifier.doi	http://dx.doi.org/10.12785/ijcds/140192
dc.volume	14	en_US
dc.issue	1	en_US
dc.pagestart	10193	en_US
dc.pageend	10203	en_US
dc.contributor.authorcountry	Thailand	en_US
dc.contributor.authoraffiliation	Uttaradit Rajabhat University	en_US
dc.contributor.authoraffiliation	Chiang Mai Rajabhat University	en_US
dc.contributor.authoraffiliation	Rajamangala University of Technology	en_US
dc.source.title	International Journal of Computing and Digital Systems	en_US
dc.abbreviatedsourcetitle	IJCDS	en_US
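
The abstract's first processing step, dividing speech into frames using the short-time energy waveform, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `short_time_energy` and `voiced_segments`, the non-overlapping 400-sample frames, the 8 kHz sampling rate, and the threshold of 10% of the peak frame energy are all assumptions chosen for illustration, and the input is a synthetic tone burst rather than real speech.

```python
import math


def short_time_energy(signal, frame_len, hop):
    """Return the energy (sum of squared samples) of each frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def voiced_segments(energies, threshold):
    """Group consecutive frames whose energy exceeds the threshold.

    Returns a list of (first_frame, last_frame) index pairs.
    """
    segments = []
    start = None
    for i, e in enumerate(energies):
        if e > threshold and start is None:
            start = i                      # a high-energy run begins
        elif e <= threshold and start is not None:
            segments.append((start, i - 1))  # the run ended on the previous frame
            start = None
    if start is not None:                  # run continues to the end of the signal
        segments.append((start, len(energies) - 1))
    return segments


# Synthetic test signal (assumed 8 kHz sampling rate):
# 0.1 s of silence, 0.2 s of a 100 Hz tone, 0.1 s of silence.
sr = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 100 * n / sr) for n in range(1600)]
signal = silence + tone + silence

# Non-overlapping 50 ms frames; threshold is an illustrative assumption.
energies = short_time_energy(signal, frame_len=400, hop=400)
segments = voiced_segments(energies, threshold=0.1 * max(energies))
print(segments)
```

In a fuller pipeline, each detected segment would then be passed on to pitch-marker extraction and feature computation; those stages depend on details the abstract does not give, so they are left out here.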