Word sense disambiguation task for Bodo language using Attention based Deep CNN architecture

Basumatary, Subungshri; Barman, Manas; Kumar Barman, Anup; Nag, Amitava; Brahma, Bihung

doi:XXXXXX

UOB Journals → 02. International Journal of Computing and Digital Systems → Preprint

dc.contributor.author	Basumatary, Subungshri
dc.contributor.author	Barman, Manas
dc.contributor.author	Kumar Barman, Anup
dc.contributor.author	Nag, Amitava
dc.contributor.author	Brahma, Bihung
dc.date.accessioned	2024-07-19T11:22:23Z
dc.date.available	2024-07-19T11:22:23Z
dc.date.issued	2024-07-19
dc.identifier.uri	https://journal.uob.edu.bh:443/handle/123456789/5822
dc.description.abstract	Interest in Natural Language Processing (NLP) has grown rapidly over the last few decades, mainly because it provides tools to represent and analyze human languages computationally. A key challenge in NLP is classifying a word according to its meaning in a given context, a problem known as word sense disambiguation (WSD). The problem arises in every language, but it is especially difficult for North-East Indian languages because digital resources are scarce. This work addresses WSD for Bodo, a low-resource, text-sparse language, using an adapted Convolutional Neural Network (CNN) model with an attention mechanism. Bodo is spoken predominantly in the northeastern region of India, so its data must be handled carefully when constructing NLP models. The attention layer identifies the features most strongly associated with a particular sense label, letting the model focus on the most informative parts of the input. The CNN layers extract semantic features from sentences, which helps capture subtle nuances of meaning. Test results were promising: the proposed framework achieved an accuracy of 71.43% on a very small dataset. This suggests that a deep CNN with soft attention is effective at inferring the meaning of words in Bodo, and that NLP with methods such as the CNN-Attention model has considerable potential to overcome these challenges in low-resource languages. By combining attention mechanisms with convolutional neural networks, the model is better able to capture fine-grained semantic differences, pointing toward better language processing tools for Bodo and other similarly resource-limited languages.	en_US
dc.language.iso	en	en_US
dc.publisher	University of Bahrain	en_US
dc.subject	NLP	en_US
dc.subject	WSD	en_US
dc.subject	Deep learning	en_US
dc.subject	CNN	en_US
dc.subject	Attention layer	en_US
dc.subject	Bodo language	en_US
dc.title	Word sense disambiguation task for Bodo language using Attention based Deep CNN architecture	en_US
dc.identifier.doi	XXXXXX
dc.volume	17	en_US
dc.issue	1	en_US
dc.pagestart	1	en_US
dc.pageend	10	en_US
dc.contributor.authorcountry	Kokrajhar, India	en_US
dc.contributor.authoraffiliation	Computer Science and Engineering, Central Institute of Technology Kokrajhar	en_US
dc.source.title	International Journal of Computing and Digital Systems	en_US
dc.abbreviatedsourcetitle	IJCDS	en_US