University of Bahrain
Scientific Journals

Word Sense Disambiguation in Hindi Language Using Score Based Modified Lesk Algorithm

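dc.contributor.author Tripathi, Praffullit
dc.contributor.author Mukherjee, Prasenjit
dc.contributor.author Hendre, Manik
dc.contributor.author Godse, Manish
dc.contributor.author Chakraborty, Baisakhi
dc.date.accessioned 2020-07-22T09:28:31Z
dc.date.available 2020-07-22T09:28:31Z
dc.date.issued 2020-07-01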
dc.identifier.issn 2210-142X
dc.description.abstract Hindi is a widely spoken language in the Indian subcontinent and is used by more than 260 million Indian citizens. Indian governments run many digital initiatives to serve citizens better, which makes Hindi one of the important languages for delivering these services. The initiatives include smart cities, hospital services, Common Service Centers, the digital payment ecosystem, pensioners' schemes, Digital Locker and many more. All of them are delivered through mobile and web-based applications, which citizens can access easily instead of visiting various government departments. To serve the large Hindi-speaking population, any natural language processing task must handle ambiguous words, that is, words with multiple meanings. This paper proposes a word sense disambiguation method for the Hindi language. The method uses the Lesk algorithm to disambiguate Hindi words. A novel scoring method assigns a sense score to each token of the Hindi sentence. The sense score is calculated from the gloss, hypernyms, hyponyms and synonyms of the combinations of different senses of the tokens. The proposed algorithm takes a natural language (NL) sentence in Hindi (Devanagari script) and processes it with a score-based approach modeled on the basic Lesk algorithm, using the Hindi WordNet database created by CFILT, IIT Bombay. The solution presented in this paper can be used widely in web-based applications such as query-response systems, question-answer systems, sentiment analysis and recommendation systems. en_US
dc.language.iso en en_US
dc.publisher University of Bahrain en_US
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International *
dc.rights.uri *
dc.subject NLP, Lesk Algorithm, word sense disambiguation, multi word WSD, Hindi WordNet. en_US
dc.title Word Sense Disambiguation in Hindi Language Using Score Based Modified Lesk Algorithm en_US
dc.type Article en_US
dc.volume 10 en_US
dc.pagestart 2 en_US
dc.pageend 20 en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US
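
The abstract describes scoring combinations of token senses by their gloss, hypernyms, hyponyms and synonyms. The record does not give the actual scoring formula, so the following is only a minimal sketch of a generic Lesk-style combination score, not the authors' method. The sense inventory `SENSES`, the stopword list, the English stand-in words and the function names are all hypothetical; a real system would read senses from Hindi WordNet.

```python
from itertools import product

# Hypothetical toy inventory with English stand-ins (assumption, not Hindi WordNet data).
SENSES = {
    "bank": {
        "bank_finance": {
            "gloss": "institution that accepts money deposits and gives loans",
            "synonyms": ["depository"],
            "hypernyms": ["financial institution"],
            "hyponyms": ["savings bank"],
        },
        "bank_river": {
            "gloss": "sloping land beside a river",
            "synonyms": ["shore"],
            "hypernyms": ["slope"],
            "hyponyms": ["riverbank"],
        },
    },
    "deposit": {
        "deposit_money": {
            "gloss": "money placed in a financial account",
            "synonyms": ["savings"],
            "hypernyms": ["fund"],
            "hyponyms": ["demand deposit"],
        },
        "deposit_sediment": {
            "gloss": "matter laid down by a river or glacier",
            "synonyms": ["sediment"],
            "hypernyms": ["sediment layer"],
            "hyponyms": ["silt"],
        },
    },
}

# Assumed stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "in", "to", "beside"}


def signature(sense):
    """Set of content words from a sense's gloss, synonyms, hypernyms and hyponyms."""
    parts = [sense["gloss"], *sense["synonyms"], *sense["hypernyms"], *sense["hyponyms"]]
    return {w for w in " ".join(parts).lower().split() if w not in STOPWORDS}


def disambiguate(tokens, inventory):
    """Pick the sense combination whose signatures share the most words pairwise."""
    ambiguous = list(dict.fromkeys(t for t in tokens if t in inventory))
    best_score, best = -1, {}
    for combo in product(*(inventory[t].items() for t in ambiguous)):
        sigs = [signature(sense) for _, sense in combo]
        # Score = total word overlap summed over every pair of chosen senses.
        score = sum(
            len(sigs[i] & sigs[j])
            for i in range(len(sigs))
            for j in range(i + 1, len(sigs))
        )
        if score > best_score:
            best_score = score
            best = {t: sense_id for t, (sense_id, _) in zip(ambiguous, combo)}
    return best, best_score


if __name__ == "__main__":
    print(disambiguate(["bank", "deposit"], SENSES))
```

Enumerating every sense combination is exponential in the number of ambiguous tokens, so this sketch is only practical for short sentences.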


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International.
