University of Bahrain
Scientific Journals

Nemlar Corpus improvement for Arabic Natural Language Processing

dc.contributor.author Kadim, Ayoub
dc.date.accessioned 2024-06-30T17:08:26Z
dc.date.available 2024-06-30T17:08:26Z
dc.date.issued 2024-06-30
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/5784
dc.description.abstract Most machine learning approaches in Natural Language Processing rely mainly on corpora. Indeed, various applications based on these approaches require prior training of statistical models, such as the Hidden Markov Model for Part Of Speech (POS) Tagging. However, these learning resources must meet certain criteria to yield a well-trained model and thus more accurate results. Moreover, the Arabic language, despite its widespread use on the internet and in social media, has a limited number of linguistic resources for machine learning, especially corpora with morpho-syntactic annotations. In this article, we therefore examine the Nemlar corpus, one of the richest annotated linguistic corpora for the Arabic language. We first present the content of this corpus. We then define criteria for improving its structure and enriching its content. We also present the modifications made to the original version to reach the desired form, including merging POS tags, separating prefixes and suffixes, and creating tags for specific cases. Next, we describe the experiment evaluating the new word recognition rate. Finally, we discuss the advantages and disadvantages of the resulting version. en_US
dc.language.iso en en_US
dc.publisher University of Bahrain en_US
dc.subject Corpus en_US
dc.subject Nemlar en_US
dc.subject Part Of Speech Tagging en_US
dc.subject Arabic language en_US
dc.title Nemlar Corpus improvement for Arabic Natural Language Processing en_US
dc.identifier.doi XXXXXX
dc.volume 17 en_US
dc.issue 1 en_US
dc.pagestart 1 en_US
dc.pageend 13 en_US
dc.contributor.authorcountry Morocco en_US
dc.contributor.authoraffiliation Ibn Zohr University en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US

