dc.contributor.author | Joshi, Manju Lata | |
dc.contributor.author | Mittal, Namita | |
dc.contributor.author | Joshi, Nisheeth | |
dc.date.accessioned | 2021-07-14T11:21:50Z | |
dc.date.available | 2021-07-14T11:21:50Z | |
dc.date.issued | 2021-07-14 | |
dc.identifier.issn | 2210-142X | |
dc.identifier.uri | https://journal.uob.edu.bh:443/handle/123456789/4290 | |
dc.description.abstract | Automatic keyword extraction is the process of identifying the terms that best describe the subject of a document. These terms can be key terms or key phrases that represent the most relevant information conveyed by the document. Keyword extraction techniques can be statistical, linguistic, machine learning based, graph-based, or a hybrid of these. Each approach has its strengths and limitations. This paper focuses on graph-based approaches, which rely on network properties such as Degree, Structural Diversity Index, Strength, Clustering Coefficient, Neighborhood Size, PageRank, Closeness, Betweenness, Eigenvector Centrality, Hub, and Authority Score. In the proposed approach, the graph is constructed from semantic linkages between the terms in the document, which are extracted using Hindi WordNet as a background knowledge source. Fourteen different graph measures are then applied to extract the keywords. The experiments are conducted on Hindi-language Tourism and Health data sets. The results of the proposed approach are evaluated against human-annotated keywords and compared with the state-of-the-art approach TextRank. The results show that closeness centrality yields better precision and recall than the other graph measures when matched against human-annotated keywords, while authority score is the best measure for producing keywords that match those of TextRank. The experiments show that the proposed semantic graph-based approach performs better than TextRank. This paper also explores the correlation between the different graph-theoretic measures using several correlation methods. | en_US |
dc.language.iso | en | en_US |
dc.publisher | University of Bahrain | en_US |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Automatic Keyword Extraction | en_US |
dc.subject | Semantic Graph-based Keyword Extraction | en_US |
dc.subject | Semantic Network | en_US |
dc.subject | Hindi Text Documents | en_US |
dc.subject | Hindi WordNet | en_US |
dc.title | SGAKE: Semantic Graph-based Automatic Keyword Extraction from Hindi Text Documents | en_US |
dc.identifier.doi | https://dx.doi.org/10.12785/ijcds/120130 | |
dc.contributor.authorcountry | India | en_US |
dc.contributor.authorcountry | India | en_US |
dc.contributor.authorcountry | India | en_US |
dc.contributor.authoraffiliation | Banasthali University & ISIM | en_US |
dc.contributor.authoraffiliation | MNIT Jaipur | en_US |
dc.contributor.authoraffiliation | Banasthali University | en_US |
dc.source.title | International Journal of Computing and Digital Systems | en_US |