University of Bahrain
Scientific Journals

Use Word Cloud Image Of Web Page Text Content On Convolutional Neural Network (CNN) For Classification of Web Pages

dc.contributor.author Apandi, Siti Hawa
dc.contributor.author Sallim, Jamaludin
dc.contributor.author Mohamed, Rozlina
dc.date.accessioned 2023-05-14T18:08:32Z
dc.date.available 2023-05-14T18:08:32Z
dc.date.issued 2024-01-15
dc.identifier.issn 2210-142X
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/4946
dc.description.abstract People today can easily find information on the internet by visiting web pages. Most people like to visit web pages that offer games and online videos. People who spend a lot of time on such pages can become addicted to the internet, which can harm them. Access to web pages that offer games and streaming videos therefore needs to be limited to prevent internet addiction. This requires a tool that can classify a web page into a category based on its content. Because matrix representations cannot handle long web page text content, this study uses a word cloud image to visualize the words extracted from the web page text content after data pre-processing. The most popular words in the web page text content are displayed in a large size and appear in the center of the word cloud image. These are the words that appear most frequently in the text content, and they describe what the web page is about. The Convolutional Neural Network (CNN) identifies the pattern of words displayed in the central area of the word cloud image to classify the category to which the web page belongs. The proposed model for classifying web pages achieves an accuracy of 0.86. The proposed model can be used, for example, by an institution to set rules that limit users' access to web pages offering games and streaming videos. This would be one way to prevent users from developing internet addiction. en_US
dc.language.iso en en_US
dc.publisher University Of Bahrain en_US
dc.subject Web page classification en_US
dc.subject document representation en_US
dc.subject word cloud image en_US
dc.subject deep learning en_US
dc.subject Convolutional Neural Network en_US
dc.title Use Word Cloud Image Of Web Page Text Content On Convolutional Neural Network (CNN) For Classification of Web Pages en_US
dc.type Article en_US
dc.identifier.doi http://dx.doi.org/10.12785/ijcds/150127
dc.volume 15 en_US
dc.issue 1 en_US
dc.pagestart 347 en_US
dc.pageend 358 en_US
dc.contributor.authorcountry Malaysia en_US
dc.contributor.authoraffiliation Faculty of Computing, Universiti Malaysia Pahang, Pekan, Pahang en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US

