University of Bahrain
Scientific Journals

Web Information Extraction methods using Web Content Mining (WCM) for Web Applications

dc.contributor.author R, Raghavendra
dc.contributor.author M, Dr. Niranjanamurthy
dc.date.accessioned 2021-07-27T06:20:54Z
dc.date.available 2021-07-27T06:20:54Z
dc.date.issued 2021-07-27
dc.identifier.issn 2210-142X
dc.identifier.uri https://journal.uob.edu.bh:443/handle/123456789/4354
dc.description.abstract In the digital era, humans and machines generate huge volumes of data, which are accessed through websites on the internet. Most transactions involve web product items, web news, and web advertisements. Web Information Extraction (WIE) is the technique of extracting information from websites accurately and within a given time using the Web Content Mining (WCM) concept. New data is generated every second in different locations, and the contents of websites change rapidly at various intervals during processing. The live time and location of the data change each time internet users work with web applications. Extracting information from a web page or website with good accuracy and low latency is therefore challenging. Classic algorithms and data mining techniques are used to preprocess the generated data within a certain time, but the validity of the results is not maintained on the web server. Their special features can nevertheless be adopted for extraction using web mining techniques. Recent advances such as Deep Learning with Recurrent Neural Networks (RNN) are used to perform Web Information Extraction on various websites over a large network by holding the data status at each second in memory during processing. The Long Short-Term Memory (LSTM) technique holds this status in intermediate memory, and all data generated in web applications passes the status to the RNN for further classification. Classification methods are used in Artificial Neural Networks (ANN), which are trained on input data from the large network and segregate it based on the algorithms the user selects. Finally, the deep learning concept is combined with recent trends, with input models serving as an embedded layer. With this technique, social media information stays up to date, and its originality and validity are tracked fully across larger networks.
This paper suggests the best methods for implementing web information extraction in web content mining across different websites on larger clusters/networks using deep learning LSTM techniques. en_US
dc.language.iso en en_US
dc.publisher University of Bahrain en_US
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 International *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/4.0/ *
dc.subject Web Content Mining (WCM) en_US
dc.subject WIE (Web Information Extraction) en_US
dc.subject RNN (Recurrent Neural Network) en_US
dc.subject LSTM (Long Short-Term Memory) en_US
dc.subject ANN (Artificial Neural Network) en_US
dc.subject web server en_US
dc.title Web Information Extraction methods using Web Content Mining (WCM) for Web Applications en_US
dc.identifier.doi https://dx.doi.org/10.12785/ijcds/110149
dc.contributor.authorcountry India en_US
dc.contributor.authorcountry India en_US
dc.contributor.authoraffiliation Ramaiah Institute Of Technology, M S Ramaiah Nagar, Bengaluru, Karnataka & Visvesvaraya Technological University (VTU) Jnana Sangama Machhe, Belgaum, Karnataka en_US
dc.contributor.authoraffiliation Ramaiah Institute Of Technology, M S Ramaiah Nagar, Bengaluru, Karnataka & Visvesvaraya Technological University (VTU) Jnana Sangama Machhe, Belgaum, Karnataka en_US
dc.source.title International Journal of Computing and Digital Systems en_US
dc.abbreviatedsourcetitle IJCDS en_US


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International.
