Phishing Website Classification using Machine Learning with Different Datasets

BOUIJIJ, Habiba; BERQIA, Amine

doi:http://dx.doi.org/10.12785/ijcds/1501115

dc.contributor.author	BOUIJIJ, Habiba
dc.contributor.author	BERQIA, Amine
dc.date.accessioned	2024-01-07T21:04:14Z
dc.date.available	2024-01-07T21:04:14Z
dc.date.issued	2024-05-01
dc.identifier.issn	2210-142X
dc.identifier.uri	https://journal.uob.edu.bh:443/handle/123456789/5302
dc.description.abstract	Classifying phishing websites by analyzing their URLs is a technique used to enhance systems designed to detect malicious websites. However, phishing sites have grown more sophisticated, which makes proactive detection harder. This article focuses on applying deep learning models and machine learning techniques to the lexical analysis of URLs in order to classify, detect, and preventively mitigate phishing websites. Our study evaluates a selection of commonly used machine learning algorithms, specifically Random Forest, K-Nearest Neighbors, Support Vector Machines, Gradient Boosting, Decision Tree, Bagging, AdaBoost and ExtraTree, as well as a deep neural network model. To assess the effectiveness of these algorithms and models, we conduct our analysis on two distinct URL datasets, one from 2016 and the other from 2021. Through lexical analysis, we extract significant features from the URLs and then calculate the accuracy of each algorithm on both datasets. Our results reveal that some algorithms achieve accuracy scores of up to 99% on the 2016 dataset. However, this score falls below 91% on the dataset collected in 2021.	en_US
dc.language.iso	en	en_US
dc.publisher	University of Bahrain	en_US
dc.subject	Phishing, URL, Classification, Machine Learning, Deep Learning, Dataset, Accuracy metric	en_US
dc.title	Phishing Website Classification using Machine Learning with Different Datasets	en_US
dc.identifier.doi	http://dx.doi.org/10.12785/ijcds/1501115
dc.volume	15	en_US
dc.issue	1	en_US
dc.pagestart	1627	en_US
dc.pageend	1636	en_US
dc.contributor.authorcountry	Rabat, Morocco	en_US
dc.contributor.authorcountry	Rabat, Morocco	en_US
dc.contributor.authoraffiliation	SSL Lab, ENSIAS Mohammed V University in Rabat	en_US
dc.contributor.authoraffiliation	SSL Lab, ENSIAS Mohammed V University in Rabat	en_US
dc.source.title	International Journal of Computing and Digital Systems	en_US
dc.abbreviatedsourcetitle	IJCDS	en_US
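
The abstract describes extracting lexical features from URLs and scoring classifiers by accuracy. The following is a minimal sketch of what such a step might look like, not the authors' implementation. The feature set, the function names `lexical_features` and `accuracy`, and the example URLs are all assumptions chosen for illustration; the record does not list the features actually used in the paper.

```python
import re
from urllib.parse import urlparse

# Matches dotted-quad hosts such as 192.168.0.1 (illustrative check only).
_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def lexical_features(url: str) -> dict:
    """Return a hypothetical set of lexical features for one URL."""
    # Add a scheme when missing so urlparse can locate the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    is_ip = bool(_IPV4.match(host))
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "host_is_ipv4": int(is_ip),
        "uses_https": int(parsed.scheme == "https"),
        # Labels before the registered domain; crude, ignores multi-part TLDs.
        "num_subdomains": 0 if is_ip else max(host.count(".") - 1, 0),
    }


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    for example in ("https://www.example.com/a",
                    "http://192.168.0.1/login-update@secure"):
        print(example, lexical_features(example))
```

The resulting dictionaries could be turned into feature vectors and passed to any of the classifiers named in the abstract; comparing `accuracy` on a 2016 and a 2021 dataset would then mirror the evaluation the abstract reports.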