Outsmarting Phishers: A Comparative Analysis of Machine Learning Techniques

Keywords

Phishing Detection
Machine Learning
Website security
URL
Random Forest
XGBoost
Artificial Neural Network (ANN)
Recurrent Neural Network (RNN)
Classification Algorithm
Kaggle Datasets
Cybersecurity

How to Cite

Outsmarting Phishers: A Comparative Analysis of Machine Learning Techniques. (2025). Journal of Information Technology and Digital World, 6(4), 347-361. https://doi.org/10.36548/jitdw.2024.4.003

Abstract

Phishing attacks threaten internet security by stealing confidential data and money. To help prevent phishing, this work presents a comparative study of leading machine learning methods for phishing website detection. It analyses the performance of Artificial Neural Network (ANN), Recurrent Neural Network (RNN), XGBoost, and Random Forest algorithms in identifying phishing websites using a Kaggle dataset. These algorithms were selected for their ability to uncover intricate associations and patterns in website information. The study examines the advantages and disadvantages of each algorithm and compares them on accuracy, precision, recall, F1 score, and computational efficiency. The comparison identifies the most effective algorithm for phishing detection, which can be useful to scholars and practitioners working to improve online security. The research helps lay the foundations for attack prevention and supports the protection of sensitive information online. The study demonstrates the effectiveness of machine learning in cybersecurity, with particular focus on the algorithms and how they can be optimized.
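
The evaluation criteria named in the abstract (accuracy, precision, recall, and F1 score) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `classification_metrics`, the `Metrics` container, and the toy labels are hypothetical, and the label 1 is assumed to mark a phishing site.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    `positive` is the label treated as "phishing" (an assumption of this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Guard every ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only; they are not results from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

Applying the same function to the predictions of each classifier on a shared held-out test set would give directly comparable scores; computational efficiency would need to be measured separately, for example by timing training and inference.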
