Abstract
Phishing attacks threaten internet security by stealing confidential data and money. To help prevent phishing, this study presents a comparative analysis of widely used machine learning methods for phishing website detection. It evaluates the performance of artificial neural network (ANN), recurrent neural network (RNN), XGBoost, and Random Forest algorithms in identifying phishing websites using a dataset from Kaggle. These algorithms were selected for their ability to uncover intricate associations and patterns in website data. The study examines the advantages and disadvantages of each algorithm and compares them on accuracy, precision, recall, F1 score, and computational efficiency. The comparison identifies the most effective algorithm for phishing detection, which can be useful to researchers and practitioners working to improve online security. The research helps lay the foundations for attack prevention and supports the protection of sensitive online information. It also demonstrates the effectiveness of machine learning in cybersecurity, with particular focus on the algorithms and how they can be optimized.
