Comparative Analysis of Deep Learning Algorithms for Phishing Email Detection

Abstract

This study evaluates the effectiveness of several sequential models for binary classification on imbalanced sequential data. The dataset consists of 13,055 training samples and 5,595 testing samples, and its uneven label distribution makes the task considerably harder. To address this, the researchers apply preprocessing steps that include text encoding and the handling of missing values. Several neural network architectures are employed and examined: CNN, CNN-RNN, and RCNN. Each architecture is assessed on the binary classification task to reveal its advantages and disadvantages. The core of the study is its evaluation approach, which reports a range of measures and indicates consistently strong performance overall. The best-performing algorithms reach 97% accuracy and score well on a variety of evaluation metrics. This performance shows that they can handle sequential data with imbalanced labels, and it sets a reference point for further work in related fields. Beyond its empirical results, the study offers an assessment approach that practitioners facing similar problems can use as a benchmark. By clarifying key considerations in model selection and performance evaluation, it gives professionals and academics practical guidance for sequential modelling.
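As a rough illustration of why evaluation on imbalanced labels calls for more than one measure, the sketch below computes accuracy, precision, recall, and F1 from binary predictions. The abstract does not list the metrics the authors used, so these four are assumed here as typical choices. The function name `binary_metrics`, the label convention (1 = phishing, 0 = legitimate), and the toy labels are hypothetical and do not come from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels.

    Assumption: label 1 marks the positive (phishing) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: 9 legitimate emails and 1 phishing email.
y_true = [0] * 9 + [1]
always_legit = [0] * 10  # a classifier that never flags anything

print(binary_metrics(y_true, always_legit))
```

On this toy data the trivial classifier is correct on nine of ten samples, yet its recall on the phishing class is zero, which is why accuracy alone can mislead when labels are imbalanced.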

Country: Iraq

Raweia S Mohamed Ali (1), Razn A. Abduhameed (2)

  1. Computer Technology Engineering Department, Northern Technical University, Mosul, Iraq
  2. Computer Technology Engineering Department, Northern Technical University, Mosul, Iraq

IRJIET, Volume 8, Issue 7, July 2024, pp. 53-61

doi.org/10.47001/IRJIET/2024.807005
