Impact Factor (2025): 6.9
DOI Prefix: 10.47001/IRJIET
Vol 9 No 25 (2025): Volume 9, Special Issue of INSPIRE’25 April 2025 | Pages: 134-140
International Research Journal of Innovations in Engineering and Technology
OPEN ACCESS | Research Article | Published Date: 24-04-2025
The rapid growth of social media platforms such as YouTube, Facebook, Twitter, and TikTok has revolutionized communication but has also led to an increase in spam and harmful content. Detecting spam comments automatically is crucial to maintaining a safe and engaging digital environment. This study proposes a spam detection model that combines Natural Language Processing (NLP) with XGBoost, a gradient boosting algorithm known for its efficiency and predictive accuracy. The model is trained on a dataset of YouTube comments and uses text preprocessing techniques such as tokenization, stopword removal, and lemmatization to improve detection accuracy. The proposed NLP-XGBoost model achieves 94% accuracy in classifying spam and non-spam comments and is compared against traditional classifiers such as Naïve Bayes and Linear SVM. The results demonstrate the potential of machine learning for improving content moderation and safeguarding online interactions.
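The three preprocessing steps named above (tokenization, stopword removal, lemmatization) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stopword set is a tiny hand-picked sample, and the `LEMMAS` lookup table is a hypothetical stand-in for a real lemmatizer, used only to keep the example self-contained.

```python
import re

# Tiny illustrative stopword list (an assumption; a real pipeline
# would use a much fuller list from an NLP library).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "and", "of",
             "in", "on", "my", "this", "for", "out"}

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"videos": "video", "channels": "channel",
          "subscribed": "subscribe", "loved": "love"}

# Runs of lowercase letters, digits, or apostrophes count as one token.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the comment and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword set."""
    return [t for t in tokens if t not in STOPWORDS]


def lemmatize(tokens):
    """Map each token to its base form when the table knows it."""
    return [LEMMAS.get(t, t) for t in tokens]


def preprocess(comment):
    """Apply the three steps in order: tokenize, filter, lemmatize."""
    return lemmatize(remove_stopwords(tokenize(comment)))


if __name__ == "__main__":
    print(preprocess("Check out my channel for FREE videos!!!"))
```

In a full pipeline, the cleaned tokens would typically be turned into numeric features (for example, word counts or TF-IDF weights) before being passed to a classifier such as XGBoost.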
Spam Detection, NLP, XGBoost, Social Media, Text Classification, Machine Learning, Content Moderation, YouTube Comments
B. Dhashvanth Sai, E. Bhargavi, G. Srija Naidu, G. Aditya Srinivas, & A. Vimal Kumar. (2025). Automated Spam Detection in YouTube Comments: A Natural Language Processing and Gradient Boosting Approach. In Proceedings of the International Conference on Sustainable Practices and Innovations in Research and Engineering (INSPIRE'25), published by IRJIET, Volume 9, Special Issue of INSPIRE’25, pp. 134-140. Article DOI: https://doi.org/10.47001/IRJIET/2025.INSPIRE22
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.