DOI Prefix: 10.47001/IRJIET
The rapid growth of social media platforms such as YouTube, Facebook, Twitter, and TikTok
has revolutionized communication but has also led to an increase in spam and
harmful content. Detecting spam comments automatically is crucial to
maintaining a safe and engaging digital environment. This study proposes a spam
detection model using Natural Language Processing (NLP) and XGBoost, a powerful
machine learning algorithm known for its high efficiency and predictive
accuracy. The model is trained on a dataset containing YouTube comments and
utilizes text preprocessing techniques such as tokenization, stopword removal,
and lemmatization to enhance detection accuracy. Evaluated alongside traditional
classifiers such as Naïve Bayes and Linear SVM, the proposed NLP-XGBoost model
achieves 94% accuracy in classifying spam and non-spam comments. The results
demonstrate the potential of machine learning in improving content moderation
and safeguarding online interactions.
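
As a rough illustration of the preprocessing steps named above (tokenization, stopword removal, and lemmatization), the following minimal sketch uses only the Python standard library. It is not the study's implementation: the names `STOPWORDS`, `LEMMAS`, `tokenize`, and `preprocess` are hypothetical, the stopword list is a tiny hand-picked sample, and lemmatization is approximated with a small lookup table rather than a full dictionary-backed lemmatizer.

```python
import re

# Assumption: a tiny illustrative stopword list. A real pipeline would use a
# complete list from an NLP library.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "and", "my", "this",
             "for", "of", "on", "out"}

# Assumption: a toy lemma lookup table standing in for a real lemmatizer.
LEMMAS = {
    "videos": "video",
    "channels": "channel",
    "subscribers": "subscriber",
    "links": "link",
    "checking": "check",
    "loved": "love",
}

# Lowercase word characters, digits, and apostrophes form a token.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the comment and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def preprocess(comment):
    """Tokenize, drop stopwords, then map each token to its lemma if known."""
    tokens = tokenize(comment)
    kept = [t for t in tokens if t not in STOPWORDS]
    return [LEMMAS.get(t, t) for t in kept]


if __name__ == "__main__":
    print(preprocess("Check out my channels for FREE videos!"))
```

In a pipeline like the one the abstract describes, the cleaned tokens would then be converted to numeric features and passed to the classifier; that stage is omitted here because the abstract does not specify the feature representation.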
Country: India
IRJIET, Volume 9, Special Issue of INSPIRE’25, April 2025, pp. 134-140