Machine Learning Driven Football Predictions

Abstract

In this study, we analyzed player performance in 864 Qatar Stars League (QSL) matches (2012-2019) to determine the key factors influencing match outcomes. Using a machine learning framework, we classified match results and identified the performance metrics that distinguish winning teams from losing ones. Logistic regression emerged as the top model, achieving over 80% accuracy. Key features included opponent analysis, player market value prediction, player profiling, tactical pattern analysis, injury prevention, and team performance metrics. Notably, defenders' roles and fair play significantly affected match outcomes, and player performance from the last five seasons provided strong predictive power for future matches.

Feature Selection: Multiple feature selection methods were used to identify the critical performance metrics that contribute to match outcomes, improving the accuracy of the prediction model.

Defensive Importance: The analysis highlighted the significant role of defenders in match results, challenging the common focus on attacking players.

Fair Play Impact: Teams that played fairly, committing fewer fouls and receiving fewer cards, were more likely to win, showing the effect of discipline on success.

Historical Data Utility: Performance data from the last five seasons provided enough predictive power to forecast the winner of upcoming matches.

Model Generalization: Given its predictive accuracy, the machine learning framework showed strong potential for application to other leagues and competitions.
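To illustrate the kind of classifier the abstract refers to, the following is a minimal sketch of logistic regression trained by batch gradient descent. It is not the authors' implementation or data: the three features (foul, card, and tackles-won differences between a team and its opponent), the generating rule, and all function names are assumptions chosen only to mirror the fair-play and defensive themes above, and the matches are synthetic.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one match: p(win) minus the true label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    # Classify as a win (1) when the estimated probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def make_synthetic_matches(n, rng):
    """SYNTHETIC data only (not QSL data). Features are hypothetical,
    standardized differences versus the opponent: fouls, cards, tackles won."""
    X, y = [], []
    for _ in range(n):
        fouls, cards, tackles = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Assumed generating rule: fewer fouls/cards and more tackles help.
        logit = -1.0 * fouls - 0.8 * cards + 1.2 * tackles
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append([fouls, cards, tackles])
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic_matches(400, rng)
X_test, y_test = make_synthetic_matches(200, rng)
w, b = fit_logistic(X_train, y_train)
accuracy = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print("weights:", [round(v, 2) for v in w], "test accuracy:", round(accuracy, 2))
```

The signs of the learned weights indicate the direction in which each feature moves the predicted win probability, which is one simple way to read which metrics separate winners from losers. The accuracy printed here reflects only the synthetic generating rule and says nothing about the accuracy reported in the study.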

Country: India

1 Vivek Patil, 2 Akash Shetty, 3 Soham Tonape, 4 Prof. D.G. Modani

  1. Student, Department of Computer Engineering, PES’s Modern College of Engineering, Pune, Maharashtra, India
  2. Student, Department of Computer Engineering, PES’s Modern College of Engineering, Pune, Maharashtra, India
  3. Student, Department of Computer Engineering, PES’s Modern College of Engineering, Pune, Maharashtra, India
  4. Asst. Professor, Department of Computer Engineering, PES’s Modern College of Engineering, Pune, Maharashtra, India

IRJIET, Volume 9, Issue 5, May 2025 pp. 442-446

doi.org/10.47001/IRJIET/2025.905049
