User Behaviour Analysis Using Machine Learning

N.Ranogna, Department of Computer Science, Mahatma Gandhi Institute of Technology, Hyderabad, India

N.Sanjana, Department of Computer Science, Mahatma Gandhi Institute of Technology, Hyderabad, India

K.Madhubabu, Assistant Professor, Department of Computer Science, Mahatma Gandhi Institute of Technology, Hyderabad, India

B.Chandrashekar, Assistant Professor, Department of Computer Science, Mahatma Gandhi Institute of Technology, Hyderabad, India

Vol 10 No 5 (2026): Volume 10, Issue 5, May 2026 | Pages: 610-617

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 29-05-2026

DOI: doi.org/10.47001/IRJIET/2026.105082

Abstract

This project presents a machine learning-based system for analyzing e-commerce user behaviour and predicting purchase intention. The dataset consists of 49,999 interaction events across 12,438 sessions and 10,537 users, with a purchase rate of 3.92%, indicating severe class imbalance. Four models (Logistic Regression, Decision Tree, Random Forest, and a Neural Network) were trained and evaluated. Logistic Regression achieved the best performance, with an F1-score of 97.51% and an AUC of 99.98%. The system includes a Flask-based web application that provides real-time predictions along with business insights such as risk segmentation and recommended actions.
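As a rough illustration of why a 3.92% purchase rate makes F1-score more informative than accuracy, the following minimal sketch scores a trivial "never predicts a purchase" baseline. The sample size of 10,000 examples (392 of them purchases) and all function names are assumptions chosen for illustration; they are not the dataset's actual labels or the authors' code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means purchase."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Hypothetical sample (assumption): 10,000 examples, 3.92% of them purchases.
N_TOTAL = 10_000
N_PURCHASE = 392
y_true = [1] * N_PURCHASE + [0] * (N_TOTAL - N_PURCHASE)

# Baseline that always predicts "no purchase".
y_majority = [0] * N_TOTAL

baseline_accuracy = accuracy(y_true, y_majority)
baseline_f1 = f1_score(y_true, y_majority)

print(f"baseline accuracy: {baseline_accuracy:.4f}")
print(f"baseline F1:       {baseline_f1:.4f}")
```

Under these assumptions the baseline is right on about 96% of examples while its F1-score is zero, because it never identifies a single purchase. This is why a metric focused on the positive class, such as F1-score, is the more meaningful one to compare across the four models.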

Keywords

E-commerce Analytics, User Behaviour Analysis, Purchase Intention Prediction, Machine Learning, Logistic Regression, Random Forest, Neural Network, Decision Tree, Predictive Analytics, Customer Segmentation, Class Imbalance, Web Application.


Citation of this Article

N.Ranogna, N.Sanjana, K.Madhubabu, & B.Chandrashekar. (2026). User Behaviour Analysis Using Machine Learning. International Research Journal of Innovations in Engineering and Technology - IRJIET, 10(5), 610-617. Article DOI https://doi.org/10.47001/IRJIET/2026.105082
