AI-Based Machine Learning System for State-Level SDG Performance Forecasting and Risk Classification in India

Pratap Patil, Student, Department of Computer Science and Engineering (Artificial Intelligence & Machine Learning), D Y Patil International University, Akurdi, Pune-411044, Maharashtra, India

Samarth Kulkarni, Student, Department of Computer Science and Engineering (Artificial Intelligence & Machine Learning), D Y Patil International University, Akurdi, Pune-411044, Maharashtra, India

Parth Deshpande, Student, Department of Computer Science and Engineering (Artificial Intelligence & Machine Learning), D Y Patil International University, Akurdi, Pune-411044, Maharashtra, India

Gaurav Kumar Singh, Assistant Professor, Department of Computer Science and Engineering (Artificial Intelligence & Machine Learning), D Y Patil International University, Akurdi, Pune-411044, Maharashtra, India

Vol 10 No 5 (2026): Volume 10, Issue 5, May 2026 | Pages: 381-389

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 19-05-2026

DOI: https://doi.org/10.47001/IRJIET/2026.105051

Abstract

NITI Aayog has released the SDG India Index, India’s official Sustainable Development Goal (SDG) index, yearly since 2018, creating a valuable time series of state-level sustainability metrics. Until now, however, the index has been used almost exclusively to analyze past performance. State-level policymakers and administrators currently have no way to foresee which states are at risk of falling behind in their development, nor can they quantitatively identify where to focus resources to head off these risks before they become critical. This paper introduces SDG Forecast Dashboard, a fully automated AI-driven forecasting and risk classification pipeline that estimates state-level scores on three chosen SDGs (SDG 3: Good Health and Well-being, SDG 4: Quality Education, and SDG 13: Climate Action) for the next three years (2024-2026). The models are trained per state using linear regression on six years of official NITI Aayog index data (2018-2023). Forecast scores are assigned to risk categories (Low / Medium / High Risk) using NITI Aayog’s own threshold-based system, and the results are deployed to a public Streamlit dashboard designed for ease of use by policymakers and other non-technical audiences. On a held-out test period of 2022-2023, linear regression consistently beats a last-value prediction baseline on RMSE and MAE for all three SDGs, with average RMSE of 3.8, 4.1, and 4.6 and R² of 0.74, 0.71, and 0.68 for SDGs 3, 4, and 13, respectively. To our knowledge, this is the first publicly available sub-national SDG forecasting system for India that combines a longitudinal view of officially published NITI Aayog data with a public-facing dashboard.
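
The pipeline summarized in the abstract (a per-state linear trend, a three-year forecast, threshold-based risk labels, and a comparison against a last-value baseline) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the risk cut-offs of 50 and 65, and the sample scores are all assumptions made for illustration; the abstract does not state the actual thresholds, and the numbers below are not real index data.

```python
import numpy as np

def fit_and_forecast(years, scores, future_years):
    """Fit score = slope * (year - base) + intercept and extrapolate."""
    years = np.asarray(years, dtype=float)
    base = years[0]  # centre the years to keep the least-squares fit well conditioned
    slope, intercept = np.polyfit(years - base, scores, deg=1)
    preds = slope * (np.asarray(future_years, dtype=float) - base) + intercept
    return np.clip(preds, 0.0, 100.0)  # index scores lie on a 0-100 scale

def classify_risk(score, high_below=50.0, low_from=65.0):
    """Map a score to a risk label. The cut-offs are hypothetical placeholders."""
    if score < high_below:
        return "High Risk"
    if score < low_from:
        return "Medium Risk"
    return "Low Risk"

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Made-up series for one state and one SDG (illustrative only).
years = np.array([2018, 2019, 2020, 2021, 2022, 2023])
scores = np.array([52.0, 54.0, 56.0, 58.0, 60.0, 62.0])

# Forecast 2024-2026 from all six years and label each forecast.
forecast = fit_and_forecast(years, scores, [2024, 2025, 2026])
labels = [classify_risk(s) for s in forecast]

# Hold-out check: train on 2018-2021, test on 2022-2023.
train = years <= 2021
lr_pred = fit_and_forecast(years[train], scores[train], years[~train])
naive_pred = np.full(int((~train).sum()), scores[train][-1])  # last-value baseline

print("forecast:", forecast, labels)
print("linear   RMSE/MAE:", rmse(scores[~train], lr_pred), mae(scores[~train], lr_pred))
print("baseline RMSE/MAE:", rmse(scores[~train], naive_pred), mae(scores[~train], naive_pred))
```

In a full system the same steps would presumably run once per state and per SDG, with the error metrics averaged across states. With only a handful of annual observations per series, a two-parameter linear trend is one of the few models that can be fit without obvious overfitting, which may be why it is compared only against a last-value baseline.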

Keywords

Sustainable Development Goals, SDG India Index, machine learning forecasting, linear regression, NITI Aayog, risk classification, Streamlit dashboard, state-level analysis, policy analytics, India.


Citation of this Article

Pratap Patil, Samarth Kulkarni, Parth Deshpande, & Gaurav Kumar Singh. (2026). AI-Based Machine Learning System for State-Level SDG Performance Forecasting and Risk Classification in India. International Research Journal of Innovations in Engineering and Technology - IRJIET, 10(5), 381-389. Article DOI https://doi.org/10.47001/IRJIET/2026.105051

References
United Nations. (2015). Transforming our world: The 2030 agenda for sustainable development. United Nations. https://sdgs.un.org/2030agenda

NITI Aayog. (2024). SDG India Index 2023–24. Government of India. https://www.niti.gov.in/sites/default/files/2024-07/SDG_India_Index_2023-24.pdf

Vinuesa, R., Azizpour, H., Leite, I., Balaam, M., Dignum, V., Domisch, S., Felländer, A., Langhans, S. D., Tegmark, M., & Fuso Nerini, F. (2020). The role of artificial intelligence in achieving the Sustainable Development Goals. Nature Communications, 11(1), 233. https://doi.org/10.1038/s41467-019-14108-y

Pradhan, P., Costa, L., Rybski, D., Lucht, W., & Kropp, J. P. (2017). A systematic study of Sustainable Development Goal (SDG) interactions. Earth’s Future, 5(11), 1169–1179. https://doi.org/10.1002/2017EF000632

Chenary, N., Golkarian, A., & Naghizadeh, M. (2024). Forecasting Sustainable Development Goal scores by 2030 using linear regression and ARIMAX models. Sustainable Development, 32(4), 3831–3845. https://doi.org/10.1002/sd.3037

Khan, M. R., Islam, M. S., & Rahman, M. (2025). Forecasting SDG 3 indicators using machine learning and ARIMA models: Evidence from Bangladesh. Healthcare, 13(4), 418. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11878945/

Fuso Nerini, F., Sovacool, B., Hughes, N., Cozzi, L., Cosgrave, E., Howells, M., Tavoni, M., Tomei, J., Zerriffi, H., & Milligan, B. (2019). Connecting climate action with other Sustainable Development Goals. Nature Sustainability, 2(8), 674–680. https://doi.org/10.1038/s41893-019-0334-y

Sachs, J., Lafortune, G., Fuller, G., & Drumm, E. (2023). Sustainable Development Report 2023. Sustainable Development Solutions Network. https://dashboards.sdgindex.org/

Authors. (2025). Analyzing Indian states’ SDG progress via complex network framework. arXiv Preprint. https://arxiv.org/abs/2501.05314

Alzubaidi, L., Zhang, J., Humaidi, A. J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., Santamaría, J., Fadhel, M. A., Al-Amidie, M., & Farhan, L. (2021). Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data, 8(1), 53. https://doi.org/10.1186/s40537-021-00444-8

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830. https://jmlr.org/papers/v12/pedregosa11a.html

James, G., Witten, D., Hastie, T., Tibshirani, R., & Taylor, J. (2023). An introduction to statistical learning with applications in Python. Springer. https://hastie.su.domains/ISLP/ISLP_website.pdf

McKinney, W. (2010). Data structures for statistical computing in Python. Proceedings of the 9th Python in Science Conference (SciPy 2010), 56–61. https://conference.scipy.org/proceedings/scipy2010/pdfs/mckinney.pdf

Park, A., Narechania, A., & Fulda, J. (2021). Streamlit: Rapidly building and sharing machine learning applications. arXiv Preprint. https://arxiv.org/abs/2105.03855

Dasgupta, R., Kumar, P., & Singh, A. (2025). Health SDGs at risk from climate change: Evidence from India. PLOS ONE. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12654917/