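Does "Be Concise" Save Water? Measuring the Effect of Prompt Design on the Energy and Water Footprint of Open-Weight LLM Inference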

Isha Gautam Sontakke; Unnati Nitin Shrivastava; Sahiba Kamal Siddiqui; Shrunkhal Moreshwar Supale; Pushpa Tandekar

doi:https://doi.org/10.47001/IRJIET/2026.105004

Isha Gautam Sontakke, Student, Computer Science and Engineering, Shri Sai College of Engineering and Technology, Bhadrawati, Chandrapur, India
Unnati Nitin Shrivastava, Student, Computer Science and Engineering, Shri Sai College of Engineering and Technology, Bhadrawati, Chandrapur, India
Sahiba Kamal Siddiqui, Student, Computer Science and Engineering, Shri Sai College of Engineering and Technology, Bhadrawati, Chandrapur, India
Shrunkhal Moreshwar Supale, Student, Computer Science and Engineering, Shri Sai College of Engineering and Technology, Bhadrawati, Chandrapur, India
Pushpa Tandekar, Professor, Computer Science and Engineering, Shri Sai College of Engineering and Technology, Bhadrawati, Chandrapur, India

Vol 10 No 5 (2026): Volume 10, Issue 5, May 2026 | Pages: 22-28

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 05-05-2026

Abstract

Every text query sent to a modern large language model evaporates a small but measurable quantity of fresh water, directly through data-centre cooling and indirectly through power generation. At the scale of a popular consumer API, this aggregates to volumes equivalent to the household water consumption of a small town. We ask a narrow, practical question: by how much can a user reduce that footprint simply by changing how the prompt is written? We run a controlled experiment across four open-weight models served by OpenRouter, twenty standardised prompts spanning factual recall, reasoning, summarisation, and coding, and three prompting conditions, for a total of 266 controlled inferences. We separate direct (on-site cooling) water from indirect (grid electricity) water, an accounting distinction that the academic literature treats as essential [2, 3] but that corporate sustainability disclosures routinely collapse [6]. On a fully sampled 20-billion-parameter model, prompts that ask for shorter answers reduce output tokens by 62-65% and water by 54-56% relative to an unconstrained baseline, with no measurable quality loss across all four task categories. Two cross-model findings sharpen the picture. On a 1.2-billion-parameter edge model, the same instruction reduces tokens but causes a quality cliff under one phrasing and not the other. On a 30-billion-parameter reasoning-tuned MoE model, an instruction to “answer in under 50 words” increases output tokens by 24%; the model interprets the instruction as a request for more careful reasoning rather than for shorter output. Prompt design is a real, immediately deployable user-side lever for AI sustainability; it is also an architecturally fragile one whose effect must be characterised per model class rather than assumed.
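The direct/indirect split described in the abstract can be illustrated with a small calculation. The sketch below follows the commonly used accounting form in which direct water is IT energy times an on-site water usage effectiveness (WUE), and indirect water is IT energy times power usage effectiveness (PUE) times an off-site water intensity of electricity. It is a minimal sketch, not the authors' pipeline: every coefficient, the class `WaterCoefficients`, and the functions `water_footprint_ml` and `relative_reduction` are hypothetical names and placeholder values chosen for illustration. The fixed per-request energy term is likewise an assumption; it shows one way a water saving could come out smaller than the token saving, and the source does not state that this is the cause of the gap it reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaterCoefficients:
    """Illustrative placeholders only; none of these values come from the paper."""
    energy_wh_per_output_token: float  # assumed IT energy per generated token (Wh)
    fixed_wh_per_request: float        # assumed per-request overhead, e.g. prompt processing (Wh)
    pue: float                         # power usage effectiveness (dimensionless)
    wue_onsite_l_per_kwh: float        # on-site cooling water per kWh of IT energy (L/kWh)
    wue_offsite_l_per_kwh: float       # water per kWh of grid electricity (L/kWh)


def water_footprint_ml(output_tokens: int, c: WaterCoefficients) -> dict:
    """Split one inference's water use into direct and indirect parts (millilitres)."""
    it_energy_kwh = (
        c.fixed_wh_per_request + output_tokens * c.energy_wh_per_output_token
    ) / 1000.0
    # Direct: water evaporated by on-site cooling, scaled by IT energy.
    direct_l = it_energy_kwh * c.wue_onsite_l_per_kwh
    # Indirect: water used to generate the facility's electricity (IT energy x PUE).
    indirect_l = it_energy_kwh * c.pue * c.wue_offsite_l_per_kwh
    return {
        "direct_ml": direct_l * 1000.0,
        "indirect_ml": indirect_l * 1000.0,
        "total_ml": (direct_l + indirect_l) * 1000.0,
    }


def relative_reduction(baseline: float, treated: float) -> float:
    """Fractional reduction of `treated` relative to `baseline`."""
    return 1.0 - treated / baseline


if __name__ == "__main__":
    # Hypothetical coefficients and token counts, for demonstration only.
    coeffs = WaterCoefficients(
        energy_wh_per_output_token=0.002,
        fixed_wh_per_request=0.1,
        pue=1.2,
        wue_onsite_l_per_kwh=1.0,
        wue_offsite_l_per_kwh=3.0,
    )
    baseline = water_footprint_ml(500, coeffs)  # unconstrained prompt
    concise = water_footprint_ml(200, coeffs)   # prompt asking for a short answer
    print("token reduction:", relative_reduction(500, 200))
    print("water reduction:", relative_reduction(baseline["total_ml"], concise["total_ml"]))
    print("baseline split (mL):", baseline)
```

Reporting `direct_ml` and `indirect_ml` separately, rather than only the total, keeps the two components visible, so that a change in grid mix or cooling technology can be traced to the term it actually affects.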

Keywords

Large Language Model Inference, Prompt Engineering, Water Footprint, Energy Efficiency, Sustainable AI, Token Reduction, Open-Weight Models, OpenRouter, Water Usage Effectiveness, Green Computing

Citation of this Article

Isha Gautam Sontakke, Unnati Nitin Shrivastava, Sahiba Kamal Siddiqui, Shrunkhal Moreshwar Supale, & Pushpa Tandekar. (2026). Does "Be Concise" Save Water? Measuring the Effect of Prompt Design on the Energy and Water Footprint of Open-Weight LLM Inference. International Research Journal of Innovations in Engineering and Technology - IRJIET, 10(5), 22-28. Article DOI https://doi.org/10.47001/IRJIET/2026.105004

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

References

[1] N. Jegham et al., "How Hungry is AI? Benchmarking the Energy and Water Footprint of LLM Inference Across 30 Models," arXiv preprint, 2025.
[2] S. Ren et al., "Making AI Less Thirsty: A Methodological Critique of Top-Down Water Footprint Estimates," Communications of the ACM, 2024.
[3] A. de Vries, "The Hidden Resource Cost of Generative AI Infrastructure," Joule / ScienceDirect Patterns, 2025.
[4] P. Li, J. Yang, M. A. Islam, and S. Ren, "Making AI Less “Thirsty”: Uncovering and Addressing the Secret Water Footprint of AI Models," arXiv:2304.03271, 2023 (updated 2024).
[5] Lawrence Berkeley National Laboratory, "2024 United States Data Center Energy Usage Report," prepared for the U.S. Department of Energy, 2024.
[6] Microsoft, "Environmental Sustainability Report FY2025"; Google, "Environmental Report 2024"; Meta, "Sustainability Report 2024"; Amazon, "Sustainability Report 2024" (aggregated and discussed in [5]).
[7] NVIDIA, architectural disclosures for the H100, B200, and Rubin-class accelerators, 2024–2025.
[8] A. H. Khalaj and S. K. Halgamuge, "A Review of Cooling Technologies for High-Density Data Centres," 2025.
[9] S. Luccioni, Y. Jernite, and E. Strubell, "Power Hungry Processing: Watts Driving the Cost of AI Deployment?" in Proc. ACM FAccT, 2024.
[10] A. de Vries, "The Growing Energy Footprint of Artificial Intelligence," Joule, vol. 7, no. 10, pp. 2191–2194, 2023.
[11] "TokenPowerBench: Node-Level Energy Profiling of Large Language Model Inference," 2025.
[12] Morgan Stanley Research, "AI Infrastructure and Water Stress: A Geospatial Analysis," 2025; MSCI ESG Research, "Data Center Asset-Level Climate Risk Assessment," 2025.
[13] "Sprout: Carbon-Aware Token Routing for LLM Inference," 2024.
[14] J. Stojkovic et al., "Energy-per-Token Routing in Production LLM Serving," in Proc. EuroMLSys, 2025.
[15] Google, "Methodology for Estimating Per-Prompt Energy and Water Consumption of Gemini Models," 2025.
[16] International Energy Agency, "Electricity 2024" and "World Energy Outlook 2025."
