Entropy-Regularized Deep Reinforcement Learning for Stochastic Voltage Regulation under High Renewable Penetration

Tarun Kumar Modi, Department of Electrical Engineering, Sardar Patel University, Balaghat, India
Naresh Sapate, Department of Electrical Engineering, Sardar Patel University, Balaghat, India
Shailendra Turkar, Department of Electrical Engineering, Sardar Patel University, Balaghat, India

Vol 10 No 5 (2026): Volume 10, Issue 5, May 2026 | Pages: 744-757

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 31-05-2026

DOI: doi.org/10.47001/IRJIET/2026.105100

Abstract

The rapid growth of renewable energy sources, particularly solar photovoltaic and wind generation, has fundamentally changed the operating characteristics of modern power distribution networks. The inherent variability and limited predictability of these resources create substantial challenges for voltage regulation, as traditional control schemes struggle to respond effectively to fast and unpredictable fluctuations. This paper presents an entropy-regularized deep reinforcement learning framework designed specifically for stochastic voltage regulation in distribution grids with high renewable penetration. Unlike conventional reinforcement learning methods that converge to deterministic policies, the proposed approach maintains policy stochasticity through entropy regularization, which encourages exploration and improves robustness against the uncertainty introduced by renewable generation. We develop a Soft Actor-Critic based control agent that coordinates reactive power from smart inverters, on-load tap changers, and static var compensators to maintain voltage within acceptable bounds while accounting for the probabilistic nature of renewable output. The framework is validated through extensive simulations on a modified IEEE 33-bus test system with 65% renewable penetration. Results demonstrate that the entropy-regularized approach reduces voltage violations by 87% compared to rule-based control and achieves 23% better performance than standard deep reinforcement learning methods under highly variable generation conditions. The proposed method also exhibits superior generalization when tested on unseen scenarios with different renewable generation patterns.
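
For readers unfamiliar with entropy regularization, the block below sketches the standard maximum-entropy objective used by Soft Actor-Critic, as given in the general form by Haarnoja et al. (2018). It is included only as background. The symbols (policy \pi, state s_t, action a_t, reward r, temperature \alpha, state-action distribution \rho_\pi) are generic notation assumed here, and the specific state, action, and reward definitions used for voltage regulation in this paper are not stated on this page.

```latex
% Generic maximum-entropy RL objective (background sketch, not the paper's exact formulation)
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
         \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

Setting \alpha = 0 recovers the conventional expected-return objective, while a larger \alpha weights the entropy term more heavily and so keeps the policy stochastic.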

Keywords

Deep reinforcement learning, entropy regularization, voltage regulation, renewable energy, distribution networks, soft actor-critic, stochastic control.


Citation of this Article

Tarun Kumar Modi, Naresh Sapate, & Shailendra Turkar. (2026). Entropy-Regularized Deep Reinforcement Learning for Stochastic Voltage Regulation under High Renewable Penetration. International Research Journal of Innovations in Engineering and Technology - IRJIET, 10(5), 744-757.

References

T. Haarnoja, A. Zhou, P. Abbeel, and S. Levine, “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor,” in Proc. Int. Conf. Mach. Learn., 2018, pp. 1861–1870.

T. Haarnoja, A. Zhou, K. Hartikainen, G. Tucker, S. Ha, J. Tan, V. Kumar, H. Zhu, A. Gupta, P. Abbeel, and S. Levine, “Soft actor-critic algorithms and applications,” arXiv preprint arXiv:1812.05905, 2018.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. Cambridge, MA, USA: MIT Press, 2018.

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis, “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015.

T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” in Proc. Int. Conf. Learn. Represent., 2016.

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” arXiv preprint arXiv:1707.06347, 2017.

B. D. Ziebart, “Modeling purposeful adaptive behavior with the principle of maximum causal entropy,” Ph.D. dissertation, Carnegie Mellon Univ., Pittsburgh, PA, USA, 2010.

Z. Ahmed, N. Le Roux, M. Norouzi, and D. Schuurmans, “Understanding the impact of entropy on policy optimization,” in Proc. Int. Conf. Mach. Learn., 2019, pp. 151–160.

B. Eysenbach and S. Levine, “Maximum entropy RL (provably) solves some robust RL problems,” in Proc. Int. Conf. Learn. Represent., 2022.

R. Tonkoski, D. Turcotte, and T. H. M. El-Fouly, “Impact of high PV penetration on voltage profiles in residential neighborhoods,” IEEE Trans. Sustain. Energy, vol. 3, no. 3, pp. 518–527, Jul. 2012.

K. Turitsyn, P. Sulc, S. Backhaus, and M. Chertkov, “Options for control of reactive power by distributed photovoltaic generators,” Proc. IEEE, vol. 99, no. 6, pp. 1063–1073, Jun. 2011.

IEEE Standard 1547-2018, “IEEE Standard for Interconnection and Interoperability of Distributed Energy Resources with Associated Electric Power Systems Interfaces,” IEEE, 2018.

P. Jahangiri and D. C. Aliprantis, “Distributed Volt/VAr control by PV inverters,” IEEE Trans. Power Syst., vol. 28, no. 3, pp. 3429–3439, Aug. 2013.

M. Farivar, R. Neal, C. Clarke, and S. Low, “Optimal inverter VAR control in distribution systems with high PV penetration,” in Proc. IEEE Power Energy Soc. Gen. Meeting, 2012, pp. 1–7.

E. Dall’Anese, S. V. Dhople, and G. B. Giannakis, “Optimal dispatch of photovoltaic inverters in residential distribution systems,” IEEE Trans. Sustain. Energy, vol. 5, no. 2, pp. 487–497, Apr. 2014.

S. H. Low, “Convex relaxation of optimal power flow—Part I: Formulations and equivalence,” IEEE Trans. Control Netw. Syst., vol. 1, no. 1, pp. 15–27, Mar. 2014.

M. E. Baran and F. F. Wu, “Network reconfiguration in distribution systems for loss reduction and load balancing,” IEEE Trans. Power Del., vol. 4, no. 2, pp. 1401–1407, Apr. 1989.

Q. Yang, G. Wang, A. Sadeghi, G. B. Giannakis, and J. Sun, “Two-timescale voltage control in distribution grids using deep reinforcement learning,” IEEE Trans. Smart Grid, vol. 11, no. 3, pp. 2313–2323, May 2020.

W. Wang, N. Yu, Y. Gao, and J. Shi, “Safe off-policy deep reinforcement learning algorithm for Volt-VAR control in power distribution systems,” IEEE Trans. Smart Grid, vol. 11, no. 4, pp. 3008–3018, Jul. 2020.

Y. Zhang, X. Wang, J. Wang, and Y. Zhang, “Deep reinforcement learning based Volt-VAR optimization in smart distribution systems,” IEEE Trans. Smart Grid, vol. 12, no. 1, pp. 361–371, Jan. 2021.

D. Cao, W. Hu, J. Zhao, G. Zhang, B. Zhang, Z. Liu, Z. Chen, and F. Blaabjerg, “Reinforcement learning and its applications in modern power and energy systems: A review,” J. Mod. Power Syst. Clean Energy, vol. 8, no. 6, pp. 1029–1042, Nov. 2020.

J. Duan, D. Shi, R. Diao, H. Li, Z. Wang, B. Zhang, D. Bian, and Z. Yi, “Deep-reinforcement-learning-based autonomous voltage control for power grid operations,” IEEE Trans. Power Syst., vol. 35, no. 1, pp. 814–817, Jan. 2020.

M. Sun, I. Konstantelos, and G. Strbac, “A deep learning-based feature extraction framework for system security assessment,” IEEE Trans. Smart Grid, vol. 10, no. 5, pp. 5007–5020, Sep. 2019.

Y. Xu, W. Zhang, W. Liu, and F. Ferrese, “Multiagent-based reinforcement learning for optimal reactive power dispatch,” IEEE Trans. Syst. Man Cybern. C, Appl. Rev., vol. 42, no. 6, pp. 1742–1751, Nov. 2012.

P. Christodoulou, “Soft actor-critic for discrete action settings,” arXiv preprint arXiv:1910.07207, 2019.

J. Fu, A. Kumar, M. Soh, and S. Levine, “Diagnosing bottlenecks in deep Q-learning algorithms,” in Proc. Int. Conf. Mach. Learn., 2019, pp. 2021–2030.

D. Hafner, T. Lillicrap, M. Norouzi, and J. Ba, “Mastering Atari with discrete world models,” in Proc. Int. Conf. Learn. Represent., 2021.

A. Kumar, A. Zhou, G. Tucker, and S. Levine, “Conservative Q-learning for offline reinforcement learning,” in Proc. Adv. Neural Inf. Process. Syst., 2020, pp. 1179–1191.

International Energy Agency, “World Energy Outlook 2023,” IEA Publications, Paris, 2023.

A. Kulmala, S. Repo, and P. Järventausta, “Coordinated voltage control in distribution networks including several distributed energy resources,” IEEE Trans. Smart Grid, vol. 5, no. 4, pp. 2010–2020, Jul. 2014.

Y. P. Agalgaonkar, B. C. Pal, and R. A. Jabr, “Distribution voltage control considering the impact of PV generation on tap changers and autonomous regulators,” IEEE Trans. Power Syst., vol. 29, no. 1, pp. 182–192, Jan. 2014.

H. Zhu and H. J. Liu, “Fast local voltage control under limited reactive power: Optimality and stability analysis,” IEEE Trans. Power Syst., vol. 31, no. 5, pp. 3794–3803, Sep. 2016.