


⚡ RL2Grid: Benchmarking RL for Power Grid Operations

RL2Grid is a realistic and standardized reinforcement learning benchmark for power grid operations, developed in close collaboration with major Transmission System Operators (TSOs). It builds upon Grid2Op and extends the widely used CleanRL framework to provide:

  • ✅ Standardized environments, state/action spaces, and reward structures
  • ♻️ Realistic transition dynamics incorporating stochastic grid events and human heuristics
  • ⚠️ Safe RL tasks via constrained MDPs, with load shedding and thermal overload constraints
  • 🧪 Extensive baselines including DQN, PPO, SAC, TD3, and Lagrangian PPO
  • 📊 Integration with Weights & Biases (wandb) for experiment tracking
  • 🧠 A framework for algorithmic innovation and safe control in power grids

🔧 Installation

First, ensure you have Miniconda installed.

# Step 1: Clone the repository
git clone https://github.com/emarche/RL2Grid.git
cd RL2Grid

# Step 2: Create the environment
conda env create -f conda_env.yml

# Step 3: Activate the environment
conda activate rl2grid

# Step 4: Install RL2Grid
pip install .

🚀 Quick Start

Before running an experiment, make sure to unzip the action spaces archive env/action_spaces.zip.

To run training on a predefined task (remember to set up the correct entity and project for wandb in the main.py script):

python main.py --env-id bus14 --action-type topology --alg PPO

Available arguments include task difficulty, action type (topology/redispatch), reward weights, constraint types, and more. Check main.py and alg/<algorithm>/config.py for the full configuration space.
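Note that wandb tracking is configured directly in main.py. As a hedged illustration (the entity, project, and config values below are placeholders, not values shipped with the repository), the relevant call typically looks like:

import wandb

# Illustrative wandb setup; replace entity/project with your own account and project.
wandb.init(
    entity="your-entity",    # your wandb username or team (placeholder)
    project="rl2grid",       # project used to group runs (placeholder)
    config={"env_id": "bus14", "action_type": "topology", "alg": "PPO"},
)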


🧪 Benchmark Environments

RL2Grid supports 39 distinct tasks across discrete (topological) and continuous (redispatch/curtailment) settings. The main grid variations include:

| Grid ID | Action Type | Contingencies | Batteries | Constraints | Difficulty Levels |
| --- | --- | --- | --- | --- | --- |
| bus14 | Topology, Redispatch | Maintenance | No | Optional | 0–1 |
| bus36-MO-v0 | Topology, Redispatch | Maintenance + Opponent | No | Optional | 0–4 |
| bus118-MOB-v0 | Topology, Redispatch | Maintenance + Opponent + Battery | Yes | Optional | 0–4 |

Full environment specs and task variants are detailed in the paper.
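RL2Grid's tasks wrap Grid2Op environments. As a rough sketch of the underlying simulation loop (this is plain Grid2Op, not RL2Grid's standardized interface), a do-nothing rollout on a 14-bus grid looks like:

import grid2op

# Plain Grid2Op sketch (not RL2Grid's wrapped API): roll out the do-nothing policy.
env = grid2op.make("l2rpn_case14_sandbox")   # 14-bus environment shipped with Grid2Op
obs = env.reset()
done = False
while not done:
    action = env.action_space({})             # the "do-nothing" action
    obs, reward, done, info = env.step(action)
    # obs.rho holds per-line loading ratios; values >= 1.0 indicate thermal overloads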


🧠 Built-In Heuristics

To bridge human expertise with RL training, RL2Grid embeds two human-informed heuristics:

  • idle: suppresses agent actions during normal grid operations
  • recovery: gradually restores the topology toward the original configuration when the grid operates under normal conditions

Heuristic guidance can be toggled via command-line arguments (see env/config.py).
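As a rough sketch of the idle idea only (the actual heuristics live in the env package and are more involved), an action filter can overwrite the agent's choice with a do-nothing action whenever every line is loaded below a safety threshold:

# Hedged sketch of the "idle" heuristic on a Grid2Op-style environment;
# the threshold value and function structure are illustrative assumptions.
SAFE_RHO = 0.95  # assumed safety margin on line loading

def idle_filter(env, obs, agent_action):
    """Return a do-nothing action while the grid is in a safe state."""
    if (obs.rho < SAFE_RHO).all():    # all lines comfortably below their thermal limits
        return env.action_space({})   # suppress the agent during normal operations
    return agent_action               # otherwise let the agent act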


✅ Safe RL Support

RL2Grid natively supports CMDP-style safety constraints, including:

  • Load Shedding & Islanding (LSI) – penalizes disconnected grid regions or unmet demand
  • Thermal Line Overloads (TLO) – penalizes line overloads and disconnections

These constraints can be incorporated using Lagrangian methods (e.g., LagrPPO).
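Lagrangian methods fold the constraint cost into the objective through a learned multiplier. A generic dual update (a sketch of the standard technique, not the exact code under alg/) looks like:

# Generic Lagrange multiplier update (dual ascent) for a CMDP constraint;
# the names and learning rate are illustrative, not taken from the repository.
def update_lagrange_multiplier(lmbda, episode_cost, cost_limit, lr=0.01):
    """Raise lambda when the LSI/TLO cost exceeds its limit, lower it otherwise."""
    lmbda += lr * (episode_cost - cost_limit)
    return max(0.0, lmbda)  # multipliers stay non-negative

# The policy is then optimized on the penalized return r - lambda * c, so constraint
# violations are traded off against task reward during training.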


📈 Baseline Results

RL2Grid includes implementations and benchmark results for:

  • Discrete (topological): DQN, PPO, SAC (+ heuristic variants)
  • Continuous (redispatch): PPO, SAC, TD3
  • Constrained: Lagrangian PPO (LSI, TLO tasks)

Performance is measured via normalized grid survival rate, overload penalties, topology modifications, and cost metrics.
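For instance, the normalized survival rate is simply the fraction of the scenario horizon over which the agent keeps the grid operational (a sketch of the metric as described, not the repository's evaluation code):

# Normalized survival rate: fraction of the scenario the agent keeps the grid alive.
def survival_rate(steps_survived, episode_length):
    return steps_survived / episode_length

# e.g. surviving 6048 of 8064 steps (three of four weeks at 5-minute resolution) gives 0.75
print(survival_rate(6048, 8064))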


📚 Documentation


🌍 Environmental Impact

We are committed to responsible research. Emissions from our experiments were estimated via MLCO2 and offset with carbon credits purchased via Treedom.


📬 Citation

This project was developed in collaboration with RTE France, 50Hertz, National Grid ESO, MIT, Georgia Tech, and the University of Edinburgh.

If you use RL2Grid, please cite:

@misc{rl2grid,
      title={RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations}, 
      author={Enrico Marchesini and Benjamin Donnot and Constance Crozier and Ian Dytham and Christian Merz and Lars Schewe and Nico Westerbeck and Cathy Wu and Antoine Marot and Priya L. Donti},
      year={2025},
      eprint={2503.23101},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.23101}, 
}

License

RL2Grid is licensed under the MIT License. For more details, please refer to the LICENSE file in this repository.
