CEGS is a network configuration synthesis system that uses Graph Neural Networks (GNNs) and Large Language Models (LLMs) to automate configuration generation. It interprets high-level user intents, identifies and generalizes from configuration examples, and generates correct, verifiable network configurations for arbitrary topologies.
- Python 3.8+
- Docker (for Batfish service)
- OpenAI API key (or other supported LLM API)
- Clone the repository:
  git clone https://github.com/your-username/cegs.git
  cd cegs/CEGS
- Install Python dependencies:
  pip install -r requirements.txt
- Start Batfish service:
  docker run -d --name batfish -p 9997:9997 -p 9996:9996 batfish/batfish:latest
- Configure environment variables: set your LLM API key in setting.json.
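Before moving on, it can help to confirm that the Batfish container is accepting connections on the two ports published by the docker command above. The following is a minimal sketch, not part of CEGS; the function names are hypothetical, and it only checks that a TCP connection succeeds, not that Batfish itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def batfish_ports_status(host: str = "localhost", ports=(9996, 9997)) -> dict:
    """Map each port published by the docker command to its reachability."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    for port, ok in batfish_ports_status().items():
        print(f"port {port}: {'reachable' if ok else 'NOT reachable'}")
```

If either port is reported as not reachable, `docker ps` and `docker logs batfish` are the usual first places to look.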
CEGS reads all of its parameters from a single settings file, setting.json. Before running, edit this file to match your environment, in particular the LLM provider and its API key.
- LLM: Choose from OPENAI, GEMINI, or DEEPSEEK providers and set corresponding API keys and models
- AI Models: Set paths for GNN model and sentence transformer
- Data Paths: Set paths for example library, intent types, and input files
- Output Paths: Set directories for generated device configurations
- Prompt Templates: Set paths for various LLM prompt template files used in different generation phases
- Generation Parameters: Set batch size and other synthesis parameters
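As a rough illustration of how the LLM entries fit together, the sketch below reads setting.json and picks the API key and model that belong to the selected provider, using the key names from the sample file that follows. This is an assumption about usage, not the logic in config_manager.py; the function name and the placeholder check (rejecting keys that still contain "your-") are hypothetical.

```python
import json
from pathlib import Path

# Providers named in this README.
PROVIDERS = ("OPENAI", "GEMINI", "DEEPSEEK")


def load_llm_settings(path="setting.json"):
    """Return (provider, api_key, model) for the provider chosen in the file."""
    settings = json.loads(Path(path).read_text(encoding="utf-8"))

    provider = str(settings.get("LLM_PROVIDER", "")).upper()
    if provider not in PROVIDERS:
        raise ValueError(f"LLM_PROVIDER must be one of {PROVIDERS}, got {provider!r}")

    # Keys follow the <PROVIDER>_API_KEY / <PROVIDER>_MODEL pattern of the sample.
    api_key = settings.get(f"{provider}_API_KEY", "")
    model = settings.get(f"{provider}_MODEL", "")

    # Hypothetical check: the sample placeholders all contain "your-".
    if not api_key or "your-" in api_key:
        raise ValueError(f"{provider}_API_KEY is missing or still a placeholder")
    if not model:
        raise ValueError(f"{provider}_MODEL is not set")

    return provider, api_key, model
```

Running such a check before main_syn.py surfaces a missing key immediately rather than partway through synthesis.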
{
"LLM_PROVIDER": "OPENAI",
"GEMINI_API_KEY": "sk-your-gemini-api-key-here",
"GEMINI_MODEL": "gemini-2.5-flash-preview-05-20",
"OPENAI_API_KEY": "sk-your-openai-api-key-here",
"OPENAI_MODEL": "gpt-4o",
"DEEPSEEK_API_KEY": "sk-your-deepseek-api-key-here",
"DEEPSEEK_MODEL": "deepseek-chat",
"GCN_MODEL_PATH": "best_model.pth",
"SENTENCE_TRANSFORMER_MODEL": "all-MiniLM-L6-v2",
"EXAMPLE_LIBRARY_PATH": "ExampleLibrary.json",
"INTENT_TYPES_PATH": "IntentTypes/Types.json",
"INPUT_INTENT_FILE": "input/intent.txt",
"INPUT_TOPOLOGY_FILE": "input/topology.json",
"RESPONSES_DIR": "responses",
"OUTPUT_DIR": "output",
"CONFIGS_DIR": "output/configs/",
"INTENT_PROCESS_FILE": "prompts/intentprocess.txt",
"INTENT_SPECIFICATION_FILE": "prompts/intentSpecification.txt",
"REFLECTION_PREFIX_FILE": "prompts/reflection/prefix.txt",
"INTERFACE_SUFFIX_FILE": "prompts/reflection/interface_suffix.txt",
"OSPF_SUFFIX_FILE": "prompts/reflection/ospf_suffix.txt",
"BGP_SUFFIX_FILE": "prompts/reflection/bgp_suffix.txt",
"BGP_ROUTING_POLICY_FILE": "prompts/reflection/bgp_routing_policy.txt",
"ROUTEMAP_FORMAT_FILE": "prompts/reflection/routeMap_format.txt",
"INTENT_TYPES_ROLE_PROMPT_DIR": "IntentTypes/rolePrompt",
"BATCH_SIZE": 40
}
CEGS relies on pre-trained word embedding models for its semantic understanding capabilities. One of these models is too large to be included directly in this Git repository.
Required Action: Before running the system, you must download the following file and place it in a directory of the same name under the project root:
- File: wiki-news-300d-1M-subword.vec
- Download URL: Official download link here
- Target Location: CEGS/wiki-news-300d-1M-subword.vec/wiki-news-300d-1M-subword.vec
After downloading, your directory structure should look like this:
CEGS/
├── wiki-news-300d-1M-subword.vec/
│ └── wiki-news-300d-1M-subword.vec
├── main_syn.py
├── querier.py
└── ... other files
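A quick way to confirm the layout shown above is to check for the file at the expected nested path. The helper below is a hypothetical sketch, not part of CEGS; it assumes it is run from the CEGS/ directory (or given its path) and only tests that the file exists, not that its contents are valid.

```python
from pathlib import Path

# File name taken from the README; the file lives in a directory of the same name.
EMBEDDING_NAME = "wiki-news-300d-1M-subword.vec"


def embedding_path(project_root=".") -> Path:
    """Expected location: <root>/<EMBEDDING_NAME>/<EMBEDDING_NAME>."""
    return Path(project_root) / EMBEDDING_NAME / EMBEDDING_NAME


def embedding_is_in_place(project_root=".") -> bool:
    """True only if the path exists and is a regular file (not the directory)."""
    return embedding_path(project_root).is_file()


if __name__ == "__main__":
    status = "found" if embedding_is_in_place() else "missing"
    print(f"{embedding_path()}: {status}")
```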
To run synthesis on a target scenario, execute the main synthesis script:
python main_syn.py
The system processes the intents and topology defined in its input files and writes the final configurations to the output/ directory.
The file ExampleLibrary.json is a configuration example library containing numerous examples that encompass various routing intents for OSPF and BGP protocols.
The dataset folder contains example target scenarios. Each scenario includes intents and a target topology.
- Use a two-stage recommendation strategy to identify relevant configuration examples
- Combine semantic similarity and topological similarity
- Implemented with SBERT, FastText and GraphSAGE
- Establish device associations between the target and example topologies
- First associate devices based on node role descriptions
- Then associate devices based on neighborhood similarity using a GNN model
- Syntax Verifier: Check configuration syntax correctness
- Local Attribute Verifier (LAV): Verify individual device configurations
- Global Formal Verifier (GFV): Use an SMT solver to verify network-wide policies
- Based on the NetComplete implementation
- Use SMT constraint solving to fill template parameters
- Guarantee correctness of final configurations
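The two-stage recommendation strategy listed above, which combines semantic and topological similarity, can be pictured with the toy sketch below. It is one plausible reading rather than the CEGS implementation: the first stage shortlists examples by intent similarity, the second re-ranks the shortlist by topology similarity. In CEGS the vectors would come from SBERT/FastText and GraphSAGE; here they are hand-written, and the field names `intent_vec` and `topo_vec` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, examples, shortlist_size=2):
    """Stage 1: keep the examples closest in intent. Stage 2: re-rank by topology."""
    shortlist = sorted(
        examples,
        key=lambda e: cosine(target["intent_vec"], e["intent_vec"]),
        reverse=True,
    )[:shortlist_size]
    return sorted(
        shortlist,
        key=lambda e: cosine(target["topo_vec"], e["topo_vec"]),
        reverse=True,
    )


# Toy data: embeddings are made up purely for illustration.
target = {"intent_vec": [1.0, 0.0], "topo_vec": [0.0, 1.0]}
examples = [
    {"name": "A", "intent_vec": [0.9, 0.1], "topo_vec": [1.0, 0.0]},
    {"name": "B", "intent_vec": [0.8, 0.2], "topo_vec": [0.0, 1.0]},
    {"name": "C", "intent_vec": [0.0, 1.0], "topo_vec": [0.0, 1.0]},
]

ranked = [e["name"] for e in recommend(target, examples)]
print(ranked)  # ['B', 'A']
```

Example C matches the target topology exactly but is dropped in the first stage because its intent is unrelated, which is the point of filtering on semantics before comparing topologies.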
CEGS/
├── main_syn.py # Main entry point
├── querier.py # Querier implementation
├── classifier.py # Classifier implementation
├── generator.py # Configuration generator
├── utils.py # Utility functions
├── Semantic_verifier.py # Semantic verifier
├── Syntax_verifier.py # Syntax verifier
├── config_manager.py # Configuration manager
├── setting.json # Configuration file
├── prompts/ # LLM prompt templates
├── dataset/ # Datasets for training
├── input/ # Input intent and topology
├── output/ # Output directory
└── requirements.txt # Dependencies
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
If you use CEGS in your research, please cite our paper:
@inproceedings {306009,
author = {Jianmin Liu and Li Chen and Dan Li and Yukai Miao},
title = {{CEGS}: Configuration Example Generalizing Synthesizer},
booktitle = {22nd USENIX Symposium on Networked Systems Design and Implementation (NSDI 25)},
year = {2025},
isbn = {978-1-939133-46-5},
address = {Philadelphia, PA},
pages = {1327--1347},
url = {https://www.usenix.org/conference/nsdi25/presentation/liu-jianmin},
publisher = {USENIX Association},
month = apr
}
For questions or suggestions, please contact us through GitHub Issues.
This README provides an overview of the CEGS system. For more detailed information on the methodology and evaluation, please refer to the full paper.