This project uses uv as the package manager. After cloning the repository and installing uv, run the following commands:

```bash
cd EventRAG
uv venv
source .venv/bin/activate
uv sync
uv pip install -e .
```

Use the Python snippet below (in a script) to initialize EventRAG and perform queries:

```python
import os
from eventrag import EventRAG, QueryParam
from eventrag.llm import gpt_4o_mini_complete, gpt_4o_complete
#########
# Uncomment the two lines below when running in a Jupyter notebook to handle the async nature of rag.insert()
# import nest_asyncio
# nest_asyncio.apply()
#########
WORKING_DIR = "./dickens"
if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)
rag = EventRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete  # Use gpt_4o_mini_complete LLM model
    # llm_model_func=gpt_4o_complete  # Optionally, use a stronger model
)
with open("./book.txt") as f:
    rag.insert(f.read())
# Perform multi-event reasoning
print(rag.query("What are the top themes in this story?", param=QueryParam(mode="agent")))
```

The code can be found in the `./reproduce` directory.
In all experiments, we use the gpt_4o_complete model for LLM generation and the openai_embedding function for embeddings.
We use Neo4J as the graph storage and Milvus as the vector storage for all experiments. You can set them up using the docker-compose file in the ./dockers directory.
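
As a rough sketch of that experimental setup (assuming storage backends are selected by class name as in the parameter table below, that a Milvus-backed vector storage class named `MilvusVectorDBStorage` exists in this repository, that `openai_embedding` is exported from `eventrag.llm` alongside the completion helpers, and that connection settings are read from environment variables whose exact names should be checked against the storage implementations), the configuration looks roughly like this:

```python
import os

from eventrag import EventRAG
from eventrag.llm import gpt_4o_complete, openai_embedding

# Illustrative connection settings; the exact environment variable names depend
# on the Neo4J/Milvus storage implementations shipped in this repository.
os.environ.setdefault("NEO4J_URI", "bolt://localhost:7687")
os.environ.setdefault("NEO4J_USERNAME", "neo4j")
os.environ.setdefault("NEO4J_PASSWORD", "your-password")
os.environ.setdefault("MILVUS_URI", "http://localhost:19530")

rag = EventRAG(
    working_dir="./experiments",
    llm_model_func=gpt_4o_complete,          # model used in all experiments
    embedding_func=openai_embedding,         # embedding function used in all experiments
    graph_storage="Neo4JStorage",            # supported graph backend (see table below)
    vector_storage="MilvusVectorDBStorage",  # assumed class name for the Milvus backend
)
```
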
| Parameter | Type | Explanation | Default |
|---|---|---|---|
| working_dir | str | Directory where the cache will be stored | eventrag_cache+timestamp |
| kv_storage | str | Storage type for documents and text chunks. Supported types: JsonKVStorage, OracleKVStorage | JsonKVStorage |
| vector_storage | str | Storage type for embedding vectors. Supported types: NanoVectorDBStorage, OracleVectorDBStorage | NanoVectorDBStorage |
| graph_storage | str | Storage type for graph edges and nodes. Supported types: NetworkXStorage, Neo4JStorage, OracleGraphStorage | NetworkXStorage |
| log_level | str | Log level for application runtime | logging.DEBUG |
| chunk_token_size | int | Maximum token size per chunk when splitting documents | 1200 |
| chunk_overlap_token_size | int | Overlap token size between two chunks when splitting documents | 100 |
| tiktoken_model_name | str | Model name for the Tiktoken encoder used to calculate token counts | gpt-4o-mini |
| entity_extract_max_gleaning | int | Number of loops in the entity extraction process, appending history messages | 1 |
| entity_summary_to_max_tokens | int | Maximum token size for each entity summary | 500 |
| node_embedding_algorithm | str | Algorithm for node embedding (currently not used) | node2vec |
| node2vec_params | dict | Parameters for node embedding | {"dimensions": 1536, "num_walks": 10, "walk_length": 40, "window_size": 2, "iterations": 3, "random_seed": 3} |
| embedding_func | EmbeddingFunc | Function to generate embedding vectors from text | openai_embedding |
| embedding_batch_num | int | Maximum batch size for embedding processes (multiple texts sent per batch) | 32 |
| embedding_func_max_async | int | Maximum number of concurrent asynchronous embedding processes | 16 |
| llm_model_func | callable | Function for LLM generation | gpt_4o_mini_complete |
| llm_model_name | str | LLM model name for generation | meta-llama/Llama-3.2-1B-Instruct |
| llm_model_max_token_size | int | Maximum token size for LLM generation (affects entity relation summaries) | 32768 |
| llm_model_max_async | int | Maximum number of concurrent asynchronous LLM processes | 16 |
| llm_model_kwargs | dict | Additional parameters for LLM generation | |
| vector_db_storage_cls_kwargs | dict | Additional parameters for the vector database (currently not used) | |
| enable_llm_cache | bool | If TRUE, stores LLM results in cache; repeated prompts return cached responses | TRUE |
| addon_params | dict | Additional parameters, e.g., {"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}: sets the example limit and output language | example_number: all examples, language: English |
| convert_response_to_json_func | callable | Not used | convert_response_to_json |
| embedding_cache_config | dict | Configuration for question-answer caching. Contains three parameters: <br>- enabled: Boolean to enable/disable cache lookup. When enabled, the system checks cached responses before generating new answers. <br>- similarity_threshold: Float (0-1) similarity threshold. When a new question's similarity with a cached question exceeds this threshold, the cached answer is returned directly without calling the LLM. <br>- use_llm_check: Boolean to enable/disable LLM similarity verification. When enabled, the LLM is used as a secondary check to verify the similarity between questions before returning cached answers. | {"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False} |
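
As a usage note, and assuming all of the table entries above are accepted directly as keyword arguments by the EventRAG constructor (as the two used in the quick-start snippet are), a minimal sketch of overriding a few of them looks like this; the values are illustrative, not recommendations:

```python
from eventrag import EventRAG
from eventrag.llm import gpt_4o_mini_complete

rag = EventRAG(
    working_dir="./dickens",
    llm_model_func=gpt_4o_mini_complete,
    chunk_token_size=1200,             # maximum tokens per chunk when splitting documents
    chunk_overlap_token_size=100,      # token overlap between adjacent chunks
    enable_llm_cache=True,             # reuse cached LLM responses for repeated prompts
    addon_params={"example_number": 1, "language": "English"},
    embedding_cache_config={
        "enabled": True,               # check cached answers before generating new ones
        "similarity_threshold": 0.95,  # return a cached answer above this similarity
        "use_llm_check": False,        # optional LLM-based secondary similarity check
    },
)
```
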
MIT License