Stars
WildEval / ZeroEval
Forked from allenai/WildBench
A simple unified framework for evaluating LLMs
An AI agent that uses symbolic reasoning and other auxiliary tools to improve its performance on various logic and reasoning benchmarks. This project aims to develop a robust and flexible AI system th…