- San Francisco, CA, United States
- in/andy-lee-b68302232
-
Tinker Public
The world's first System Design Engineer AI Agent.
-
Codex_Playground Public
A playground repo to play around with OpenAI Codex.
-
BICS_Plus Public
Benchmark that tests LLMs' ability to find semantic bugs in large Python codebases.
-
BioServices_DS Public
Dataset of BioService code containing signatures and docstrings.
-
bioservices Public
Forked from cokelaer/bioservices
Access to Biological Web Services from Python.
-
MLGit Public
MLGit: Index Codebase into Natural Language Descriptions; Works Just Like Git.
-
MLGit_Test_Repo_N2 Public
A Test Repo for MLGit; Python; Relative Imports.
-
MLGit_Test_Repo_N1 Public
A Test Repo for MLGit; Python; Absolute Imports.
-
bug_in_the_code_stack Public
A new benchmark for measuring LLMs' ability to detect bugs in large codebases.
-
watai_hammingai_project Public
WAT.ai x Hamming.ai Joint Project for Building Code Debugging Benchmarks and Models.
-
debugger_llm Public
Open-source datasets & models for LLM Judges to find and describe bugs in LLM-generated code.
-
Codegen_Challenge_Submission Public
A Python import visualization program.
-
bug_in_the_code_stack_v2 Public
Can LLMs find bugs that compilers can't? A benchmark for measuring LLMs' capabilities in debugging large source code.
-
Open-source ECCC repository for notebooks and documentation for the Hail Forecasting project by Hokyung (Andy) Lee.
-
eccc-webcam-project Public
Open-source ECCC repository for notebooks and documentation for the Webcam project by Hokyung (Andy) Lee.
-
LVEval Public
Forked from infinigence/LVEval
Repository of LV-Eval Benchmark
-
babilong Public
Forked from booydar/babilong
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
-
awesome-llm-metrics Public
An open-source framework that makes evaluating LLMs & prompt engineering 10x easier!
-
RagTagTeam Public
Startup co-founder matching platform built using Cohere for the WAT.AI RAG Challenge hackathon.
-
GreenTechGuardians Public
A Circular Economy business idea evaluator tool built using Gen-AI.
-
racecar_gym Public
Forked from axelbr/racecar_gym
A gym environment for a miniature racecar using the pybullet physics engine.
-
CrafterGPT Public
Leveraging a Language Model to Play Procedurally-Generated Survival Games.
-
Space Invaders agent trained using DQN/A2C models on the OpenAI Gym Atari environment.
-
LLM_Reward_Model Public
Developing an LLM response-ranking reward model using RLHF, except the feedback comes from GPT-3.5 instead of a human.
-
crafter Public
Forked from danijar/crafter
Benchmarking the Spectrum of Agent Capabilities
-
ExchangeAgent Public
Training a stock exchange agent with Reinforcement Learning algorithms and Decision Transformer.
-
FinancialBERT Public
Stock price prediction model built using BERT and a regression model trained on textual financial news data.
-
rank_llm Public
Forked from castorini/rank_llm
Repository for prompt-decoding using LLMs (GPT3.5, GPT4, and Vicuna)
-
torchgym Public
A PyTorch library that provides major RL algorithms and functionalities for training OpenAI Gym agents.