techandy42

☀️

Being cracked.

Andy Lee techandy42

21 | Prev @ Glean, Carta | 4A CS @ U of Waterloo

15 followers · 15 following

San Francisco, CA, United States
in/andy-lee-b68302232

Starred repositories

techandy42 / Codex_Playground

A playground repo to play around with OpenAI Codex.

Python 1 Updated Jun 29, 2025

techandy42 / Tinker

The world's first System Design Engineer AI Agent.

TypeScript 1 Updated Jul 12, 2025

techandy42 / BioServices_DS

Dataset of BioService code containing signatures and docstrings.

Jupyter Notebook 1 Updated Jun 4, 2025

techandy42 / bioservices

Forked from cokelaer/bioservices

Access to Biological Web Services from Python.

Python 1 Updated May 19, 2025

techandy42 / MLGit_Test_Repo_N2

A Test Repo for MLGit; Python; Relative Imports.

Python 1 Updated May 18, 2025

techandy42 / MLGit_Test_Repo_N1

A Test Repo for MLGit; Python; Absolute Imports.

Python 1 Updated May 18, 2025

techandy42 / BICS_Plus

Benchmark that tests LLMs to find semantic bugs in large Python code.

Python 2 Updated Jun 11, 2025

techandy42 / MLGit

MLGit: Index Codebase into Natural Language Descriptions; Works Just Like Git.

Python 1 Updated May 18, 2025

batmen-lab / BioMANIA

Python 68 4 Updated Mar 24, 2025

techandy42 / watai_hammingai_project

WAT.ai x Hamming.ai Joint Project for Building Code Debugging Benchmarks and Models.

Python 5 Updated May 3, 2025

techandy42 / Codegen_Challenge_Submission

A Python import visualization program.

Jupyter Notebook 1 Updated Sep 14, 2024

SYS-NG / Goose_Guru_HTN2024

TypeScript 1 Updated Sep 16, 2024

techandy42 / debugger_llm

Open-source datasets & models for LLM Judges to find and describe bugs in LLM-generated code.

Jupyter Notebook 2 Updated Nov 9, 2024

techandy42 / bug_in_the_code_stack

A new benchmark for measuring LLM's capability to detect bugs in large codebase.

Jupyter Notebook 3 5 Updated May 3, 2025

nonsequitoria / simplekit

SimpleKit

TypeScript 5 34 Updated Oct 2, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,285 1,764 Updated Oct 13, 2025

Devinterview-io / llms-interview-questions

🟣 LLMs interview questions and answers to help you prepare for your next machine learning and data science interview in 2025.

615 75 Updated May 19, 2025

HammingHQ / bug-in-the-code-stack

Forked from techandy42/bug_in_the_code_stack

A new benchmark for measuring LLM's capability to detect bugs in large codebase.

Jupyter Notebook 32 4 Updated Jun 5, 2024

techandy42 / bug_in_the_code_stack_v2

Can LLMs find bugs that compilers can't?: A benchmark for measuring LLMs' capabilities in debugging large source code.

Jupyter Notebook 1 Updated May 29, 2024

techandy42 / awesome-llm-metrics

An open-source framework that makes evaluating LLMs & prompt engineering x10 easier!

Python 3 Updated Mar 20, 2024

techandy42 / eccc-webcam-project

Open-source ECCC repository for notebooks and documentations for the Webcam project by Hokyung (Andy) Lee.

Jupyter Notebook 1 Updated Apr 26, 2024

techandy42 / eccc-hail-forecasting-project

Open-source ECCC repository for notebooks and documentations for the Hail Forecasting project by Hokyung (Andy) Lee.

Jupyter Notebook 1 Updated Apr 26, 2024

techandy42 / LVEval

Forked from infinigence/LVEval

Repository of LV-Eval Benchmark

Jupyter Notebook 1 Updated Apr 15, 2024

infinigence / LVEval

Repository of LV-Eval Benchmark

Python 70 9 Updated Aug 31, 2024

techandy42 / babilong

Forked from booydar/babilong

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.

Jupyter Notebook 1 Updated Apr 15, 2024

booydar / babilong

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.

Jupyter Notebook 215 21 Updated Sep 2, 2025

Arize-ai / LLMTest_NeedleInAHaystack

Forked from gkamradt/LLMTest_NeedleInAHaystack

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Python 106 4 Updated Sep 19, 2025

bing1100 / hamming_m3

m3 dataset with hamming

TypeScript 1 Updated Apr 26, 2024

HammingHQ / hamming-examples

Various examples on how to use Hamming for evals + observability

TypeScript 6 1 Updated Jan 27, 2025

genai-genesis-2024 / web-agent

Python 2 Updated Mar 31, 2024
