Stars
[NeurIPS 2025] Multipole Attention for Efficient Long Context Reasoning
[Survey] Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization
[NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)
"AI-Trader: Can AI Beat the Market?" Live Trading Bench: https://ai4trade.ai
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
Open Source Continuous Inference Benchmarking - GB200 NVL72 vs MI355X vs B200 vs H200 vs MI325X & soon™ TPUv6e/v7/Trainium2/3/GB300 NVL72 - DeepSeek 670B MoE, GPTOSS
Disaggregated serving system for Large Language Models (LLMs).
Analyze the inference of Large Language Models (LLMs): computation, storage, transmission, and the hardware roofline model, in a user-friendly interface.
An open protocol enabling communication and interoperability between opaque agentic applications.
The evaluation framework for training-free sparse attention in LLMs
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
A Datacenter Scale Distributed Inference Serving Framework
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
🐳 | Dockerfiles for the RunPod container images used for our official templates.
CHAI is a library for dynamic pruning of attention heads for efficient LLM inference.
Development repository for the Triton language and compiler
VQ-VAEs, Gumbel-Softmaxes and friends
[SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference
[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding
[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2
QLoRA: Efficient Finetuning of Quantized LLMs
VPTQ, a flexible and extreme low-bit quantization algorithm
Awesome LLM compression research papers and tools.