Starred repositories
PKU-DAIR / Hetu-Galvatron
Forked from AFDWang/Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs).
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
DRAMsim3: a Cycle-accurate, Thermal-Capable DRAM Simulator
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
Artifact for paper "PIM is All You Need: A CXL-Enabled GPU-Free System for LLM Inference", ASPLOS 2025
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
FlashInfer: Kernel Library for LLM Serving
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
Open source process design kit for usage with SkyWater Technology Foundry's 130nm node.
Open source process design kit for 28nm open process
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
An Open Workflow to Build Custom SoCs and run Deep Models at the Edge
PIMSim is a Process-In-Memory Simulator with the compatibility of GEM5 full-system simulation.
An analytical cost model evaluating DNN mappings (dataflows and tiling).