- Princeton University
- Princeton, NJ
- https://yinwei-dai.com
- @dai_yinwei
Stars
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, code, and related websites
Safe Interactions with Foreign Languages through Omniglot
Optimizing inference proxy for LLMs
Theoretical performance analysis tools for LLMs, supporting parameter, FLOPs, memory, and latency analysis.
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
[ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention]
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
Paper list for Efficient Reasoning.
[arXiv 2025] Official code and datasets for the paper "GNNs as Predictors of Agentic Workflow Performances"
A Datacenter Scale Distributed Inference Serving Framework
❓Curie: Automated and Rigorous Scientific Experimentation with AI Agents
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A lightweight, powerful framework for multi-agent workflows
A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble"
A general and accurate MACs / FLOPs profiler for PyTorch models
Expressive, Easy to Build, and High-Performance Application Networks
DeepEP: an efficient expert-parallel communication library
calflops is designed to calculate FLOPs, MACs, and parameters for a wide variety of neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models)
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
SGLang is a fast serving framework for large language models and vision language models.
Curated collection of papers in machine learning systems
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
Large Language Model (LLM) Systems Paper List