Starred repositories
Out-of-the-box DeepSeek OCR document parsing Web Studio
A learning project for building local knowledge bases from PDFs using LangChain, supporting multiple LLMs (DeepSeek, OpenAI). Features include PDF processing, knowledge graph construction, and natu…
SGLang is a fast serving framework for large language models and vision language models.
Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Large Language Models.
Tyk Open Source API Gateway written in Go, supporting REST, GraphQL, TCP and gRPC protocols
GPU cluster manager for optimized AI model deployment
Supercharge Your LLM with the Fastest KV Cache Layer
Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and yo…
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
The ultimate LLM/AI application development framework in Golang.
Fully open reproduction of DeepSeek-R1
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
neuralmagic / nm-vllm
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs
🎆Interactive Online Platform that Visualizes Algorithms from Code
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers
A large-scale simulation framework for LLM inference
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Fast, Accurate, Lightweight Python library to make State of the Art Embedding
Convert any text to a graph of knowledge. This can be used for Graph Augmented Generation or Knowledge Graph based QnA
A low-latency & high-throughput serving engine for LLMs
MS-Agent: Lightweight Framework for Empowering Agents with Autonomous Exploration in Complex Task Scenarios