Stars
[ICLR 2025🔥] D2O: Dynamic Discriminative Operations for Efficient Long-Context Inference of Large Language Models
Unified KV Cache Compression Methods for Auto-Regressive Models
Xiao-Ming Wu's homepage: https://dravenalg.github.io/.
This is the repository for [Homogeneous Keys, Heterogeneous Values: Exploiting Local KV Cache Asymmetry for Long-Context LLMs](https://arxiv.org/html/2506.05410v1), presented at NeurIPS 2025.
The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".
Collection of awesome test-time (domain/batch/instance) adaptation methods
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
A trend that started with "Chain of Thought Prompting Elicits Reasoning in Large Language Models".
Latest Advances on Long Chain-of-Thought Reasoning
Tongyi Deep Research, the Leading Open-source Deep Research Agent
[ICLR 2025] Palu: Compressing KV-Cache with Low-Rank Projection
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)
Layer-Condensed KV cache with a 10x larger batch size, fewer parameters, and less computation. Dramatic speedup with better task performance. Accepted to ACL 2024.
xKV: Cross-Layer SVD for KV-Cache Compression
A framework for few-shot evaluation of language models.
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Github Pages template based upon HTML and Markdown for personal, portfolio-based websites.
General starter code for creative model architecture with huggingface transformer library.
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding code links.
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
An easy-to-adapt PyTorch Lightning template. A thin wrapper that is simple to use: with minor changes to your original PyTorch code, you can port it to Lightning, and keep your freedom to edit a…
TensorFlow code and pre-trained models for BERT