- Shanghai Jiao Tong University
- Seattle, WA ⇌ Shanghai, China
- UTC -08:00
- https://conless.dev/
- @conlesspan
Stars
Improved build system generator for CPython C, C++, Cython and Fortran extensions
Open-source implementation of AlphaEvolve
Perplexity open source garden for inference technology
A language-model–powered compressor for natural language text
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
Distributed Compiler based on Triton for Parallel Systems
RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
KV cache store for distributed LLM inference
Repo for OSDI 2023 paper: "Ship your Critical Section Not Your Data: Enabling Transparent Delegation with TCLocks"
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
`std::execution`, the proposed C++ framework for asynchronous and parallel programming.
WaferLLM: Large Language Model Inference at Wafer Scale
[NeurIPS 2025] Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A bibliography of papers related to symbolic execution
Curated collection of papers in MoE model inference