Improved build system generator for CPython C, C++, Cython and Fortran extensions

Python 527 124 Updated Nov 24, 2025

Open-source implementation of AlphaEvolve

Python 4,651 701 Updated Nov 27, 2025

Perplexity open source garden for inference technology

Rust 274 20 Updated Nov 20, 2025
Jinja 12 1 Updated Oct 24, 2025

A language-model–powered compressor for natural language text

Python 46 2 Updated Oct 23, 2025

Pie: Programmable LLM Serving

Rust 72 11 Updated Nov 24, 2025

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 405 10 Updated Nov 27, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,251 107 Updated Nov 18, 2025

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 4,773 472 Updated Nov 26, 2025

KV cache store for distributed LLM inference

C++ 368 31 Updated Nov 13, 2025

Repo for OSDI 2023 paper: "Ship your Critical Section Not Your Data: Enabling Transparent Delegation with TCLocks"

C 21 3 Updated Nov 6, 2024

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 498 112 Updated Nov 27, 2025

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)

C++ 1,099 102 Updated Nov 27, 2025

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

Python 679 66 Updated Nov 27, 2025

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++ 2,109 214 Updated Nov 28, 2025

WaferLLM: Large Language Model Inference at Wafer Scale

Python 75 11 Updated Oct 31, 2025

[NeurIPS 2025] Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

Python 560 29 Updated Nov 11, 2025

Tutorials for NVIDIA CUPTI samples

C++ 40 8 Updated Nov 3, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 975 67 Updated Nov 25, 2025

Gemfield's articles

Python 7 1 Updated Mar 16, 2025

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 1,281 177 Updated Aug 19, 2025

Record and Replay Framework

C++ 10,259 641 Updated Nov 14, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 4,030 327 Updated Nov 27, 2025

A bibliography of papers related to symbolic execution

290 58 Updated Aug 12, 2016

Curated collection of papers in MoE model inference

308 11 Updated Oct 20, 2025