- TikTok
- Singapore
Stars
💫 Toolkit to help you get started with Spec-Driven Development
Recipes for reproducing training and serving benchmarks for large machine learning models using GPUs on Google Cloud.
Debug the intermediate outputs of two models.
An Adaptive Pencil Decomposition Library for NVIDIA GPUs
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
A configuration framework that enhances Claude Code with specialized commands, cognitive personas, and development methodologies.
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
A generative speech model for daily dialogue.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…
A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation.
Distributed Compiler based on Triton for Parallel Systems
Invert scroll direction for physical scroll wheels while maintaining "Natural" scrolling for trackpads on macOS
A collection of reproducible inference engine benchmarks
ByteCheckpoint: A Unified Checkpointing Library for LFMs
A Datacenter Scale Distributed Inference Serving Framework
The complete stack for AI Engineers: framework, runtime and control plane.
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
📄 Configuration files that enhance Cursor AI editor experience with custom rules and behaviors
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
Tile primitives for speedy kernels