chenkaiidy

kevinchen.ck chenkaiidy

Interested in NVIDIA GPU performance optimization

5 followers · 7 following

aliyun
hangzhou

Stars

eunomia-bpf / bpftime

Userspace eBPF runtime for Observability, Network, GPU & General Extensions Framework

C++ 1,250 134 Updated Nov 24, 2025

open-neutrino / neutrino

C 209 18 Updated Aug 4, 2025

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 4,012 325 Updated Nov 25, 2025

meta-recsys / generative-recommenders

Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Python 1,565 306 Updated Nov 22, 2025

microsoft / agent-lightning

The absolute trainer to light up AI agents.

Python 8,863 708 Updated Nov 25, 2025

thinkwee / AgentsMeetRL

Awesome List for Agentic RL

HTML 553 18 Updated Nov 9, 2025

pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 95,357 26,009 Updated Nov 25, 2025

NVIDIA-NeMo / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,179 3,211 Updated Nov 24, 2025

microsoft / RD-Agent

Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…

Python 9,572 1,029 Updated Nov 25, 2025

imbue-ai / cluster-health

Python 316 41 Updated Aug 20, 2024

SwanHubX / SwanLab

⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / veRL / Swift / Ultra…

Python 3,145 169 Updated Nov 18, 2025

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 2,352 157 Updated Nov 25, 2025

shen-shanshan / cs-self-learning

This repo archives my notes, code, and materials from CS learning.

Jupyter Notebook 63 2 Updated Nov 25, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,311 438 Updated Nov 25, 2025

google-deepmind / android_env

RL research on Android devices.

Python 1,156 100 Updated Nov 25, 2025

google-research / android_world

AndroidWorld is an environment and benchmark for autonomous agents

Python 519 110 Updated Nov 24, 2025

agentica-project / verl-pipeline

Async pipelined version of Verl

Python 125 13 Updated Apr 8, 2025

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,776 452 Updated Nov 23, 2025

ByteDance-Seed / Triton-distributed

Distributed Compiler based on Triton for Parallel Systems

Python 1,247 107 Updated Nov 18, 2025

kubewharf / podseidon

A multi-cluster pod deletion protection webhook with high scalability and disaster tolerance

Go 39 4 Updated Nov 20, 2025

kubewharf / kubeadmiral

Multi-Cluster Kubernetes Orchestration

Go 913 97 Updated Aug 11, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes and code for ML SYS.

Python 4,272 259 Updated Nov 22, 2025

budtmo / docker-android

Android in docker solution with noVNC supported and video recording

Python 13,571 1,577 Updated Nov 21, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 63,930 11,524 Updated Nov 25, 2025

gpu-mode / lectures

Material for gpu-mode lectures

Jupyter Notebook 5,341 537 Updated Nov 21, 2025

mll-lab-nu / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Jupyter Notebook 2,411 186 Updated Nov 25, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 8,757 1,003 Updated Nov 25, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 16,587 2,645 Updated Nov 25, 2025

learning-at-home / hivemind

Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.

Python 2,287 205 Updated Oct 12, 2025

PrimeIntellect-ai / OpenDiloco

OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

Python 548 46 Updated Jan 13, 2025