kyleliang919

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,829 2,544 Updated Nov 15, 2025

EvaByte: Efficient Byte-level Language Models at Scale

Python 110 8 Updated Apr 22, 2025

The best ChatGPT that $100 can buy.

Python 36,859 4,449 Updated Nov 15, 2025

If your linear layer is secretly fast weight memory, why not model fast weight memory updates as optimizers?

Python 2 Updated Oct 15, 2025

Understand and test language model architectures on synthetic tasks.

Python 238 38 Updated Sep 25, 2025

Random thoughts

HTML 1 Updated Oct 8, 2025

Ongoing research training transformer models at scale

Python 14,212 3,277 Updated Nov 17, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 49,640 8,318 Updated Nov 12, 2025

Scaling Diffusion Transformers with Mixture of Experts

Python 401 19 Updated Sep 9, 2024

Scalable toolkit for efficient model reinforcement

Python 1,030 167 Updated Nov 17, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 63,209 11,324 Updated Nov 17, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,208 1,925 Updated Nov 1, 2025

maximal update parametrization (µP)

Jupyter Notebook 1,626 104 Updated Jul 17, 2024

Local RTMP Streaming Server

JavaScript 570 93 Updated Mar 10, 2025

🔥 A minimal training framework for scaling FLA models

Python 292 48 Updated Nov 15, 2025

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 920 48 Updated Mar 19, 2025

Home page for Kaizhao

HTML 1 Updated Oct 13, 2025

NanoGPT (124M) in 3 minutes

Python 1 Updated Oct 1, 2025

Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).

Python 8,963 840 Updated Nov 17, 2025

Minimal implementation of scalable rectified flow transformers, based on SD3's approach

Jupyter Notebook 617 59 Updated Jul 1, 2024

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch

Python 1,891 168 Updated Oct 27, 2025

NanoGPT (124M) in 3 minutes

Python 3,824 501 Updated Nov 16, 2025

Reference implementations of MLPerf® training benchmarks

Python 1,724 585 Updated Nov 5, 2025

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 1,040 56 Updated Aug 7, 2025

PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024)

Jupyter Notebook 526 31 Updated Sep 8, 2025

Modeling, training, eval, and inference code for OLMo

Python 6,124 671 Updated Oct 24, 2025

Official repository for LTX-Video

Python 8,777 810 Updated Oct 25, 2025

OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

Python 547 46 Updated Jan 13, 2025