ByteDance - Shanghai, China
https://fangjiarui.github.io/ - https://www.zhihu.com/people/feifeibear - in/fangjiarui
Stars
PyTorch Distributed-native training library for LLMs/VLMs with out-of-the-box Hugging Face support
Post-training with Tinker
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
Render any git repo into a single static HTML page for humans or LLMs
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Best practices for Megatron on veRL, with a tuning guide
slime is an LLM post-training framework for RL Scaling.
🤗 A PyTorch-native Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.
NoakLiu / FastCache-xDiT - Forked from xdit-project/xDiT
FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model]
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
verl: Volcano Engine Reinforcement Learning for LLMs
MAGI-1: Autoregressive Video Generation at Scale
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
This package contains the original 2012 AlexNet code.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Analyze computation-communication overlap in V3/R1.
A lightweight data processing framework built on DuckDB and 3FS.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Wan: Open and Advanced Large-Scale Video Generative Models
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient Multi-head Latent Attention Kernels