- Ph.D. Candidate @ CUHK-MMLab, B.E. @ UCAS
- Hong Kong
- https://jf-d.github.io/
Stars
MiroThinker is an open-source search agent model, built for tool-augmented reasoning and real-world information seeking, aiming to match the deep research experience of OpenAI Deep Research and Gem…
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
My learning notes for ML SYS.
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
ScreenCoder: Turn any UI screenshot into clean, editable HTML/CSS with full control. Fast, accurate, and easy to customize.
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
[arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
DeepSeek-V3/R1 inference performance simulator
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient Multi-head Latent Attention Kernels
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A bibliography and survey of the papers surrounding o1
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
A unified inference and post-training framework for accelerated video generation.
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
Context-parallel attention that accelerates DiT model inference with dynamic caching (https://wavespeed.ai/)
verl: Volcano Engine Reinforcement Learning for LLMs