bys0318

Organizations

@THU-KEG @THUDM


Ongoing research training transformer models at scale

Python 14,883 3,482 Updated Jan 13, 2026

Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"

Python 550 50 Updated Nov 4, 2025

🌿 DeepPrune: Parallel Scaling without Inter-trace Redundancy

Python 18 Updated Oct 10, 2025
Python 81 6 Updated Dec 2, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 4,643 392 Updated Jan 13, 2026

[SIGGRAPH Asia 2025] CHARM: Control-point-based 3D Anime Hairstyle Auto-Regressive Modeling

Python 39 3 Updated Sep 26, 2025

🚀🚀 Efficient implementations of Native Sparse Attention

Python 1,044 12 Updated Sep 29, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 4,135 244 Updated Dec 15, 2025

GLM-SIMPLE-EVALS: The evaluation repository for the GLM-4.5 series of models by Z.ai.

Python 35 6 Updated Oct 17, 2025

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Python 3,766 387 Updated Dec 23, 2025

[COLM 2024] A Survey on Deep Learning for Theorem Proving

213 16 Updated May 28, 2025
Python 39 2 Updated Dec 27, 2025

An efficient implementation of the NSA (Native Sparse Attention) kernel

Python 128 4 Updated Jun 24, 2025
Python 711 17 Updated Nov 20, 2025

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 27,968 5,331 Updated Oct 9, 2025

Scaling RL on advanced reasoning models

Python 657 40 Updated Oct 20, 2025

slime is an LLM post-training framework for RL Scaling.

Python 3,307 413 Updated Jan 13, 2026
Python 32 2 Updated Jun 5, 2025

[SIGGRAPH 2025] PrimitiveAnything: Human-Crafted 3D Primitive Assembly Generation with Auto-Regressive Transformer

Python 379 16 Updated May 13, 2025
Python 19 1 Updated Jun 29, 2025

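ReLE Evaluation: capability evaluation of Chinese AI large language models (continuously updated). Currently covers 335 models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.5, Wenxin ERNIE-X1.1, ERNIE-5.0-Thinking, qwen3-max, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as kimi-k2, ernie4.5, minimax-M2, deepseek-…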

5,398 217 Updated Jan 9, 2026

[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring

Python 265 21 Updated Jul 6, 2025

[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Python 119 7 Updated Jun 11, 2025

Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"

Python 87 6 Updated Feb 26, 2025

[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Python 364 12 Updated Mar 26, 2025

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Python 445 23 Updated Oct 16, 2024

[NeurIPS 2024] AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos

Python 23 1 Updated Dec 6, 2024