Shanghai Jiao Tong University
- Shanghai, China
- gszfwsb.github.io
- @ShaoboWang6
Starred repositories
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
- Tools for merging pretrained large language models.
- The official implementation for [NeurIPS 2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
- Enjoy the magic of Diffusion models!
- Search Self-Play: Pushing the Frontier of Agent Capability without Supervision
- Scaling Preference Data Curation via Human-AI Synergy
- Code repository for Group-MATES: Group-Level Data Selection for Efficient Pretraining
- gszfwsb / Socratic-Zero (forked from Frostlinx/Socratic-Zero): a fully autonomous framework that generates high-quality training data for mathematical reasoning.
- The official implementation of Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1.
- ERGO (Efficient Reasoning & Guided Observation) is a large vision–language model trained with reinforcement learning on efficiency objectives.
- Official implementation of the paper SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training.
- 2026 AI/ML internship & new-graduate job list, updated daily.
- The best repository showing why transformers might not be the answer for time-series forecasting, showcasing the best SOTA non-transformer models.
- Towards a Unified View of Large Language Model Post-Training
- [Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
- Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
- Search-R1: an efficient, scalable RL training framework for LLMs that interleave reasoning with search-engine calls, based on veRL.
- A version of verl that supports diverse tool use.
- Pretraining and inference code for a large-scale depth-recurrent language model.