Shuai Yang YS-IMTech

Ph.D. student @ SJTU & Shanghai AI Lab

66 followers · 77 following

Shanghai Jiao Tong University
Shanghai
@yangshuai1227
https://YS-IMTech.github.io

Achievements

Starred repositories

facebookresearch / sam-3d-objects

SAM 3D Objects

Python 3,466 224 Updated Nov 21, 2025

cambrian-mllm / cambrian-s

Cambrian-S: Towards Spatial Supersensing in Video

Python 383 9 Updated Nov 10, 2025

bobeff / open-source-games

A list of open source games.

8,059 617 Updated Nov 20, 2025

ByteDance-Seed / Depth-Anything-3

Depth Anything 3

Jupyter Notebook 2,704 188 Updated Nov 20, 2025

svg-project / Sparse-VideoGen

[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention

Python 583 30 Updated Nov 18, 2025

nv-tlabs / ChronoEdit

ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation

Python 602 33 Updated Nov 20, 2025

tongjingqi / Thinking-with-Video

We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reache…

Python 204 4 Updated Nov 24, 2025

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,278 44 Updated Nov 19, 2025

krea-ai / realtime-video

Krea Realtime 14B. An open-source realtime AI video model.

Python 392 22 Updated Nov 13, 2025

meituan-longcat / LongCat-Video

Python 1,211 126 Updated Nov 4, 2025

KangLiao929 / Puffin

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Python 344 12 Updated Oct 27, 2025

AutoLab-SAI-SJTU / AutoPage

This is the official implementation for Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1.

HTML 142 10 Updated Oct 27, 2025

EzioBy / Ditto

[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Python 497 41 Updated Oct 29, 2025

jingyaogong / minimind

🚀🚀 Train a 26M-parameter GPT from scratch in just 2h!

Python 34,385 4,022 Updated Nov 19, 2025

SOTAMak1r / Infinite-Forcing

Forked from guandeh17/Self-Forcing

Infinite-Forcing: Towards Infinite-Long Video Generation

Python 92 2 Updated Nov 13, 2025

facebookresearch / dinov3

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 8,476 599 Updated Nov 20, 2025

daydreamlive / scope

A tool for running and customizing real-time, interactive generative AI pipelines and models

Python 78 14 Updated Nov 24, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,581 45 Updated Nov 15, 2025

Robert-gyj / Ctrl-World

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Python 183 12 Updated Oct 25, 2025

thuml / MiniVeo3-Reasoner

Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give it a star 🌟 if you find it useful.

Python 184 6 Updated Oct 12, 2025

Inception3D / TTT3R

A simple state update rule to enhance length generalization for CUT3R

Python 515 14 Updated Oct 1, 2025

ByteDance-Seed / AHN

AHN: Artificial Hippocampus Networks for Efficient Long-Context Modeling

Python 143 5 Updated Oct 17, 2025

Eyeline-Labs / VChain

The official implementation of paper “VChain: Chain-of-Visual-Thought for Reasoning in Video Generation”

102 1 Updated Oct 7, 2025

TencentARC / RollingForcing

Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time

Python 247 9 Updated Oct 31, 2025

NVlabs / LongLive

LongLive: Real-time Interactive Long Video Generation

Python 831 55 Updated Nov 3, 2025

InternLM / CapRL

An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"

Python 145 6 Updated Nov 5, 2025

AlmondGod / tinyworlds

A minimal implementation of DeepMind's Genie world model

Python 1,033 76 Updated Nov 22, 2025

InternLM / SIM-CoT

An official implementation of "SIM-CoT: Supervised Implicit Chain-of-Thought"

Python 100 3 Updated Sep 28, 2025

nv-tlabs / lyra

Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Python 592 32 Updated Oct 2, 2025

Alibaba-NLP / DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,310 1,321 Updated Nov 20, 2025

Starred topics

monocular-depth-estimation