Huazhong University of Science and Technology
Stars
[NeurIPS'25] Official repository of Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)"
[NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Official code repository of Shuffle-R1
[NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
[ICCV 2025] ACE-G is an architecture and pre-training scheme to improve generalization for scene coordinate regression-based visual relocalization.
Official implementation of Spatial-Forcing: Implicit Spatial Representation Alignment for Vision-Language-Action Model
[NeurIPS 2025] Pixel-Perfect Depth
UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
[NeurIPS 2025] DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
Official implementation of Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
Official implementation for "JanusVLN: Decoupling Semantics and Spatiality with Dual Implicit Memory for Vision-Language Navigation"
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS 2024] Official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs".
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
Official repo of the paper "Reconstruction Alignment Improves Unified Multimodal Models", unlocking massive zero-shot potential in unified multimodal models through self-supervised learning.
Official implementation of "VIRAL: Visual Representation Alignment for MLLMs".
Benchmarking Knowledge Transfer in Lifelong Robot Learning
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision