NTU
Singapore
Stars
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
A curated list of resources for articulated object understanding.
[CVPR'24] Consistent Novel View Synthesis without 3D Representation
openvla / openvla
Forked from TRI-ML/prismatic-vlms. OpenVLA: An open-source vision-language-action model for robotic manipulation.
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Democratization of RT-2 "RT-2: New model translates vision and language into action"
RynnEC: Bringing MLLMs into Embodied World
[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
Example models using DeepSpeed
A hands-on tutorial that walks you step by step through writing a GPT from scratch and training a large language model.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-View Images
[CVPR 2024 Oral, Best Paper Runner-Up] Code for "pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction" by David Charatan, Sizhe Lester Li, Andrea Tagliasacch…
[ICLR 2025] HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction
Segment Anything in 3D with NeRFs (NeurIPS 2023 & IJCV 2025)
[IROS 2025] LiDAR-Augmented Gaussian Splatting and Neural SDF for Geometrically Consistent Rendering and Reconstruction
Official implementation of BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting
[NeurIPS'22] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction