- SMART, Singapore
- Singapore
Starred repositories
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
Python package for retrieving current and historical photos from Google Street View
Download Google Street View panoramas efficiently.
Repository for the Street View Image, Pose, and 3D Cities Dataset, used in "Generic 3D Representation via Pose Estimation and Matching" (ECCV 2016)
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
A collection of papers on World Models for Autonomous Driving (and Robotics).
Interpretable time series autoregression for periodicity quantification
Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.
Papers related to diffusion language models
[NeurIPS'22] Tokenized Graph Transformer (TokenGT), in PyTorch
Training framework for Large Behavioral Models
[CVPR 2023] Query-Centric Trajectory Prediction
Geometric Latent Diffusion Models for 3D Molecule Generation
Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch
Official PyTorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space"
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
[NeurIPS 2024] SMART: Scalable Multi-agent Real-time Motion Generation via Next-token Prediction
A Unified Framework for Scalable Vehicle Trajectory Prediction, ECCV 2024
Implementation of MapDiff: "Mask-prior-guided denoising diffusion improves inverse protein folding" in PyTorch
Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)
Recipe for a General, Powerful, Scalable Graph Transformer
Code for EMNLP22 SpaBERT: A Pretrained Language Model from Geographic Data for Geo-Entity Representation.