- Santa Clara
- https://dragonlong.github.io/
- @lxiaol9
Stars
Simulation benchmark from Toyota Research Institute containing 49 tasks that measure the performance of Large Behavior Model policies
Post-training scripts and samples for NVIDIA Cosmos ecosystem
Collection of step-by-step playbooks for setting up AI/ML workloads on NVIDIA DGX Spark devices with Blackwell architecture.
[IROS 2025 Award Finalist] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks
Enjoy the magic of Diffusion models!
🌀 R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)
The official Python SDK for Model Context Protocol servers and clients
Efficient Part-level 3D Object Generation via Dual Volume Packing
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.
[ICML 2025] Hypo3D: Exploring Hypothetical Reasoning in 3D
Python package to create manipulation scenes.
[NeurIPS 2025 Spotlight] SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Train your Agent model via our easy and efficient framework
Reference PyTorch implementation and models for DINOv3
Wan: Open and Advanced Large-Scale Video Generative Models
The absolute trainer to light up AI agents.
[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal Prompting
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
Train transformer language models with reinforcement learning.