Starred repositories
Official implementation for "Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter"
Generalised Contrastive Learning. This is a Repository for Google Shopping Dataset and Benchmarks followed by our novel fine-grained contrastive learning framework.
Target Refocusing via Attention Redistribution for Open-Vocabulary Semantic Segmentation: An Explainability Perspective (AAAI 2026)
Official implementation of the "Multimodal Parameter-Efficient Few-Shot Class Incremental Learning" paper
Awesome VLM-CL. Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]
[ICCV'25] Unified Open-World Segmentation with Multi-Modal Prompts
Central repository for biomolecular foundation models with shared trainers and pipeline components
Code for β-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment
Official implementation of "Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation".
Official repository of "Multi-view Pyramid Transformer: Look Coarser to See Broader"
[NeurIPS 2025] AutoSeg3D, online real-time 3D segmentation as instance tracking with long-short term query memory for embodied perception
[AAAI 2025] Official codes of "ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models".
A training-free, mask-free framework for 3D shape editing.
Official repository of LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”
[AAAI 2026] Diffusion-Based Contextual Reconstruction for Point Cloud Segmentation with Limited Annotations
😎 A curated list of CVPR 2025 Oral papers. Total: 96
[arXiv] Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration
ITS3D: Inference-Time Scaling for Text-Guided 3D Diffusion Models
📝The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3"
CauSight: Learning to Supersense for Visual Causal Discovery
Official implementation of "CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models"
[NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling"
[AAAI'26] BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection
[AAAI 2026] Multimodal Robust Prompt Distillation for 3D Point Cloud Models