Stars
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
[NeurIPS 2025] VADTree: Explainable Training-Free Video Anomaly Detection via Hierarchical Granularity-Aware Tree
The SAIL-VL2 series model developed by the BytedanceDouyinContent Group
[TITS 2024] You Only Look Clusters for Tiny Object Detection in Aerial Images
Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
[ECCV 2024] Video Foundation Models & Data for Multimodal Understanding
[CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model"
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
[ICCV 2025] SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning. Paper is available at https://arxiv.org/abs/2410.14987
AD-DINOv3: Enhancing DINOv3 for Zero-Shot Anomaly Detection with Anomaly-Aware Calibration
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[DEIMv2] Real Time Object Detection Meets DINOv3
The real-time instance segmentation algorithm SparseInst running on TensorRT and ONNX
PyTorch and ncnn implementation of PPYOLOE, YOLOX, PPYOLO, PPYOLOv2, PicoDet and so on.
Minimal PyTorch implementation of SOLOv2.
All-in-one training for vision models (YOLO, ViTs, RT-DETR, DINOv3): pretraining, fine-tuning, distillation.
HiAD: A General Framework for High-Resolution Anomaly Detection
🚀 Pure C++ TensorRT deployment of SOLOv2 and similar models with no libtorch dependency; can be quickly ported to NX/TX2.
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
All of the n8n workflows I could find (also from the site itself)
Awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana) state-of-the-art image generation and editing model. Explore AI generated visuals created with…
21 Lessons, Get Started Building with Generative AI