Stars
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Reference PyTorch implementation and models for DINOv3
[CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Code release for ActionFormer (ECCV 2022)
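A simple collection of technical explainer tutorials, focused on interesting, cutting-edge technical concepts and principles. Each article aims to be readable in under 5 minutes.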
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Solve Visual Understanding with Reinforced VLMs
A collection of awesome image inpainting studies.
PyTorch code and models for the DINOv2 self-supervised learning method.
Official repository for "AM-RADIO: Reduce All Domains Into One"
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" (ICLR 2024)