Starred repositories
Multilingual Document Layout Parsing in a Single Vision-Language Model
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Implement a reasoning LLM in PyTorch from scratch, step by step
Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
A dataset for drone-based detection and tracking, including images, videos, and annotations.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Reference PyTorch implementation and models for DINOv3
PyTorch code and models for the DINOv2 self-supervised learning method.
[ECCV2020] A Large-Scale Face Anti-Spoofing Dataset
The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, PyTorch, and Hugging Face libraries.
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
Robust Python implementation for detecting blurry images using ROI estimation and DCT analysis.
A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!
A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deployment.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Virtual Idol Sharing Project: applies eye and facial landmark detection from a monocular RGB camera to real-time 3D face capture and model driving.
Vietnamese text normalization library
VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models
A comprehensive benchmark of deepfake detection
Python library and CLI tool to interface with Google Translate's text-to-speech API
Virtual whiteboard for sketching hand-drawn like diagrams