Stars
Must-read Papers on Knowledge Editing for Large Language Models.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Solve Visual Understanding with Reinforced VLMs
Witness the aha moment of VLM with less than $3.
OpenMMLab Pose Estimation Toolbox and Benchmark.
COCO API - Dataset @ http://cocodataset.org/
[TMLR 2025] Efficient Reasoning Models: A Survey
Semi-Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
A cutting-edge collection and survey of vision-language model papers and model GitHub repositories. Continuously updated.
Collection of AWESOME vision-language models for vision tasks
Review of Deep Learning for Polyp Detection and Classification in Colonoscopy (https://doi.org/10.1016/j.neucom.2020.02.123).
A Survey on Benchmarks of Multimodal Large Language Models
✨✨Latest Advances on Multimodal Large Language Models
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
[ICCV'2025] LawDIS: Language-Window-based Controllable Dichotomous Image Segmentation
An open-access book on scientific visualization using Python and Matplotlib
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Trichomonas Vaginalis Segmentation in Microscope Images, MICCAI 2022
Frontiers in Intelligent Colonoscopy [ColonSurvey | ColonINST | ColonGPT]
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.