Stars
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025)
✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension"
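This project collects the knowledge points and code implementations commonly tested in Machine Learning, Deep Learning, and NLP interviews. It also covers the theoretical fundamentals every algorithm engineer must know.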
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
GeoIntel uses Google's Gemini API to uncover the location where photos were taken through AI-powered geolocation analysis.
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
✨✨Latest Advances on Multimodal Large Language Models
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
[NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
OpenMMLab Detection Toolbox and Benchmark
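Classroom attentiveness and exam cheating detection system with dynamic classroom roll call. Combines emotion recognition, facial expression recognition, pose recognition, and face recognition.
🏨 TopView Studio first-round assessment project: a hotel management system that provides room browsing, fuzzy room search, room booking, personal information management, and room and hotel information management (administrator), among other features. The backend is implemented with Java, Tomcat, MySQL, Servlet, and JSP, without using any framework.
A hotel management system built on the SSM framework and a MySQL database.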