wusongyuan
RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.

Python 1,116 105 Updated Nov 7, 2025

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,304 325 Updated Nov 7, 2025

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 63,168 9,283 Updated Nov 6, 2025

The best ChatGPT that $100 can buy.

Python 36,207 4,246 Updated Nov 5, 2025

A Robust Approach for LiDAR-Inertial Odometry Without Sensor-Specific Modelling

C++ 335 19 Updated Nov 5, 2025

[ICRA 2025] Interactive4D: Interactive 4D LiDAR Segmentation

Python 96 6 Updated May 7, 2025

The Most Faithful Implementation of Segment Anything (SAM) in 3D

Python 349 16 Updated Sep 11, 2024

BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation,…

TypeScript 9,906 1,630 Updated Nov 7, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,263 2,455 Updated Nov 9, 2025

🚀 The fast, Pythonic way to build MCP servers and clients

Python 20,110 1,479 Updated Nov 9, 2025

Visual Studio Code

TypeScript 178,391 36,047 Updated Nov 9, 2025

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,246 100 Updated Oct 29, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,706 979 Updated Nov 6, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,930 286 Updated May 15, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,855 897 Updated Sep 30, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,869 739 Updated Oct 15, 2025

Deep Learning model implementation for Fire detection both classification and segmentation from the FLAME dataset.

Python 27 2 Updated Dec 12, 2022

Open source alternative to Gemini Deep Research. Generate reports with AI based on search results.

TypeScript 2,094 198 Updated Mar 15, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 15,506 1,119 Updated Nov 8, 2025

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 114,700 15,993 Updated Nov 9, 2025

Fully open reproduction of DeepSeek-R1

Python 25,619 2,401 Updated Sep 8, 2025

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 23,811 2,043 Updated Sep 12, 2025

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,199 1,664 Updated Sep 24, 2025

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

TypeScript 4,731 420 Updated Oct 30, 2025

NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, con…

Python 2,760 273 Updated Nov 8, 2025

Official Repo For "Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos"

Python 1,392 97 Updated Nov 4, 2025

KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…

Python 8,167 620 Updated Sep 22, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,242 420 Updated Nov 9, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 16,140 1,287 Updated Oct 27, 2025