Starred repositories (li-ronghui)

Open-source Autonomous 3D Characters on the Web

TypeScript 32 3 Updated Nov 10, 2025

Official implementation of [PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning, ICCV'25]

Python 9 2 Updated Oct 31, 2025

The Quest for Generalizable Motion Generation: Data, Model, and Evaluation

16 Updated Oct 31, 2025

[CVPR 2025 Highlight] SkillMimic: Learning Basketball Interaction Skills from Demonstrations

Python 351 27 Updated Jun 30, 2025

A fast multimodal LLM for real-time voice

Python 4,254 345 Updated Sep 2, 2025

Training, validation, and inference code for various SSL approaches and architectures.

Python 67 1 Updated Oct 22, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9,087 827 Updated Nov 3, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 3,094 215 Updated May 19, 2025

(NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis

Python 109 6 Updated Nov 8, 2025

✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,442 178 Updated Mar 28, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 4,705 308 Updated Nov 11, 2025

A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!

Python 1,162 138 Updated Jan 30, 2025

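"A Guide to Using Open-Source Large Models" (《开源大模型食用指南》): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment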

Jupyter Notebook 25,898 2,605 Updated Nov 10, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,326 7,542 Updated Nov 12, 2025

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 858 70 Updated Aug 27, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,030 2,091 Updated Nov 12, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

Python 9,463 732 Updated Sep 22, 2025

✨✨Latest Advances on Multimodal Large Language Models

16,676 1,075 Updated Nov 12, 2025

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,782 297 Updated Jun 12, 2025

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,693 1,164 Updated Nov 14, 2024

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,215 89 Updated Sep 22, 2025

Implementation of Autoregressive Diffusion in Pytorch

Python 416 11 Updated Nov 3, 2024

Python 11 Updated Sep 29, 2024

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 3,704 251 Updated Sep 25, 2025

Code to reproduce the results for our SIGGRAPH 2023 paper "Listen Denoise Action"

Python 177 27 Updated Sep 20, 2023

Official Dataset Toolbox of the paper "[CVPR 2023]NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions" and "[CVPR2024]HOI-M3: Capture Multiple Humans and Objects Interact…

Python 67 7 Updated Aug 13, 2024

Virtual Community: An Open World for Humans, Robots, and Society

Python 176 10 Updated Oct 26, 2025