bokesyo

🐁

On vacation

Boke Syo bokesyo

63 followers · 94 following

@OpenBMB
Beijing
UTC +08:00
bokaixu.site

Lists (1)

🚀 My stack

Stars

1038lab / ComfyUI-MiniCPM

A custom ComfyUI node for MiniCPM vision-language models, supporting v4, v4.5, and v4 GGUF formats, enabling high-quality image captioning and visual analysis.

Python 144 14 Updated Aug 28, 2025

stepfun-ai / Step-Audio2

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 1,302 94 Updated Sep 22, 2025

neuphonic / neucodec

A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.

Python 140 19 Updated Oct 7, 2025

jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,789 1,383 Updated Dec 6, 2023

OpenSQZ / MiniCPM-V-CookBook

Cook up amazing multimodal AI applications effortlessly with MiniCPM-o

Python 237 25 Updated Dec 10, 2025

QwenLM / Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,262 204 Updated Jan 8, 2026

MatthewCYM / VoiceBench

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 319 19 Updated Dec 11, 2025

BUTSpeechFIT / DiariZen

A toolkit for speaker diarization.

Jupyter Notebook 373 40 Updated Dec 9, 2025

LAION-AI / CLAP

Contrastive Language-Audio Pretraining

Python 1,992 202 Updated May 15, 2025

thunlp / TritonBench

TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators

Python 109 12 Updated Jun 14, 2025

showlab / Show-o

[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,853 82 Updated Jan 8, 2026

asuni / wavelet_prosody_toolkit

Python 196 46 Updated May 3, 2024

Sphere-AI-Lab / FormalMATH-Bench

Python 74 5 Updated Jan 8, 2026

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,275 747 Updated May 31, 2024

Authenticator-Extension / Authenticator

Authenticator generates 2-Step Verification codes in your browser.

TypeScript 4,287 1,055 Updated Nov 26, 2025

edwko / OuteTTS

Interface for OuteTTS models.

Python 1,419 114 Updated Jun 21, 2025

SparkAudio / Spark-TTS

Spark-TTS Inference Code

Python 10,896 1,169 Updated Apr 9, 2025

HKUDS / Auto-Deep-Research

"Your Fully-Automated Personal AI Assistant"

Python 1,348 191 Updated Oct 16, 2025

allenai / olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Python 16,740 1,327 Updated Jan 13, 2026

huggingface / lerobot

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 20,924 3,501 Updated Jan 15, 2026

SIGRobotics-UIUC / LeKiwi

LeKiwi - Low-Cost Mobile Manipulator

1,144 126 Updated Jul 15, 2025

jxlpzqc / TMSpeech

A slacking-off tool for Tencent Meeting

C# 1,180 106 Updated Dec 21, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,432 338 Updated Jan 5, 2026

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 38,899 4,682 Updated Aug 19, 2024

jishengpeng / TextrolSpeech

[ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models

Python 182 5 Updated Nov 22, 2024

zhenye234 / X-Codec-2.0

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 343 48 Updated Jul 21, 2025

unslothai / unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 50,717 4,188 Updated Jan 14, 2026

Physical-Intelligence / openpi

Python 9,822 1,343 Updated Dec 27, 2025

hubertsiuzdak / snac

Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python 741 41 Updated Nov 19, 2024

JerryWu-code / TinyZero

Forked from Jiayi-Pan/TinyZero

A small-scale reproduction of DeepSeek R1-Zero on two A100s.

Python 82 28 Updated Feb 1, 2025