Perceptual Quality Estimator for speech and audio

C++ 850 139 Updated May 17, 2025

Official Repository of Paper: "Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios" (AAAI 2026)

Python 65 4 Updated Dec 27, 2025

[mirror] Go Tools

Go 7,851 2,352 Updated Dec 27, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 34,450 3,327 Updated Dec 29, 2025

Public repository of the Micro QuickJS Javascript Engine

C 4,710 164 Updated Dec 29, 2025

Public repository of the QuickJS Javascript Engine.

C 10,189 1,061 Updated Dec 22, 2025

SoFlow: Solution Flow Models for One-Step Generative Modeling

Python 94 4 Updated Dec 22, 2025

The official implementation of HierSpeech++

Python 1,240 150 Updated Feb 20, 2024

Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"

Python 541 50 Updated Nov 4, 2025

A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)

Python 36 3 Updated Dec 24, 2025

Unsupervised Speech Decomposition Via Triple Information Bottleneck

Python 696 96 Updated Oct 23, 2024

Fun-Audio-Chat is a Large Audio Language Model built for natural, low-latency voice interactions.

Python 459 42 Updated Dec 25, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,628 7,837 Updated Dec 29, 2025

[Paper][AAAI 2025] (MyGO) Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation

Python 271 8 Updated Jul 28, 2025

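✨ Agentic IM ChatBot Infrastructure ✨ Integrates multiple messaging platforms (QQ / Telegram / WeCom / Feishu / DingTalk, etc.), with a powerful, easy-to-use plugin system and support for OpenAI / Gemini / Anthropic / Dify / Coze / Alibaba Cloud Bailian / knowledge bases / Agents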

Python 14,592 1,141 Updated Dec 29, 2025
TypeScript 8,243 545 Updated Dec 22, 2025

GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning

Python 810 100 Updated Dec 17, 2025

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 2,644 204 Updated Dec 23, 2025

Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM

Python 243 25 Updated Dec 23, 2025

AllenAI's post-training codebase

Python 3,481 477 Updated Dec 29, 2025

A meta-language for Go that adds Result types, error propagation (?), and pattern matching while maintaining 100% Go ecosystem compatibility

Go 1,372 25 Updated Dec 12, 2025

The Custom Go programming language for scientific/mathematics computing !!!!

Go 67 Updated Dec 28, 2025

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 4,035 582 Updated Apr 24, 2024

Multimodal speech quality QA system that turns objective assessment into a natural-language task using audio encoders (AST/Whisper) and a LLaMA-based “quality expert” to predict MOS, dimension-wise…

Python 2 Updated Dec 16, 2025

GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters

Python 630 57 Updated Dec 29, 2025
Python 569 57 Updated Sep 23, 2025

A better, safer rm command for you

Shell 243 12 Updated Dec 9, 2025

An Open-Ended Embodied Agent with Large Language Models

JavaScript 6,561 627 Updated Apr 3, 2024
Python 137 29 Updated Jun 30, 2015