Stars
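Real-time voice-interactive digital human with customizable appearance and voice, support for voice cloning, and conversation latency (initial packet delay) as low as 3 s.
Android hybrid push SDK: quickly integrates push services from 6 vendors, shares the system push channel, still receives pushes after the app is killed, and reaches a push delivery rate above 90%.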
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
🚀 Truly open-source AI avatar (digital human) toolkit for offline video generation and digital human cloning.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
No fortress, purely open ground. OpenManus is Coming.
Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.
SymmetricDS is database replication and file synchronization software that is platform independent, web enabled, and database agnostic. It is designed to make bi-directional data replication fast, …
Ip2region is an offline IP address manager framework and locator with both IPv4 and IPv6 supported, supporting billions of data segments, ten microsecond searching performance, xdb search client fo…
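This project implements Wav2lip video lip synthesis based on SadTalkers. It generates lip shapes driven by speech from a video file, and applies a configurable enhancement method to the face region to sharpen the synthesized lip (face) area and improve the clarity of the generated lips. It uses the DAIN deep-learning frame-interpolation algorithm to add frames to the generated video, filling in the motion transitions between frames so that the synthesized lips look smoother, more realistic, and more natural.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation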
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
🚀 The best real-time interactive AI avatar (digital human) with on-premise deployment and <1.5 s latency.
English-Japanese Dictionary data (Public Domain) EJDict-hand
JMdict, JMnedict, KANJIDIC for Yomitan/Yomichan.
The Java server library for the App Store Server API and App Store Server Notifications.
Flutter video player plugin for all desktop and mobile platforms. Download prebuilt examples from GitHub Actions. https://pub.dev/packages/fvp
A Flutter plugin that exposes device-specific text to speech recognition capability.
Flutter App That Can Transcribe Audio Offline/On Device with Whisper C++ Bindings via Rust
Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.
Code release for NeRF (Neural Radiance Fields)
A collection of AI-related utilities. Issues and pull requests are welcome.
JonathanFly / bark
Forked from suno-ai/bark
🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model