Stars
Project page for "MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition"
Mapping MediaPipe's 52 blendshapes to FLAME's expression coefficients and poses.
Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various r…
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
[EMNLP2024 Demo], [ICASSP 2025] A user-friendly library for reproducible video moment retrieval and highlight detection. It also supports audio moment retrieval.
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
[NeurIPS 2020] Official code for the paper "DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation". Includes a PyTorch library for deep learning with SVG data.
An integrated Japanese analyzer based on foundation models
Library to build speech synthesis systems designed for easy and fast prototyping.
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Repository for the YACIS corpus, with information on how to obtain the full corpus and its annotations.
Neural network-based singing voice synthesis library for research
Robust Speech Recognition via Large-Scale Weak Supervision
Context labels and pronunciation data for the JSUT corpus
Bring projects, wikis, and teams together with AI. AppFlowy is the AI collaborative workspace where you achieve more without losing control of your data. The leading open source Notion alternative.
Google AI 2018 BERT PyTorch implementation
Siamese and triplet networks with online pair/triplet mining in PyTorch
Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters"
A deep learning project on the Manga109 dataset using YOLOv3
ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)