Stars
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
A collection of TouchDesigner components for making live-performance easier
Wan: Open and Advanced Large-Scale Video Generative Models
This package allows macOS Finder to display thumbnails, static QuickLook previews, cover art and metadata for most types of video files.
A collection of community-written plugins and extensions for PyTorch's torchdata library.
[WIP] VoiceSmith makes training text to speech models easy.
Materials for Hawley's Deep Learning & AI Ethics course
GitHub mirror of our basic C++ plotting library
A collection of basic effects, available as open-source (MIT) C++ classes.
Musical mel transform for semi/quarter-tone features, written in ONNX-compatible PyTorch for audio AI neural networks
The official code repository for SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Transcription.
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Audio decoding libraries for C/C++, each in a single source file.
PyTorch Implementation of TCSinger 2 (ACL 2025): Customizable Multilingual Zero-shot Singing Voice Synthesis
PyTorch Implementation of VersBand (EMNLP 2025): Versatile Framework for Song Generation with Prompt-based Control
PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
[ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
Dora is an experiment management framework. It expresses grid searches as pure Python files as part of your repo. It identifies experiments with a unique hash signature. Scale up to hundreds of exp…
Framework for writing deep learning training loops. Lightweight, and retaining full freedom to design as you see fit. It handles checkpointing, logging, distributed, compatibility with Dora, and m…
[EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion
An API wrapper for Discord written in Python.
Approaching (Almost) Any Machine Learning Problem
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
GLSL shaders in VDMX's Interactive Shader Format
FLAMO: Frequency-sampling Library for Audio-Module Optimization
Join the community on Discord for more discussions around Neutone! https://discord.gg/VHSMzb8Wqp
A port of the Book of Shaders to TouchDesigner