View hasaki321's full-sized avatar


[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 495 17 Updated Nov 12, 2025

Sylber: Syllabic Embedding Representation of Speech from Raw Audio

Jupyter Notebook 68 4 Updated Mar 17, 2025

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python 4,356 315 Updated Jun 21, 2025

DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast voice synthesis.🐙

Python 46 4 Updated Nov 17, 2025

High-performance Image Tokenizers for VAR and AR

Python 297 6 Updated Apr 25, 2025

[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Python 1,413 64 Updated Mar 16, 2025

Advanced GRAG implementation for ComfyUI with beginner-friendly and expert modes

Python 12 2 Updated Nov 6, 2025

https://little-misfit.github.io/GRAG-Image-Editing/

Python 102 2 Updated Nov 5, 2025

Official implementation of "Continuous Autoregressive Language Models"

Python 582 71 Updated Nov 10, 2025

FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates

Python 32 3 Updated Nov 4, 2025

ARTalk generates realistic 3D head motions (lip sync, blinking, expressions, head poses) from audio in ⚡ real-time ⚡.

Python 100 16 Updated Jun 12, 2025

A collection of awesome text-to-image generation studies.

TeX 702 35 Updated Oct 23, 2025

PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437

Python 1,193 65 Updated Feb 25, 2025

This is a repo to track the latest autoregressive visual generation papers.

409 5 Updated Jun 25, 2025

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,787 109 Updated Sep 27, 2024

[ICCV 25] SpectralAR: Spectral Autoregressive Visual Generation

35 1 Updated Jun 13, 2025

[CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization

Python 46 3 Updated Jul 22, 2025

Implementation of "Hyperspherical Latents Improve Continuous-Token Autoregressive"

Python 77 6 Updated Nov 15, 2025

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,479 546 Updated Nov 10, 2025

Meaningful titles for tabs and PDF downloads! Also supports tab search.

JavaScript 329 24 Updated Sep 13, 2025

Contexts Optical Compression

Python 20,578 1,773 Updated Oct 25, 2025

😎 Awesome lists about all kinds of interesting topics

415,530 32,275 Updated Nov 12, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

4,206 249 Updated Sep 19, 2025

The source code of NFIG

Python 4 Updated Oct 14, 2025

A curated list of awesome autoregressive papers in Generative AI

125 4 Updated Sep 26, 2025

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,701 1,165 Updated Nov 14, 2024

Foundation Models and Data for Human-Human and Human-AI interactions.

Python 312 18 Updated Aug 16, 2025

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Python 698 155 Updated Jul 12, 2022

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 1,144 141 Updated Sep 5, 2024

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 13,606 2,000 Updated Nov 9, 2025