- Kyushu University
- https://rishiyama.github.io/
- https://orcid.org/0009-0007-0162-9950
Stars
Official code of "LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer"
(CVPR 2025) Code of "Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models"
(Siggraph Asia 2025) Code of "LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization"
Script for parsing kanji data from the KANJIDIC2 project
[CVPR 2025] Mr. DETR: Instructive Multi-Route Training for Detection Transformers
The ultimate training toolkit for finetuning diffusion models
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
Official repository for CalliReader: Contextualizing Chinese Calligraphy via an Embedding-aligned Vision Language Model [ICCV 2025]
A comprehensive list of awesome document image rectification papers.
[CVPR 2025 Highlight] TinyFusion: Diffusion Transformers Learned Shallow
Vector (and Scalar) Quantization, in Pytorch
Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch
[ICCV2025] Official implementation for “OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography”
Minimal Implementation of a D3PM in pytorch
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight) / When Does Perceptual Alignment Benefit Vision Representations? (NeurIPS 2024)
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
[ICCV'23 Oral] The introduction and toolkit for EqBen Benchmark
PyTorch implementation of "Neural Optimal Transport" (ICLR 2023 Spotlight)
A collection of AWESOME things about Optimal Transport in Deep Learning
Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022)
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
This repo contains the code for 1D tokenizer and generator
[CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution'
Example code for the Siggraph Asia Tutorial CreativeAI
Deep Learning & Information Bottleneck