Organizations

@fraunhofer-iais @Modalities

"DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)"

Python 10,806 1,472 Updated Nov 20, 2025

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 11,742 1,717 Updated Apr 26, 2025

Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech

Python 32 4 Updated Sep 18, 2024

A lightweight LMM-based Document Parsing Model

Python 6,302 435 Updated Nov 19, 2025

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

Python 7,825 643 Updated Nov 6, 2025

Text-audio foundation model from Boson AI

Python 7,667 568 Updated Sep 15, 2025

ACE-Step: A Step Towards Music Generation Foundation Model

Python 3,338 391 Updated Jun 27, 2025

Super-fast Structured Outputs

Rust 619 41 Updated Nov 24, 2025

DFloat11: Lossless LLM Compression for Efficient GPU Inference

Python 563 33 Updated Nov 24, 2025

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,672 297 Updated Aug 14, 2025

tiny vision language model

Python 8,941 692 Updated Nov 14, 2025

OmniGen2: Exploration to Advanced Multimodal Generation.

Jupyter Notebook 3,947 9 Updated Sep 30, 2025

Converts text to speech in realtime

Python 3,640 351 Updated Jul 22, 2025

Foundational lecture notes for Tonmeister (sound engineering) students (2000)

TeX 6 Updated Jul 13, 2025

Python 63 3 Updated Jan 27, 2025

Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin9…

Python 23,870 2,596 Updated Nov 15, 2025

FLUX, Stable Diffusion, SDXL, SD3, LoRA, Fine Tuning, DreamBooth, Training, Automatic1111, Forge WebUI, SwarmUI, DeepFake, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, Comfy…

JavaScript 2,595 355 Updated Nov 25, 2025

19 2 Updated Jun 13, 2024

A python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.

Python 2,721 439 Updated Oct 4, 2025

Official codes of CCSRv2 and CCSRv1: Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution

Python 576 45 Updated Jul 17, 2025

🌐 Make websites accessible for AI agents. Automate tasks online with ease.

Python 72,976 8,695 Updated Nov 26, 2025

Machine Learning for Imbalanced Data, published by Packt

Jupyter Notebook 277 81 Updated Nov 11, 2025

Instructional notebooks on music information retrieval.

Jupyter Notebook 1,258 416 Updated Nov 3, 2025

Understanding Deep Learning - Simon J.D. Prince

Jupyter Notebook 8,510 1,962 Updated Nov 18, 2025

[ECCV 2024] codes of DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior

Python 3,951 346 Updated Jul 29, 2025