Starred repositories
Official implementation for "DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion".
A ComfyUI node that brings DyPE to FLUX, enabling artifact-free 4K+ image generation
JoyCaption is an image captioning Visual Language Model (VLM) being built from the ground up as a free, open, and uncensored model for the community to use in training Diffusion models.
Lumina-DiMOO - An Open-Source Multi-Modal Large Diffusion Language Model
The official implementation of "DreamOmni2: Multimodal Instruction-based Editing and Generation"
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
A self-hosted proxy tool for beginners! ArgoSBX one-click, non-interactive script 💣: automatically pairs the Sing-box, Xray, and Argo cores; supports deployment on VPS, Docker, and other container environments; 4 CDN-fronting schemes plus 15 WARP combinations; supported protocols: AnyTLS, Any-reality, Vless-xhttp-reality-vision-enc, Vless-tcp-reality-vision, Vless-xhttp-vision-…
A public-network toolkit for software and hardware routers: IPv6/IPv4 port forwarding, reverse proxy, DDNS, WOL, IPv4 STUN NAT traversal, cron, acme, rclone, FTP, WebDAV, filebrowser
An Open-Source LLM-empowered Foundation TTS System
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Multi-Platform Package Manager for Stable Diffusion
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
To make ComfyUI easier to use, this project optimizes and integrates several commonly used nodes.
ComfyUI node that lets you pick the way in which prompt weights are interpreted
Clip text encoder with BREAK formatting like A1111 (uses conditioning concat)
Custom nodes for ComfyUI such as CLIP Text Encode++
Cloudflare Workers/Pages proxy script (VLESS and Trojan): supports automatic proxyip generation via NAT64, one-click self-hosted proxyip and CF reverse-proxy IPs, a three-region script for Cloudflare's preferred official IPs, and automatic output of the best preferred IPs for the Americas, Asia, and Europe
A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.
Kimi K2 is the large language model series developed by Moonshot AI team
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with, or even surpassing, top TTS …