WenmuZhou

zhoujun WenmuZhou

675 followers · 55 following

Shanghai
https://www.zhihu.com/people/wenmu_zj

Lists (4)

🔮 Future ideas

✨ Inspiration

🚀 My stack

Frameworks

1 repository

Stars

WenmuZhou / Torch_Quant_Demo

A demo of quantization training using torch

Python 13 Updated Aug 21, 2023

RubanSeven / Text-Image-Augmentation-python

Python implementation of Text-Image-Augmentation

Python 247 50 Updated Jun 3, 2020

WenmuZhou / OCR_DataSet

Collects and organizes OCR-related datasets and unifies their annotation format for use in experiments

Python 949 199 Updated Nov 28, 2023

songdejia / EAST

This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

Python 578 146 Updated Apr 6, 2019

apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

C++ 20,833 6,752 Updated Oct 25, 2023

ArvinMei / py2so

Compiles .py files into .so files to better hide the source code

Python 229 126 Updated Apr 21, 2020

whai362 / PSENet

Official Pytorch implementations of PSENet.

Python 1,187 343 Updated Apr 7, 2023

OpenGVLab / SDLM

Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation length and maintaining KV-cache compatibility, achieving high eff…

Python 70 1 Updated Oct 17, 2025

AIDC-AI / Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python 1,405 82 Updated Sep 22, 2025

ShaohonChen / Qwen3-SmVL

Splices the vision head of SmolVLM2 onto the Qwen3-0.6B model and fine-tunes the combined model

Python 429 45 Updated Sep 8, 2025

chatdoc-com / OCRFlux

OCRFlux is a lightweight yet powerful multimodal toolkit that significantly advances PDF-to-Markdown conversion, excelling in complex layout handling, complicated table parsing and cross-page conte…

Python 2,379 146 Updated Aug 4, 2025

TGSAN / CMWTAT_Digital_Edition

CloudMoe Windows 10/11 Activation Toolkit: obtains a digital license (digital entitlement); the best open source Win 10/11 activator on GitHub.

C# 18,578 2,146 Updated Jul 2, 2025

zjx0101 / ObjectClear

ObjectClear: Complete Object Removal via Object-Effect Attention

Python 496 35 Updated Sep 21, 2025

wtybest / FreeFlux

[ICCV 2025] FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing

Jupyter Notebook 65 1 Updated Sep 3, 2025

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 27,853 2,767 Updated Apr 30, 2025

OPPO-Mente-Lab / X2I

Official code for ICCV 2025 paper, X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation

Python 86 3 Updated Jun 26, 2025

camel-ai / owl

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 18,327 2,123 Updated Sep 24, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 11,033 968 Updated Nov 13, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,693 2,129 Updated Jul 17, 2025

Alpha-VLLM / Lumina-Image-2.0

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

Python 818 57 Updated Nov 3, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,640 2,399 Updated Sep 8, 2025

longtaojiang / SmartEraser

[CVPR 2025] Official implementation of the paper "SmartEraser: Remove Anything from Images using Masked-Region Guidance".

Python 175 8 Updated Jul 1, 2025

leeruibin / RORem

[CVPR2025] RORem: Training a Robust Object Remover with Human-in-the-Loop

Python 57 2 Updated Sep 9, 2025

deepseek-ai / DeepSeek-V3

Python 100,267 16,336 Updated Aug 28, 2025

Picsart-AI-Research / StreamingT2V

[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Python 1,610 160 Updated Mar 27, 2025

Anonym0u3 / AttentiveEraser

Official implementation of the paper "Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance" (AAAI 2025 Oral)

Jupyter Notebook 198 9 Updated May 9, 2025

TencentARC / BrushEdit

[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"

Python 581 28 Updated Sep 3, 2025

OpenOCR: A general OCR system with accuracy and efficiency. Supporting 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and will continue to add the latest methods.

Python 780 66 Updated Aug 27, 2025

bcmi / Awesome-Generative-Image-Composition

A curated list of papers, code, and resources pertaining to generative image composition or object insertion.

Python 139 7 Updated Oct 25, 2025