- Institute of Computing Innovation, Zhejiang University; Master of The Hong Kong Polytechnic University
- Hangzhou
- https://www.zhihu.com/people/leeshimin
- https://twitter.com/younglishimin?s=21&t=zmqUbpol0Bc6wAIytuAIUQ
Starred repositories
(ICCV 2025) This repository is the official implementation of AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
iFlow cli is a comprehensive command-line intelligence that embeds in your terminal, analyzes your repositories, does coding tasks, interprets your needs across contexts, and boosts efficiency by p…
Wan: Open and Advanced Large-Scale Video Generative Models
SGLang is a fast serving framework for large language models and vision language models.
Official inference repo for FLUX.1 models
PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
The public code of the EMNLP 2023 (main conference) paper "TLM: Token-Level Masking for Transformers"
The original Backpack Language Model implementation, a fork of FlashAttention
LAVIS - A One-stop Library for Language-Vision Intelligence
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
QLoRA: Efficient Finetuning of Quantized LLMs
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
MTEB: Massive Text Embedding Benchmark
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Incorporating Instructional Prompts into A Unified Generative Framework for Joint Multiple Intent Detection and Slot Filling (COLING 2022, Oral)
[CVPR 2022--Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation
Novel deep learning research works with PaddlePaddle
Text-to-Image generation. The repo for NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".
Source code for the ACL 2021 paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction"
A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset