A collection of tricks and tools to speed up transformer models

TeX 187 10 Updated Oct 29, 2025

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,842 365 Updated Dec 7, 2024

Ticket-grabbing script for Damai (大麦网)

Python 5,248 918 Updated Mar 13, 2024

Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"

1,793 148 Updated Jun 17, 2025

Repo for external large-scale work

Python 6,546 721 Updated Apr 27, 2024

A tool used by the MLNLP community for better paper search. Fully automated scripts for collecting AI-related papers.

Python 1,164 119 Updated Dec 16, 2023

Large-scale pre-training corpus for Chinese (100 GB)

989 82 Updated Oct 17, 2022

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Python 10,622 2,091 Updated Nov 3, 2023

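Chinese and English sensitive words, language detection, location and carrier lookup for Chinese and international mobile/landline numbers, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal name databases, Chinese abbreviation database, character-decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, company name list, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …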

Python 77,088 15,057 Updated May 10, 2024

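Chinese personal name corpus and name generator. Covers Chinese full names, surnames, given names, forms of address, Japanese names, transliterated names, and English names. Useful for Chinese word segmentation and person-name entity recognition.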

4,214 1,009 Updated Nov 9, 2025

Multilingual Reply Suggestion

Python 6 2 Updated Jul 25, 2024

This project proposes a framework for applying topic models to a text corpus and then assigning topic labels to the generated topics.

Jupyter Notebook 35 8 Updated May 14, 2024

Code accompanying EMNLP 2020 paper "Interactive Refinement of Cross-Lingual Word Embeddings".

Python 2 1 Updated Jul 28, 2021

Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)

Python 299 56 Updated Jul 25, 2024

Arsenal of Python utilities.

Python 273 56 Updated Apr 11, 2025

Scripts that help Chinese netizens who use a VPN to circumvent censorship by modifying the routing table so that only censored IPs are routed through the VPN.

Python 3,122 637 Updated Sep 6, 2018

Speedtest script, including PING and DOWNLOAD sorting.

Python 14 5 Updated Dec 15, 2012

Spectacle allows you to organize your windows without using a mouse.

Objective-C 13,650 838 Updated Jan 15, 2022

Vim plugin for intensely nerdy commenting powers

Vim Script 5,010 445 Updated Apr 30, 2025

Auto detect CJK and Unicode file encodings.

Vim Script 66 6 Updated Oct 12, 2021

pathogen.vim: manage your runtimepath

Vim Script 12,144 1,154 Updated Aug 24, 2022