Stars
Minimal RNN classifier with self-attention in PyTorch
Anserini is a Lucene toolkit for reproducible information retrieval research
PyTorch deep learning models for document classification
Facilitating the design, comparison and sharing of deep text matching models.
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
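Chinese and English sensitive words, language detection, domestic and international mobile/phone number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation lexicon, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, comprehensive company name list, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, …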
“Chorus” of recommendation models: a light and flexible PyTorch framework for Top-K recommendation.
KDD'2022: Towards Representation Alignment and Uniformity in Collaborative Filtering
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports comp…
Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
An easy-to-use Python toolkit for flexibly adapting various neural ranking models to a target domain.
An unbiased learning-to-rank dataset from Baidu
Instruct-tune LLaMA on consumer hardware
Making large AI models cheaper, faster and more accessible
ChatGLM-6B: An Open Bilingual Dialogue Language Model
The official repo for our SIGIR'23 Full paper: Structure-aware Pre-trained Language Model for Legal Case Retrieval
The official repo for our SIGIR'23 Full paper: Constructing Tree-based Index for Efficient and Effective Dense Retrieval
T2Ranking: A large-scale Chinese benchmark for passage ranking.
Code to reproduce THUIR's submissions for COLIEE 2023 Task 1 and Task 2
Simple and efficient multi-GPU fine-tuning of large models with DeepSpeed + Trainer
Build, evaluate, understand, and fix LLM-based apps
🕹️ A basic gameboy emulator with terminal "Cloud Gaming" support
A series of large language models developed by Baichuan Intelligent Technology