Stars
GRID: Generative Recommendation with Semantic IDs
[PyTorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
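A minimal sketch of the generative-retrieval idea behind this repo: each item is represented as a short tuple of discrete semantic-ID tokens, so next-item recommendation becomes autoregressive token generation. All class names, dimensions, and the 4-tokens-per-item setup below are illustrative assumptions, not the repo's actual API.

```python
import torch
import torch.nn as nn

# Assumption: each item is encoded as 4 semantic-ID tokens from a
# shared codebook vocabulary of 256 entries.
CODEBOOK_SIZE, IDS_PER_ITEM, D_MODEL = 256, 4, 128

class SemanticIDRecommender(nn.Module):
    """Decoder-only sketch: consume the flattened semantic-ID history,
    predict the next item's semantic-ID tokens one at a time."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(CODEBOOK_SIZE, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, CODEBOOK_SIZE)

    def forward(self, token_ids):                       # (batch, seq)
        x = self.embed(token_ids)
        seq = x.size(1)
        causal = torch.triu(torch.full((seq, seq), float("-inf")), diagonal=1)
        h = self.backbone(x, mask=causal)
        return self.head(h)                             # next-token logits

# A user history of 3 items -> 12 semantic-ID tokens; greedily decode the next item.
model = SemanticIDRecommender()
history = torch.randint(0, CODEBOOK_SIZE, (1, 3 * IDS_PER_ITEM))
for _ in range(IDS_PER_ITEM):
    logits = model(history)
    next_tok = logits[:, -1].argmax(-1, keepdim=True)
    history = torch.cat([history, next_tok], dim=1)
print(history[:, -IDS_PER_ITEM:])                       # predicted item's semantic ID
```

Because the output space is a small codebook vocabulary rather than the full item catalog, decoding stays tractable even for very large catalogs, which is the core appeal of semantic IDs.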
A seq2seq model whose encoder is BERT and whose decoder is a Transformer decoder, suitable for text-generation tasks in NLP
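A minimal PyTorch sketch of that encoder-decoder wiring: a bidirectional (BERT-style) encoder produces memory states, and a causally masked Transformer decoder attends to them via cross-attention. The stand-in encoder and all dimensions are assumptions; in practice one would load pretrained BERT weights as the encoder.

```python
import torch
import torch.nn as nn

VOCAB, D_MODEL = 21128, 256   # 21128 ~ bert-base-chinese vocab size (assumption)

class BertLikeSeq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_embed = nn.Embedding(VOCAB, D_MODEL)
        self.tgt_embed = nn.Embedding(VOCAB, D_MODEL)
        enc_layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)  # stands in for BERT
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.lm_head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src_ids, tgt_ids):
        # No causal mask on the encoder: it reads the source bidirectionally.
        memory = self.encoder(self.src_embed(src_ids))
        t = tgt_ids.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        h = self.decoder(self.tgt_embed(tgt_ids), memory, tgt_mask=causal)
        return self.lm_head(h)

model = BertLikeSeq2Seq()
src = torch.randint(0, VOCAB, (2, 16))   # source token ids
tgt = torch.randint(0, VOCAB, (2, 8))    # shifted target token ids
print(model(src, tgt).shape)             # torch.Size([2, 8, 21128])
```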
Vector (and Scalar) Quantization, in PyTorch
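A minimal vector-quantization layer in the spirit of this library: snap each input vector to its nearest codebook entry and use the straight-through estimator so gradients still reach the encoder. Hyperparameters are illustrative; the actual library offers far more variants (EMA codebook updates, residual VQ, FSQ, and others).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.beta = beta   # commitment loss weight

    def forward(self, x):                               # x: (batch, dim)
        d = torch.cdist(x, self.codebook.weight)        # pairwise L2 distances
        idx = d.argmin(dim=1)                           # nearest code per vector
        q = self.codebook(idx)
        # Codebook term pulls codes toward encoder outputs; commitment term
        # keeps encoder outputs close to their assigned codes.
        loss = F.mse_loss(q, x.detach()) + self.beta * F.mse_loss(x, q.detach())
        q = x + (q - x).detach()                        # straight-through estimator
        return q, idx, loss

vq = VectorQuantizer()
z = torch.randn(8, 64, requires_grad=True)
quantized, codes, vq_loss = vq(z)
print(quantized.shape, codes.shape, vq_loss.item())
```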
RUCAIBox/LC-Rec
Forked from zhengbw0324/LC-Rec. [ICDE'24] Code of "Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation."
HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling
[NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
Decoupling common and unique representations for multimodal self-supervised learning
Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)
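A sketch of the grouped-query attention mechanism the paper describes: many query heads share a smaller set of key/value heads, shrinking the KV cache relative to multi-head attention while retaining more capacity than multi-query attention. The head counts below are assumptions, and `scaled_dot_product_attention` requires PyTorch ≥ 2.0.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    groups = q.size(1) // k.size(1)          # query heads per KV head
    k = k.repeat_interleave(groups, dim=1)   # broadcast KV heads to match query heads
    v = v.repeat_interleave(groups, dim=1)
    return F.scaled_dot_product_attention(q, k, v)

b, seq, head_dim = 2, 10, 32
q = torch.randn(b, 8, seq, head_dim)   # 8 query heads
k = torch.randn(b, 2, seq, head_dim)   # 2 shared KV heads (MHA would use 8, MQA 1)
v = torch.randn(b, 2, seq, head_dim)
print(grouped_query_attention(q, k, v).shape)   # torch.Size([2, 8, 10, 32])
```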
A series of large language models trained from scratch by developers @01-ai
A state-of-the-art open visual language model | multimodal pre-trained model
pkunlp-icler/MIC
Forked from HaozheZhao/MIC. MMICL, a state-of-the-art VLM with in-context learning (ICL) ability, from PKU
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | a Chinese-English bilingual multimodal large model series built on the CPM foundation model
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Code and documentation to train Stanford's Alpaca models, and generate the data.
VideoX: a collection of video cross-modal models
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Instruction Tuning with GPT-4
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)