Stars
[PyTorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
A list of awesome papers and resources on recommender systems with large language models (LLMs).
HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling
A collection of industry classics and cutting-edge papers in the field of recommendation/advertising/search.
A Toolbox for MultiModal Recommendation. Integrating 10+ Models...
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
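Open-source code for the third-place solution (0.2592) in the official round of the Tianchi learning competition "Recommender Systems for Beginners".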
Simple image captioning model
This repository contains a script to divide a video into key frames.
[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Research Code for Multimodal-Cognition Team in Ant Group
Note: DO NOT USE IT! THIS CODE IS PROVEN TO CONTAIN DATA LEAKAGE! Archive version of "Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval (CVPR 2024 Highlight)"
Robust Speech Recognition via Large-Scale Weak Supervision
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
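Chinese NLP solutions (large models, data, models, training, inference)
Covers LeetCode, 剑指 Offer (Coding Interviews), frequently asked hand-coded algorithm problems, key ML concepts, probability brain teasers, and more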
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
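Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers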
Google Research
Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval." CVPR 2022
Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.