Stars
Tools for various benchmarking scenarios of the Weaviate Query Agent
cuVS - a library for vector search and clustering on the GPU
LOFT: A 1 Million+ Token Long-Context Benchmark
Highly Performant, Modular, Memory Safe and Production-ready Inference, Ingestion and Indexing built in Rust
FastAPI framework, high performance, easy to learn, fast to code, ready for production
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
High-Performance Engine for Multi-Vector Search
What does gpt-oss tell us about OpenAI's training data?
How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models
Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.
Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism (NIPS'25)
Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
On the Theoretical Limitations of Embedding-Based Retrieval
PageIndex: Document Index for Reasoning-based RAG
XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR.
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.