Facebook
- San Francisco
Stars
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
A realtime serving engine for Data-Intensive Generative AI Applications
📑 PageIndex: Document Index for Reasoning-based RAG
A flexible, adaptive classification system for dynamic text classification
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
OCR toolbox from Davar-Lab
A Curated List of Awesome Table Structure Recognition (TSR) Research, including models, papers, datasets, and code. Continuously updated.
Fault-tolerant async actors for Rust that scale seamlessly
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Mod…
Guideline-following Large Language Model for Information Extraction
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
A cloud native embedded storage engine built on object storage.
WIP - Allows you to create DSPy pipelines using ComfyUI
Durable workflow automation in just a few lines of code
Query your PDF documents and get more insights from them
In-memory vector store with efficient read and write performance for semantic caching and retrieval systems. Redis for Semantic Caching.
LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progres…
SQLSync is a collaborative offline-first wrapper around SQLite. It is designed to synchronize web application state between users, devices, and the edge.
Rate limiting, caching, and request prioritization for modern workloads
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.