diptanu
  • Facebook
  • San Francisco


Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.

Go 1,228 52 Updated Nov 7, 2025

A realtime serving engine for Data-Intensive Generative AI Applications

Rust 1,062 139 Updated Nov 7, 2025

📑 PageIndex: Document Index for Reasoning-based RAG

Python 3,800 275 Updated Nov 7, 2025

A flexible, adaptive classification system for dynamic text classification

Python 493 32 Updated Oct 7, 2025

Table Structure Recognition

Python 27 1 Updated Jul 25, 2024

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

Python 474 59 Updated Jul 20, 2025
Jupyter Notebook 387 59 Updated Jan 7, 2024

OCR toolbox from Davar-Lab

Python 9 2 Updated Jan 8, 2024

A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating.

219 11 Updated Sep 9, 2024

Rust actor framework

Rust 1,864 110 Updated Nov 3, 2025

Fault-tolerant async actors for Rust that scale seamlessly

Rust 1,065 50 Updated Nov 7, 2025

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

C++ 1,794 201 Updated Apr 9, 2025

A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Mod…

226 22 Updated Jun 17, 2025

Guideline following Large Language Model for Information Extraction

Python 409 28 Updated Oct 27, 2024

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Python 1,756 135 Updated Apr 14, 2025

CBOR: Concise Binary Object Representation

Rust 80 10 Updated Oct 29, 2025

A cloud native embedded storage engine built on object storage.

Rust 2,429 150 Updated Nov 9, 2025

WIP - Allows you to create DSPy pipelines using ComfyUI

Python 198 10 Updated Dec 1, 2024

Durable workflow automation in just a few lines of code

Go 1,028 34 Updated Nov 9, 2025

Query your PDF documents and get more insights from them

Python 5 Updated Apr 28, 2024

In-memory vector store with efficient read and write performance for semantic caching and retrieval system. Redis for Semantic Caching.

Rust 373 14 Updated Nov 29, 2024

LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progres…

Go 5,951 373 Updated Apr 28, 2025

SQLSync is a collaborative offline-first wrapper around SQLite. It is designed to synchronize web application state between users, devices, and the edge.

Rust 2,821 41 Updated Aug 2, 2025

Rust bindings for the C++ api of PyTorch.

Rust 5,117 406 Updated Nov 4, 2025

CodeXGLUE

C# 1,767 388 Updated Apr 23, 2024

Postgres-native columnar storage extension

C 3,002 93 Updated Feb 10, 2025

Rate limiting, caching, and request prioritization for modern workloads

Go 700 34 Updated May 18, 2025

RiteRaft - A raft framework, for regular people

Rust 333 23 Updated Feb 18, 2024

gamedev blog

3,315 147 Updated Mar 8, 2021

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Python 325,911 53,091 Updated Nov 3, 2025