A curated list of awesome warez and piracy links

HTML 25,883 2,239 Updated Jun 18, 2023

Fluss with Iceberg integration

Dockerfile 5 1 Updated Oct 14, 2025

Manage your database schema as code

Go 7,635 324 Updated Nov 11, 2025

21 Lessons, Get Started Building with Generative AI

Jupyter Notebook 102,470 54,537 Updated Nov 24, 2025

Learn Python using your Java Knowledge

Python 60 79 Updated Dec 9, 2019

Converts a JSON schema to a Spark schema (struct) representation

Scala 12 5 Updated Mar 18, 2025

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

Java 3,093 1,222 Updated Nov 29, 2025

A Model Context Protocol (MCP) server for discovering data products and requesting access in Data Mesh Manager, and executing queries on the data platform to access business data.

Python 43 3 Updated Oct 22, 2025

Interactive CLI for analyzing Kafka health and configuration according to best practices and industry standards.

JavaScript 81 7 Updated Nov 27, 2025

Testing framework for Databricks notebooks

Python 312 45 Updated Apr 20, 2024

Python Testing for Databricks

Python 101 10 Updated Oct 17, 2025

An example showing how to apply software engineering best practices to Databricks notebooks.

Python 146 73 Updated Jul 24, 2024

The Metadata Platform for your Data and AI Stack

Java 11,259 3,291 Updated Nov 29, 2025

POC of a Spring Boot - DataHub integration reporting its data lineage.

Java 3 1 Updated Jun 13, 2025

Serialization format for row-based incremental data processing

Rust 141 12 Updated Jul 15, 2025

⚡ Fastest SQL ETL pipeline in a single C++ binary, built for stream processing, observability, analytics and AI/ML

C++ 2,092 96 Updated Nov 28, 2025

Modern observability platform: 10x easier, 140x lower storage cost, petabyte scale. Open-source alternative to Elasticsearch/Splunk/Datadog for logs, metrics, traces, RUM, and more.

Rust 17,376 702 Updated Nov 29, 2025

⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

Python 2,242 248 Updated Nov 28, 2025

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies. uForwarder aims to address several pain points while using Apache Kafka for pub-sub message queue…

Java 94 15 Updated Oct 7, 2025

Used to generate mock Avro data

Java 7 2 Updated May 17, 2024

Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme

Python 12,441 1,351 Updated Nov 25, 2025

Docker container with a data volume from S3.

Shell 277 68 Updated Mar 28, 2024

Python 78 42 Updated Jul 9, 2025

A Kubernetes controller to watch changes in ConfigMap and Secrets and do rolling upgrades on Pods with their associated Deployment, StatefulSet, DaemonSet and DeploymentConfig – [✩Star] if you're u…

Go 9,382 618 Updated Nov 24, 2025

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Scala 3,545 573 Updated Nov 4, 2025

Enforce Data Contracts

Python 741 183 Updated Nov 28, 2025

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 43,899 3,018 Updated Nov 27, 2025

A library that provides an in-memory Kafka instance to run your tests against.

Scala 410 46 Updated Nov 26, 2025

Secure and fast microVMs for serverless computing.

Rust 31,206 2,152 Updated Nov 28, 2025

Open-source search and retrieval database for AI applications.

Rust 24,644 1,941 Updated Nov 28, 2025