🌷
Focusing


Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

Jupyter Notebook 36,200 7,381 Updated Jan 13, 2026

A topic-centric list of HQ open datasets.

72,080 11,090 Updated Jan 14, 2026

Apache Flink

Java 25,708 13,825 Updated Jan 15, 2026

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.

Go 3,089 1,454 Updated Jan 12, 2026

Collection of publicly available IPTV channels from all over the world

TypeScript 109,637 5,349 Updated Jan 15, 2026

Upserts, Deletes And Incremental Processing on Big Data.

Java 6,066 2,459 Updated Jan 15, 2026
Python 2 Updated Jan 14, 2026

A repository of links with advice related to grad school applications, research, PhD, etc.

2,406 220 Updated Nov 12, 2023

Apache Flink Playgrounds

Java 547 256 Updated Feb 13, 2025

Code at the speed of thought – Zed is a high-performance, multiplayer code editor from the creators of Atom and Tree-sitter.

Rust 73,327 6,615 Updated Jan 15, 2026

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 40,777 7,112 Updated Jan 15, 2026

Apache Spark docker image

Shell 2,061 702 Updated Apr 21, 2023

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Scala 1,371 783 Updated Jan 28, 2025

Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White

Makefile 3,509 2,549 Updated Mar 17, 2020

A curated list of references for MLOps

13,534 1,999 Updated Nov 21, 2024

A collection of challenge based hack-a-thons including student guide, coach guide, lecture presentations, sample/instructional code and templates. Please visit the What The Hack website at: https:/…

C# 1,863 884 Updated Jan 9, 2026

MLOps examples

Jupyter Notebook 2,065 586 Updated Aug 2, 2024

Roadmap to becoming a data engineer in 2021

12,728 1,360 Updated Jan 25, 2022

Dockerfile for Apache Kafka

Shell 6,982 2,699 Updated May 8, 2024

Docker image for Apache JMeter

Shell 284 312 Updated Sep 24, 2024

A curated list of awesome Apache Spark packages and resources.

Shell 1,855 345 Updated Oct 24, 2024

A beginner-friendly yet powerful Python toolkit for financial analysis and automation — built to make modern investing accessible to everyone

Python 1,090 233 Updated Jan 6, 2026

A list of useful resources to learn Data Engineering from scratch

3,934 565 Updated Jun 19, 2024

Apache Iceberg

Java 8,430 2,964 Updated Jan 14, 2026

Open Source Computer Vision Library

C++ 85,717 56,481 Updated Jan 14, 2026

Apache Superset is a Data Visualization and Data Exploration Platform

TypeScript 70,068 16,516 Updated Jan 15, 2026

Learn how to design, develop, deploy and iterate on production-grade ML applications.

Jupyter Notebook 45,770 7,174 Updated Aug 18, 2024

Spark: The Definitive Guide's Code Repository

Scala 3,086 2,885 Updated Aug 26, 2020

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

Python 27,846 4,911 Updated Aug 18, 2024