Stars
Quickly generate HTML documentation from a JSON schema
Lightning fast hash functions using hand-tuned WebAssembly binaries
Grafana Databricks integration allowing direct connection to Databricks to query and visualize Databricks data in Grafana.
Create an external brain for your facts, ask questions about it later
Tool to fix _spark_metadata from Structured Streaming queries
Implements support for resource storage against multiple popular providers via apache-libcloud (S3, Azure Storage, etc...)
Setup for running Trino with Hive Metastore on Kubernetes
😎 A curated list of awesome MLOps tools
NMEA2000 CLI parser for Actisense-Serial NGT-1 Gateway and saving data to InfluxDB via UDP and publishing it with MQTT
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
A mini framework for writing Tableau Web Data Connectors.
A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems", which is `dmls-book`
A smart table display panel with fixed, updating rows (or columns)
General purpose framework to run CMS experiment workflows on HDFS/Spark platform
Data and methodology for the Big Mac index
A plugin for Apache Airflow that exposes rest end points for the Command Line Interfaces
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.