Stars
Prevents you from committing secrets and credentials into git repositories
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
DoEKS is a tool to build, deploy and scale Data Platforms on Amazon EKS
An example CDK app demonstrating how CDK can be utilized to create Amazon DataZone Resources.
A high-throughput and memory-efficient inference and serving engine for LLMs
Curated coding interview preparation materials for busy software engineers
Databricks framework to validate Data Quality of PySpark DataFrames and Tables
📚 Tech blogs & talks by companies that run Kafka in production
📚 Tech blogs & talks by companies that run Apache Flink in production
Profiler for large-scale distributed Java applications (Spark, Scalding, MapReduce, Hive, ...) on YARN.
This is a repo with links to everything you'd ever want to learn about data engineering
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational m…
A simple threat modeling tool to help humans to reduce time-to-value when threat modeling
The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
This Guidance demonstrates how to streamline data access management through the integration of Amazon DataZone and Jira ticketing systems
QuickSight artifacts with CDK, CodePipeline, CLI scripts
This repository contains the infrastructure as code to wrap your AWS CDK project with CI/CD around it.
A multi-formalism, multi-solution model-checker centered on the language GAL
This repository contains the infrastructure as code to bootstrap your next CI/CD project. It is developed with security best practices in mind, provides a robust and automated deployment process th…
The OpenTF Manifesto expresses concern over HashiCorp's switch of the Terraform license from open-source to the Business Source License (BSL) and calls for the tool's return to a truly open-source …
Modeling project for a "choose your own adventure" gamebook, done as part of a first-year master's (M1) course at Sorbonne Université
MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.