Stars
Flexible and powerful framework for managing multiple AI agents and handling complex conversations
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
CC signals is a framework for a simple pact between those stewarding data, and those reusing it for AI development. CC signals provide a set of shared ground rules for an AI ecosystem that is mutua…
A PyTorch native platform for training generative AI models
Example projects and demos around data streaming, stream processing, change data capture, and more.
Self-contained worked examples of Apache Lucene features and functionality
A basic example of how to set up and use SQLMesh.
A book describing how to set up and maintain Data Engineering infrastructure using Google Cloud Platform.
Data Engineering on Google Cloud Platform
List of changes announced for AWS that may break existing code
Cloud native secrets management for developers - never leave your command line for secrets.
Import Letterboxd movie list (diary) into trakt.tv
Data Analysis Workflows & Reproducibility Learning Resources
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
The official Python SDK for the Foundry API
JupyterLab desktop application, based on Electron.
This repository is a production dbt pipeline example that models the profitability of an e-commerce business. Data is extracted and loaded to a BigQuery dwh by Airbyte. Data sources include Shopify,…
Source for Google Click to Deploy solutions listed on Google Cloud Marketplace.
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Code for "Efficient Data Processing in Spark" Course
Devon: An open-source pair programmer
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Scalable and efficient data transformation framework - backwards compatible with dbt.