This repo contains tasks from the GCP "Professional Data Engineer Certification Learning Path".
Link: https://partner.cloudskillsboost.google/paths/85
Topics/Folders:
- Dataflow: #8 "Serverless Data Processing with Dataflow: Develop Pipelines"
- 1_Basic_ETL_Python: Serverless Data Processing with Dataflow - Writing an ETL Pipeline using Apache Beam and Dataflow (Python)
- 2_Branching_Pipelines: Serverless Data Processing with Dataflow - Branching Pipelines (Python)
- 3_Batch_Analytics_Python: Serverless Data Processing with Dataflow - Batch Analytics Pipelines with Dataflow (windowing features w/ Python)
- 5_Streaming_Analytics_Python: Serverless Data Processing with Dataflow - Using Dataflow for Streaming Analytics (Python)
- 7_Advanced_Streaming_Analytics: Serverless Data Processing with Dataflow - Advanced Streaming Analytics Pipeline with Dataflow (Python)
- Arch_Java_Code: archived Java code
- VertexAI: tasks on GCP Vertex AI
Note: this repo is only for preparing for the "Professional Data Engineer" certification exam.