A dedicated Data Engineer with a strong background in Software Architecture and Microservices, focused on building scalable, reliable, and high-performance data systems.
I specialize in transforming complex datasets into actionable insights using modern open-source technologies such as ClickHouse, Polars, DuckDB, and Kafka, all while ensuring data quality, observability, and performance.
Over 5 years of hands-on experience designing and implementing end-to-end data platforms, from ingestion and orchestration to serving APIs.
Currently contributing to market-surveillance and real-time analytics systems at Iran FaraBourse (IFB) and leading data architecture design at a BioTech startup in Sweden.
- Data ingestion & transformation (ETL/ELT) using Prefect, Airflow, and MageAI (sketched after this list)
- Real-time & streaming pipelines with Kafka, Redpanda, and ClickHouse
- Data lakehouse architecture with Iceberg, DuckLake, and MinIO (S3-compatible object storage)
- Advanced analytics using Polars, DuckDB, and Apache Arrow
- FastAPI, gRPC, AsyncIO, Celery for high-performance APIs
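
A minimal sketch of the ingestion-and-transformation pattern above, assuming Prefect 2.x and Polars; the file paths and the `price`/`qty` columns are hypothetical placeholders, not from a real pipeline:

```python
import polars as pl
from prefect import flow, task

@task(retries=3, retry_delay_seconds=10)
def extract(path: str) -> pl.DataFrame:
    # Eager read for simplicity; a larger pipeline might use pl.scan_csv for lazy execution.
    return pl.read_csv(path)

@task
def transform(df: pl.DataFrame) -> pl.DataFrame:
    # Drop invalid rows and derive a column; "price" and "qty" are example names.
    return df.filter(pl.col("price") > 0).with_columns(
        (pl.col("price") * pl.col("qty")).alias("notional")
    )

@task
def load(df: pl.DataFrame, out: str) -> None:
    df.write_parquet(out)  # Parquet keeps the output Arrow-friendly downstream

@flow
def etl(path: str = "trades.csv", out: str = "trades.parquet") -> None:
    load(transform(extract(path)), out)

if __name__ == "__main__":
    etl()
```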
- Designed an internal FaaS platform at IFB for dynamic, scalable gRPC endpoints
- Experience with distributed systems and event-driven design
- Built real-time data quality monitoring with Prometheus, Grafana, and OpenTelemetry (metric sketch after this list)
- Implemented unified logging and metrics systems across multiple data services
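
As a hedged illustration of what such pipeline-level quality metrics can look like with `prometheus_client` (metric names and the validation rule are assumptions, not the production ones):

```python
import random
import time
from prometheus_client import Counter, Gauge, start_http_server

rows_processed = Counter("pipeline_rows_processed_total", "Rows seen by the pipeline")
rows_rejected = Counter("pipeline_rows_rejected_total", "Rows failing validation")
freshness_seconds = Gauge("pipeline_data_freshness_seconds", "Age of the newest record")

def process_batch(batch: list[dict]) -> None:
    for row in batch:
        rows_processed.inc()
        if row.get("price") is None:  # hypothetical validation rule
            rows_rejected.inc()
    freshness_seconds.set(0)  # batch just landed, so freshness resets

if __name__ == "__main__":
    start_http_server(8000)  # exposes /metrics for Prometheus to scrape
    while True:
        process_batch([{"price": random.choice([1.0, None])}])
        time.sleep(5)
```

Grafana then plots rejection rate and freshness straight from these series.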
- ClickHouse, PostgreSQL, MongoDB, Neo4j, Qdrant, Elasticsearch
- Schema design, indexing strategies, and query optimization for OLAP/OLTP workloads
- Integration of LangChain and OpenAI for document embedding and semantic search
- Built vector search pipelines using Qdrant (768-dimensional cosine embeddings; sketched below)
- Designed microservices for LLM-powered analytics and biomedical knowledge graphs
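
A minimal sketch of the Qdrant setup described above (768-dimensional vectors, cosine distance), with random stand-in vectors in place of a real embedding model:

```python
import random
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(":memory:")  # local in-memory mode, handy for demos/tests

client.recreate_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

def fake_embed(text: str) -> list[float]:
    # Stand-in for a real 768D embedding model (e.g. a sentence transformer).
    rng = random.Random(text)
    return [rng.random() for _ in range(768)]

client.upsert(
    collection_name="docs",
    points=[PointStruct(id=1, vector=fake_embed("aspirin inhibits COX-1"),
                        payload={"text": "aspirin inhibits COX-1"})],
)

for hit in client.search(collection_name="docs",
                         query_vector=fake_embed("COX inhibitors"), limit=3):
    print(hit.score, hit.payload["text"])
```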
Jan 2025 – Present
- Designed full data & analytics workflows using DuckDB, Polars & Apache Arrow
- Modeled a biomedical knowledge graph in Neo4j and implemented vector search with Qdrant (Neo4j sketch below)
- Developed a gRPC microservice exposing vector retrieval and search APIs
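
A hedged sketch of the Neo4j side of that work; the bolt URI, credentials, and the `(Gene)-[:ASSOCIATED_WITH]->(Disease)` schema are illustrative assumptions, not the production model:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Upsert a gene, a disease, and the edge between them.
    session.run(
        "MERGE (g:Gene {symbol: $gene}) "
        "MERGE (d:Disease {name: $disease}) "
        "MERGE (g)-[:ASSOCIATED_WITH]->(d)",
        gene="BRCA1", disease="breast cancer",
    )
    # Read the association back.
    result = session.run(
        "MATCH (g:Gene)-[:ASSOCIATED_WITH]->(d:Disease) "
        "RETURN g.symbol AS gene, d.name AS disease"
    )
    for record in result:
        print(record["gene"], "->", record["disease"])

driver.close()
```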
Oct 2024 – Present
- Built an in-house FaaS platform with dynamic gRPC endpoints serving 5+ internal systems
- Refactored 10+ Prefect pipelines, improving reliability from 85% to 99.9%
- Migrated analytical schemas to ClickHouse, achieving 40% faster queries (schema sketch below)
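
A sketch of the kind of MergeTree schema such a migration targets, via `clickhouse-connect`; the table and columns are hypothetical:

```python
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost")

# Time-series table partitioned by day and sorted for per-symbol range scans.
client.command("""
    CREATE TABLE IF NOT EXISTS trades (
        ts     DateTime64(3),
        symbol LowCardinality(String),
        price  Float64,
        qty    UInt32
    )
    ENGINE = MergeTree
    PARTITION BY toDate(ts)
    ORDER BY (symbol, ts)
""")

# A typical analytical query that benefits from the (symbol, ts) sort key.
result = client.query(
    "SELECT symbol, avg(price) AS avg_price FROM trades "
    "WHERE ts >= now() - INTERVAL 1 DAY GROUP BY symbol"
)
print(result.result_rows)
```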
Sep 2023 – Dec 2024
- Built ETL/data pipelines with Apache Airflow & MageAI (DAG sketch after this list)
- Deployed a PostgreSQL-based warehouse, cutting analytics costs by 30%
- Developed data quality monitoring via Prometheus + Grafana
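
A minimal Airflow 2.x TaskFlow sketch of that ETL pattern; the schedule and the toy transformation are placeholders:

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_etl():
    @task
    def extract() -> list[dict]:
        return [{"id": 1, "value": 10}]  # stand-in for a source query

    @task
    def transform(rows: list[dict]) -> list[dict]:
        return [{**r, "value": r["value"] * 2} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        print(f"loading {len(rows)} rows")  # stand-in for a warehouse write

    load(transform(extract()))

daily_etl()
```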
Sep 2023 – Jan 2024
- Refactored pipelines to reduce latency and improve I/O throughput
- Designed gRPC + SQS integrations for data flow between services (consumer sketch below)
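
A hedged sketch of the SQS consumption side of such an integration with `boto3`; the queue URL is a placeholder, and the gRPC forwarding call is stubbed out since the service protos aren't shown here:

```python
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/example-queue"  # placeholder

def handle(payload: dict) -> None:
    # In the real integration this would be a generated gRPC stub call.
    print("forwarding", payload)

while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling cuts down empty receives
    )
    for msg in resp.get("Messages", []):
        handle(json.loads(msg["Body"]))
        # Delete only after successful handling, so failures are redelivered.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```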
May 2023 – Sep 2023
- Built ingestion pipelines reducing data lag by 2 hours
- Used Azure OpenAI to generate SWOT analyses from embedded documents (sketch below)
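
A minimal sketch of that generation step with the `openai` SDK's Azure client; the endpoint, deployment name, and API version are placeholder assumptions:

```python
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # placeholder version
)

# In practice this context comes from embedding-based retrieval over documents.
context = "...retrieved document excerpts..."

resp = client.chat.completions.create(
    model="gpt-4o",  # Azure deployment name, not necessarily the model id
    messages=[
        {"role": "system", "content": "You write concise SWOT analyses."},
        {"role": "user", "content": f"Produce a SWOT analysis from:\n{context}"},
    ],
)
print(resp.choices[0].message.content)
```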
Aug 2021 – May 2023
- Developed a trading assistant (Atlas) using Kafka & PostgreSQL (consumer sketch after this list)
- Reduced validation time by 40%, enabling low-latency trading insights
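
A hedged sketch of the Kafka consumption side of such an assistant using `kafka-python`; the topic, brokers, schema, and validation rule are illustrative, not Atlas's actual logic:

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "trades",  # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

def is_valid(event: dict) -> bool:
    # Cheap in-process checks; validating in the consumer avoids a slower
    # downstream validation step (hypothetical rule shown).
    return event.get("price", 0) > 0 and event.get("qty", 0) > 0

for message in consumer:
    event = message.value
    if is_valid(event):
        # In production: persist to PostgreSQL (e.g. via psycopg).
        print("persist", event)
```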
Languages: Python, SQL, Shell, Rust (Intermediate)
Frameworks: FastAPI, gRPC, AsyncIO, Celery
Data Tools: ClickHouse, Kafka, DuckDB, Polars, Prefect, Airflow, MageAI
Storage & Lakehouse: MinIO, Iceberg, DuckLake, Dremio, Nessie
Databases: PostgreSQL, MongoDB, Neo4j, Qdrant, Elasticsearch
Monitoring: Prometheus, Grafana, OpenTelemetry
DevOps: Docker, GitHub Actions, GitLab CI, Kubernetes
Visualization: Metabase, Superset, Grafana
- Dremio Verified Lakehouse Associate
- Data Engineering Essentials
- ETL and Data Pipelines with Shell, Airflow, and Kafka
- Relational Database Administration Essentials
- Python Project for Data Engineering
- LangChain for LLM Application Development
Bachelor's Degree in Software Engineering
Islamic Azad University (2014 – 2018) | GPA: 3.05/4.0
"Data without context is noise; my mission is to turn that noise into signal."