Machine Learning Engineer | NLP & Generative AI | Drug Discovery | MLOps
I'm a machine learning engineer with 8+ years of experience in healthcare and drug discovery. My work blends NLP, generative models, LLMs, and MLOps to build production-grade AI systems. I've published in peer-reviewed journals, deployed AI tools at scale, and worked with cross-functional teams in fast-paced startups and global R&D orgs.
- 🧠 LLMs & NLP: Prompt-tuning, retrieval-augmented generation, Transformers, semantic search, QA agents
- 🧬 Drug Discovery: Molecular generation, retrosynthesis, protein–ligand modeling
- ⚙️ MLOps & Deployment: AzureML, AWS SageMaker, DVC, Docker, CI/CD pipelines
- 🧾 Data Engineering: SQL/NoSQL, Spark, data pipelines for clinical, biomedical & text data
- 🔬 AI Research: 1st-author paper in Journal of Chemical Information and Modeling, others in ICML CompBio & Springer
Languages: Python, SQL, Bash
Frameworks: PyTorch, Transformers, XGBoost, scikit-learn, FastAPI
MLOps: DVC, MLflow, Docker, Poetry, AzureML, AWS, Kubernetes (basic)
Tools: LangChain, Hugging Face, Neo4j, Spark, Databricks, FAISS
Databases: PostgreSQL, MongoDB, S3, BigQuery
Infra as Code: Terraform (basic)
- 💊 Chem42 Molecule Generator: Built a generative model pipeline (GNN + validity filtering) for real-world synthesizable molecules.
- 🎯 LLM-powered Retrosynthesis: Cleaned and generated 10M+ reaction entries using OpenAI + Hugging Face. Boosted top-5 retrosynthesis accuracy by 7%.
- 🔎 Semantic Healthcare Knowledge Graph: Used PyTorch, Spark, and biomedical ontologies to create a 2M+ node graph. Enabled concept-level document retrieval.
- Model for Predicting Protein–Ligand Unbinding Kinetics through Machine Learning. Journal of Chemical Information and Modeling, ACS, 2020. Introduced a static-structure-based approach to predict log(k_off) using RF + structural descriptors.
- High Performance of Gradient Boosting in Binding Affinity Prediction. ICML, CompBio Workshop, 2022. Benchmarked SOTA GNNs vs. GBDTs for protein–ligand binding affinity; showed GBDTs outperform with graph-derived features.
- Others in RSC AI in Chemistry, Springer Lecture Notes
- 📧 Email: [email protected]
- 🌍 Based in Kazakhstan | Open to remote & relocation
“I like building real things with real impact.”