molspace/README.md

👋 Hi, I'm Nurlybek Temirbay

Machine Learning Engineer | NLP & Generative AI | Drug Discovery | MLOps

I'm a machine learning engineer with 8+ years of experience building AI systems across healthcare and drug discovery. My work combines NLP, generative models, LLMs, and MLOps, with a focus on production-grade deployment. I’ve published in peer-reviewed journals, deployed AI tools at scale, and worked with cross-functional teams in fast-paced startups and global R&D orgs.


🚀 What I Do

  • 🧠 LLMs & NLP: Prompt-tuning, retrieval-augmented generation, Transformers, semantic search, QA agents
  • 🧬 Drug Discovery: Molecular generation, retrosynthesis, protein–ligand modeling
  • ⚙️ MLOps & Deployment: AzureML, AWS SageMaker, DVC, Docker, CI/CD pipelines
  • 🧾 Data Engineering: SQL/NoSQL, Spark, data pipelines for clinical, biomedical & text data
  • 🔬 AI Research: 1st-author paper in Journal of Chemical Information and Modeling, others in ICML CompBio & Springer

🛠 Tech Stack

Languages: Python, SQL, Bash
Frameworks: PyTorch, Transformers, XGBoost, scikit-learn, FastAPI
MLOps: DVC, MLflow, Docker, Poetry, AzureML, AWS, Kubernetes (basic)
Tools: LangChain, Hugging Face, Neo4j, Spark, Databricks, FAISS
Databases: PostgreSQL, MongoDB, S3, BigQuery
Infra as Code: Terraform (basic)


📌 Featured Projects

  • 💊 Chem42 Molecule Generator
    Built a generative model pipeline (GNN + validity filtering) for real-world synthesizable molecules.

  • 🎯 LLM-powered Retrosynthesis
    Cleaned and generated 10M+ reaction entries using OpenAI + Hugging Face. Boosted top-5 retrosynthesis accuracy by 7%.

  • 🔎 Semantic Healthcare Knowledge Graph
    Used PyTorch, Spark, and biomedical ontologies to create a graph with 2M+ nodes. Enabled concept-level document retrieval.


📄 Publications


📫 Get in Touch


“I like building real things with real impact.”

Popular repositories

  1. FastMVS_experiments Public

    Python 1

  2. First-rep Public

    This is a first test rep. Nothing much to see here

  3. sberhack Public

    this repo contains our team’s solution for sber hackathon

    Python

  4. mlkoff Public

    Jupyter Notebook

  5. adversarial_project Public

  6. public Public

    Forked from datagrok-ai/public

    Public package repository for the Datagrok.ai platform

    JavaScript