I work on applied and research-oriented machine learning, focusing on large language models, deep learning, and AI systems engineering. My interests include:
- Large-scale model fine-tuning and adaptation
- RAG systems and retrieval-based reasoning
- Deep learning for computer vision and natural language processing
- Efficient inference, quantization, and model optimization
- MLOps for reliable, reproducible machine learning
I enjoy building systems that bridge research and production, turning theoretical advances into deployable, high-impact applications.
- Large Language Models (LLMs): Scaling laws · Fine-tuning · Distillation · RAG · Multi-agent systems
- Deep Learning: Representation learning · CNNs · Transformers · Optimization
- Natural Language Processing: Semantics · Document understanding · Retrieval systems · Embeddings
- Computer Vision: Object detection · Lane detection · Visual reasoning
- AI Systems & MLOps: Distributed training · Experiment tracking · Model reliability
Adapted open-weight LLMs using QLoRA with 4-bit quantization. Implemented end-to-end fine-tuning pipelines, evaluation frameworks, and optimized inference stacks using FastAPI and LangChain. Integrated RAG for domain-specific knowledge reasoning.
Built an end-to-end retrieval system using vector databases and encoder models that converts technical documentation into structured, searchable representations. Achieved high-precision, context-aware responses using LangChain agents and custom retrieval logic.
Developed a multi-agent reasoning architecture for analysis, content generation, and decision support. Combined retrieval, planning, and LLM-based reasoning in a modular framework.
Implemented classical and deep-learning-based CV pipelines. Built YOLO-based object detection and lane detection systems optimized for real-time performance.
Built an end-to-end anomaly detection pipeline with experiment tracking (MLflow), data versioning (DVC), and automated training and deployment via CI/CD. Ensured reproducibility and reliability across the ML lifecycle.
PyTorch · TensorFlow · Transformers · Embedding Models · Optimization
QLoRA · RAG · LangChain · LangGraph · Prompt Engineering
Vector databases (FAISS, Chroma, Pinecone)
MLflow · DVC · Docker · GitHub Actions · Distributed systems
AWS · Azure · GCP
Python · FastAPI · Django REST · Microservices · GPU inference