ML/NLP/LLM Engineer with expertise in AI Systems Architecture, Machine Learning, and Deep Learning. Specialized in building scalable AI systems, developing classical ML/DL models, implementing traditional NLP solutions, integrating large language models into production environments, and managing the full development lifecycle from architecture to deployment.
Core competency lies in combining modern approaches (LLMs, multi-agent systems, RAG) with proven classical ML and DL methodologies to ensure system stability, predictability, and high performance.
- Development of RAG and GraphRAG systems
- Model fine-tuning (LoRA, QLoRA, PEFT) for domain-specific applications
- Inference optimization (vLLM, TensorRT, llama.cpp, Ollama)
- Advanced prompt engineering (Zero-shot, Few-shot, CoT, ReAct, Planning)
- Multi-agent system architecture (LangGraph, AutoGen, LangChain, planning agents)
- Agent integration with APIs and external services
- Dynamic tool selection systems
- Regression models (Linear, Ridge, Lasso) and classification algorithms (Logistic Regression, SVM, Decision Trees, Random Forest)
- Ensemble methods (Gradient Boosting, XGBoost, LightGBM, CatBoost)
- Clustering techniques (K-Means, DBSCAN, Hierarchical Clustering)
- Feature engineering, hyperparameter tuning, model validation
- Neural network development and training with PyTorch (MLP, CNN, RNN, LSTM, GRU)
- Transfer learning and fine-tuning of pre-trained models (ResNet, EfficientNet, BERT)
- Architecture optimization, regularization, and learning-rate scheduler implementation
- Large-scale dataset handling and GPU-accelerated training
- Text preprocessing: tokenization, stemming, lemmatization, stop-word removal
- Text vectorization (Bag-of-Words, TF-IDF, Word2Vec, FastText, GloVe)
- Text classification, sentiment analysis, topic modeling (LDA)
- Chatbot and dialogue system development using traditional NLP methods
- Integration of NLTK, spaCy, gensim into ML projects
- REST API development with FastAPI
- Data storage and caching with PostgreSQL and Redis
- API optimization for high-load environments
- Containerization (Docker, Docker Compose)
- CI/CD pipelines (GitHub Actions, GitLab CI)
- Model monitoring, logging, and management (MLflow, LangSmith)
- Implementation and optimization of vector search (ChromaDB, Pinecone, Weaviate, FAISS)
- Hybrid search system development (combining keyword and vector retrieval)
- Implemented enterprise RAG system with corporate process integration and hybrid search support
- Developed multi-agent platform using LangGraph for educational process automation
- Built GraphRAG knowledge system using Neo4j and LLMs for semantic search
- Developed and deployed classical ML models for price prediction, data classification, and risk assessment
- Trained and optimized CNN and LSTM architectures for image analysis and sequence processing tasks
- Mentored junior engineers, established development standards, conducted code reviews
- Successfully transitioned multiple AI products from prototype to stable production deployment
Tanym (Astana) | NLP/LLM Engineer
December 2024 — Present
- Lead developer of NLP/LLM modules for an AI assistant platform
- Multi-agent system development and LLM integration into educational workflows
- RAG pipeline implementation, API development, and service containerization
- Inference optimization and generation quality enhancement
Programming Languages: Python, C++
ML/DL Frameworks: PyTorch, scikit-learn, XGBoost, LightGBM, CatBoost, NumPy, pandas
LLM Tools: LangChain, LangGraph, AutoGen, vLLM, Hugging Face, OpenAI API
NLP Libraries & Techniques: NLTK, spaCy, gensim, Word2Vec, FastText, TF-IDF
Databases & Search: PostgreSQL, Redis, ChromaDB, Pinecone, Weaviate, FAISS, pgvector
MLOps: Docker, Docker Compose, GitHub Actions, MLflow, LangSmith, ClearML
Inference Optimization: vLLM, TensorRT, llama.cpp, Ollama