Senior AI Research Engineer | Foundation Models & Scalable Systems | Google Developer Expert (AI)
I am a Senior AI Research Engineer and Google Developer Expert (AI) with a mission to architect and scale the powerful, efficient, and accessible large language models that will define the future.
My expertise covers the full lifecycle of foundation models: from curating massive datasets and architecting cutting-edge training infrastructure to developing production-grade models that set new performance benchmarks. I thrive on solving complex, large-scale challenges and am deeply invested in strengthening the open-source ecosystem that fuels global AI innovation.
- Foundation Model Development: Co-led the end-to-end pre-training of Kakao's Kanana V1 foundation model on a 3T-token dataset and implemented compute-efficient scaling techniques such as pruning and distillation (see the distillation sketch after this list). I also spearheaded key enhancements for Kanana-1.5 (including its 128K long-context extension) and owned the full development of a production embedding model that surpassed larger competitors.
- Scalable AI Infrastructure: Architected and optimized a scalable LLM training pipeline from the ground up using JAX, MaxText, and TPUs (see the training-step sketch after this list). This work was featured in an official Google Cloud blog post, and I presented it at Google Cloud Next 2025 (YouTube).
- Open Source Leadership: As a Research Lead at EleutherAI, I co-led the development of Polyglot-Ko, the first open-source Korean large language model suite, successfully training and releasing models up to 12.8B parameters.
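
For a concrete picture of the distillation side of that work, here is a minimal sketch of a logit-distillation loss in JAX. It is a generic formulation, not the Kanana training objective; the temperature value and the assumption that teacher and student share a vocabulary are mine.

```python
import jax
import jax.numpy as jnp

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened token distributions.

    Generic sketch, not the Kanana objective; temperature is a placeholder.
    """
    t = temperature
    teacher_logp = jax.nn.log_softmax(teacher_logits / t, axis=-1)
    student_logp = jax.nn.log_softmax(student_logits / t, axis=-1)
    kl = jnp.sum(jnp.exp(teacher_logp) * (teacher_logp - student_logp), axis=-1)
    # The t**2 rescaling keeps gradient magnitudes comparable across temperatures.
    return (t ** 2) * jnp.mean(kl)
```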
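And here is a minimal sketch of the jit-compiled training step such a JAX pipeline is built around. It is illustrative only, not MaxText code: the real pipeline adds sharding rules, checkpointing, and data loading, and the `optax` optimizer and toy linear `loss_fn` below are my assumptions.

```python
import jax
import jax.numpy as jnp
import optax  # assumed optimizer library, standard in the JAX ecosystem

optimizer = optax.adamw(learning_rate=3e-4)

def loss_fn(params, batch):
    # Toy stand-in for a model forward pass plus cross-entropy loss.
    logits = batch["inputs"] @ params["w"]
    return optax.softmax_cross_entropy_with_integer_labels(
        logits, batch["labels"]
    ).mean()

@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

# Usage: toy parameters and one step.
params = {"w": jnp.zeros((16, 8))}
opt_state = optimizer.init(params)
batch = {"inputs": jnp.ones((4, 16)), "labels": jnp.zeros(4, dtype=jnp.int32)}
params, opt_state, loss = train_step(params, opt_state, batch)
```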
My GitHub activity reflects a consistent track record of contributing high-impact code to the core of the modern AI ecosystem. I focus on strengthening foundational libraries, building scalable systems, and advancing rigorous evaluation. Below are some of my key contributions:
- Hugging Face Transformers: Led the end-to-end integration of the DeepSeek-V3 model, a complex effort spanning its core architecture, its distinctive RoPE scaling logic (sketched in generic form after this list), and follow-on capabilities like token classification heads.
- Google's MaxText: Architected and implemented the multi-source data blending feature (the core idea is sketched after this list), significantly enhancing the data pipeline for Google's flagship JAX-based large-scale training framework.
- LM-Eval Harness: Expanded the community's model evaluation capabilities by adding and integrating major benchmarks, including `global_mmlu`, `hrm8k`, and `humaneval+` (a usage example follows this list).
- Broader Ecosystem: My contributions also span high-performance distributed training concepts like Tensor Parallelism in EleutherAI's OSLO (sketched in JAX terms below), enabling distributed training for generative vision models, and implementing new model support in LLM2Vec.
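
On the RoPE scaling mentioned above, here is a hedged sketch of position-interpolation-style context extension. It shows the generic idea, not DeepSeek-V3's exact variant; the scaling factor and dimensions are placeholders.

```python
import numpy as np

def scaled_rope_angles(positions, head_dim, base=10000.0, factor=4.0):
    # Standard RoPE inverse frequencies, one per pair of channels.
    inv_freq = 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)
    # Position interpolation: compress positions by `factor` so a context
    # `factor` times longer stays inside the angle range seen in pre-training.
    return np.outer(positions / factor, inv_freq)

angles = scaled_rope_angles(np.arange(8192), head_dim=128)
cos, sin = np.cos(angles), np.sin(angles)  # applied to query/key channel pairs
```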
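The data-blending idea in miniature: draw each next example from one of several dataset iterators according to fixed mixture weights. This mirrors the concept behind the MaxText feature, not its actual API; every name below is illustrative.

```python
import random
from typing import Iterator, Sequence

def blend(sources: Sequence[Iterator], weights: Sequence[float], seed: int = 0):
    """Yield examples from `sources`, picking a source per step by weight."""
    rng = random.Random(seed)
    while True:
        (source,) = rng.choices(sources, weights=weights, k=1)
        try:
            yield next(source)
        except StopIteration:
            # A real pipeline would drop the exhausted source and renormalize
            # the weights; this sketch simply stops the blended stream.
            return

mixed = blend([iter(["a1", "a2"]), iter(["b1", "b2"])], weights=[0.7, 0.3])
print(next(mixed))
```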
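A usage example for the evaluation work, assuming a recent lm-evaluation-harness release that exports `simple_evaluate`; the placeholder model and the exact registered task names should be checked against the installed task registry before running.

```python
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",                    # Hugging Face backend
    model_args="pretrained=gpt2",  # placeholder model for illustration
    tasks=["hrm8k"],               # verify the registered task/group name
)
print(results["results"])
```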
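Finally, the tensor-parallelism idea from the OSLO work, expressed here in JAX sharding terms rather than OSLO's PyTorch internals: shard a weight matrix column-wise across devices so each device computes its own slice of the output.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# 1-D device mesh whose single axis serves as the tensor-parallel dimension.
mesh = Mesh(np.array(jax.devices()), axis_names=("tp",))

# Replicate activations; shard the weight's output columns across devices.
x = jax.device_put(jnp.ones((8, 512)), NamedSharding(mesh, P()))
w = jax.device_put(jnp.ones((512, 2048)), NamedSharding(mesh, P(None, "tp")))

@jax.jit
def column_parallel_matmul(x, w):
    # Each device multiplies the full input by its column slice of `w`,
    # so the output remains sharded along "tp" with no communication here.
    return x @ w

y = column_parallel_matmul(x, w)
```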
- Kanana: Compute-efficient Bilingual Language Models. [arXiv:2502.18934]
- A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models. [arXiv:2306.02254]
- Knowledge Distillation for BERT Unsupervised Domain Adaptation. [arXiv:2010.11478]