ShimBoi/README.md

👋 Hi, I'm Jay

ML Engineer | Building AI from First Principles

LinkedIn Medium Email


🔬 Currently Working On

Reconstructing the Transformer from Scratch

  • Rebuilt the original "Attention Is All You Need" architecture from paper to production
  • Achieved 15.34 BLEU on WMT'14 DE→EN translation
  • Implemented: multi-head attention, sinusoidal embeddings, gradient accumulation, custom LR scheduling
  • 📝 Read the deep dive on Medium

🎯 Featured Projects

Transformer from Scratch

Full reimplementation of "Attention Is All You Need"

  • Encoder-decoder architecture with 65M parameters
  • Multi-head attention mechanisms
  • Custom training pipeline with gradient accumulation
  • View Project →
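
As a rough illustration of the gradient accumulation mentioned above, the sketch below shows the core idea on a toy one-parameter regression: averaging gradients over equal-sized micro-batches reproduces the full-batch gradient, so a large effective batch can be processed in smaller pieces. This is a hypothetical plain-Python example, not the project's training pipeline. The helper names and the toy model `y = w * x` are assumptions made for illustration.

```python
def grad_mse(w: float, xs: list[float], ys: list[float]) -> float:
    """Gradient of the mean squared error of y = w * x with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(
    w: float, xs: list[float], ys: list[float], micro_batch_size: int
) -> float:
    """Accumulate micro-batch gradients into one full-batch gradient.

    Assumes len(xs) is an exact multiple of micro_batch_size, so every
    micro-batch carries equal weight.
    """
    n_micro = len(xs) // micro_batch_size
    total = 0.0
    for k in range(n_micro):
        lo = k * micro_batch_size
        hi = lo + micro_batch_size
        # Scale each micro-batch gradient so the sum is an average.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / n_micro
    return total


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]
    print(grad_mse(0.5, xs, ys), accumulated_grad(0.5, xs, ys, micro_batch_size=2))
```

In a real training loop the optimizer step would run once after the last micro-batch, using the accumulated value.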

MJX-purejaxrl

Integration of MJX into purejaxrl

  • Used Madrona GPU Rendering for ultra-fast environment rendering
  • Trained CNN policy using PPO for MJX cube pick task
  • Easy integration with existing MLP PPO implementation
  • View Project →

📈 GitHub Stats

Top Languages


πŸ“ Recent Writing


🎓 Education & Background

  • B.S. in Computer Science Honors (Turing Scholar) - The University of Texas at Austin
  • Focus: Understanding ML at a fundamental level through implementation

💬 Let's Connect

If you're interested in my work, feel free to reach out!

Pinned

  1. AttentionIsAllYouNeed (Public)

    Recreating the original encoder-decoder Transformer architecture from scratch

    Python

  2. MJX-purejaxrl (Public)

    Forked from luchris429/purejaxrl

    Really Fast End-to-End Jax RL Implementations

    Python

  3. RLBook2020 (Public)

    My answers to the exercises, plus code for experiments

    Python

  4. zero-shot-2 (Public)

    Jupyter Notebook