💭
Open to MLE roles! Reach out!

hoyathalis/README.md

Hi, I'm Hoyath Ali

About Me

I’m a Machine Learning Engineer focused on building scalable ML systems for on-device and edge inference across diverse hardware accelerators.
At Axiado Corporation, I drive initiatives in large-scale training, performance optimization, and ML system design, so that models move efficiently from research to deployment.


Education

  • M.S. in Computer Science – University of California, Riverside (GPA: 3.93)
  • B.S. in Computer Science & Engineering – Lovely Professional University, India

Writing & Publications

  • Medium – Technical deep dives on ML systems, distributed training, and performance engineering.

📫 Connect with Me


Pinned

  1. distributed_playground Public

    Minimal PyTorch playground for benchmarking tensor parallelism. Compares row-wise vs. column-wise splits with NCCL profiling and TensorBoard analysis.

    Python

  2. packet-normalization-cuda Public

    Optimized CUDA kernel for network packet normalization: 5× faster than PyTorch for real-time ML preprocessing in intrusion detection systems.

    Python

  3. MultiGPUMatMul Public

    Forked from hoyathali/MultiGPUMatMul

    This project leverages multiple Graphics Processing Units (GPUs) to accelerate matrix multiplication across distributed networks. By dividing large matrices into smaller sections (Bands) and distri…

    Cuda

  4. bedtime.ai Public

    This project creates a personalized bedtime story system that leverages AI advancements to enhance children's bedtime experiences. By using voice cloning technology, stories are narrated in a paren…

    Jupyter Notebook 2

  5. stablediff_anime Public

    Stable Diffusion model implemented from scratch and trained on a 64x64 anime face dataset. Features a complete training pipeline, custom noise scheduling, and inference scripts for generating anime-sty…

    Python

  6. GameOfLife3D Public

    Forked from hoyathali/GameOfLife3D

    This implementation provides an interactive visualization of Conway's Game of Life in 3D space using Python, Numba, and Mayavi, running on the GPU.

    Python