- Amazon
- Seattle, WA
- https://www.saidarahas.com
Starred repositories
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.
Exa MCP for web search and web crawling!
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
DoEKS is a tool to build, deploy and scale Data Platforms on Amazon EKS
Lightweight coding agent that runs in your terminal
Democratizing Reinforcement Learning for LLMs
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs
Achieve state of the art inference performance with modern accelerators on Kubernetes
A best practices guide for day 2 operations, including operational excellence, security, reliability, performance efficiency, and cost optimization.
Flexible and powerful framework for managing multiple AI agents and handling complex conversations
Gateway API Inference Extension
A universal scalable machine learning model deployment solution
Cost-efficient and pluggable infrastructure components for GenAI inference
This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
This repository contains examples to help customers get started with the Amazon Bedrock service, covering all available foundation models.
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.
📚 A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
High-performance In-browser LLM Inference Engine