Starred repositories

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,446 957 Updated Oct 24, 2025

Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.

Shell 365 149 Updated Nov 7, 2025

Exa MCP for web search and web crawling!

TypeScript 3,204 239 Updated Nov 7, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 1,978 220 Updated Nov 5, 2025

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Shell 4,746 1,292 Updated Nov 8, 2025

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication

Go 606 115 Updated Nov 4, 2025

Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling

Go 86 19 Updated Nov 7, 2025

DoEKS is a tool to build, deploy and scale Data Platforms on Amazon EKS

HCL 813 279 Updated Nov 8, 2025

Lightweight coding agent that runs in your terminal

Rust 50,038 6,191 Updated Nov 8, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,685 441 Updated Nov 4, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 78,189 11,552 Updated Nov 6, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,866 738 Updated Oct 15, 2025

NVIDIA Inference Xfer Library (NIXL)

C++ 704 181 Updated Nov 8, 2025

Perplexity GPU Kernels

C++ 527 68 Updated Nov 7, 2025

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Python 3,610 546 Updated Jul 18, 2025

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 681 98 Updated Oct 22, 2025

A neurosymbolic perspective on LLMs

Python 1,629 80 Updated Nov 6, 2025

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 1,994 226 Updated Nov 6, 2025

A best practices guide for day 2 operations, including operational excellence, security, reliability, performance efficiency, and cost optimization.

Python 2,137 546 Updated Oct 29, 2025

Flexible and powerful framework for managing multiple AI agents and handling complex conversations

Python 7,042 643 Updated Oct 21, 2025

Gateway API Inference Extension

Go 514 190 Updated Nov 8, 2025

A universal scalable machine learning model deployment solution

Java 240 82 Updated Nov 8, 2025

Cost-efficient and pluggable Infrastructure components for GenAI inference

Go 4,358 480 Updated Nov 8, 2025

Amazon Elastic Container Service Agent

Go 2,130 640 Updated Nov 6, 2025

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).

Shell 5,321 329 Updated Mar 25, 2025

This repository contains examples for customers to get started using the Amazon Bedrock Service. This contains examples for all available foundational models.

Jupyter Notebook 1,271 584 Updated Oct 8, 2025

This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive A…

Jupyter Notebook 17,599 2,884 Updated Oct 30, 2025

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Python 4,673 319 Updated Aug 19, 2025

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 154,874 49,544 Updated Nov 8, 2025

High-performance In-browser LLM Inference Engine

TypeScript 16,766 1,131 Updated Nov 2, 2025