- LinkedIn
- Bay Area
- https://khosravipasha.github.io
- @pashakho
Stars
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Supporting PyTorch models with the Google AI Edge TFLite runtime.
Run any GGUF SLMs/LLMs locally, on-device on Android
Run SD1.x/2.x/3.x, SDXL, and FLUX.1 on your phone
On-device AI across mobile, embedded, and edge platforms for PyTorch
LiteRT, successor to TensorFlow Lite, is Google's on-device framework for high-performance ML & GenAI deployment on edge platforms, via efficient conversion, runtime, and optimization
Artificial Neural Engine Machine Learning Library
Cross-platform, customizable ML solutions for live and streaming media.
A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.
Universal LLM Deployment Engine with ML Compilation
An MLX port of FLUX and other state-of-the-art diffusion image models, based on the Hugging Face Diffusers implementation.
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal AI, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
🤗 smolagents: a barebones library for agents that think in code.
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
A debugging and profiling tool that can trace and visualize python code execution
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
SGLang is a fast serving framework for large language models and vision language models.
llama.cpp fork with additional SOTA quants and improved performance
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks