- Intel
- Taipei, Taiwan
- https://twitter.com/garywang
Stars
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
A command-line interface tool for serving LLMs using vLLM.
Intel® AI Assistant Builder
A Datacenter Scale Distributed Inference Serving Framework
✔ (Completed) The most comprehensive deep learning notes, covering Tudui's PyTorch tutorials, Mu Li's Dive into Deep Learning, Andrew Ng's Deep Learning, and Dafei's LLM Agents.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Token level visualization tools for large language models
OpenAI Triton backend for Intel® GPUs
Production-ready platform for agentic workflow development.
Basic installation and usage of Gemma 3 via Ollama in Colab.
An open-source AI agent that brings the power of Gemini directly into your terminal.
Model Context Protocol Servers
Ollama with Intel (i)GPU acceleration in Docker, with benchmarks.
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Happy experimenting with MLLM and LLM models!
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Bjorn Services: an AI microservices suite; LLaVA and BridgeTower components.
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discr…
AIInfra (AI Infrastructure) covers the AI systems stack, from low-level hardware such as chips up to the software stack that supports training and inference of large AI models.
A collection of AIGC, CV, and LLM interview questions and answers, along with new ideas, problems, resources, and projects encountered in work and research.
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Knowledge Base QA using RAG pipeline on Intel GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with BigDL-LLM
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡