University of Wisconsin-Madison
Starred repositories
ComputeEval: a framework for generating and evaluating CUDA code from Large Language Models.
A Triton implementation of FlashAttention-2 that adds custom masks.
Code for papers on Large Language Model personalization (LaMP).
This repository contains the dataset and the PyTorch implementations of the models from the paper "Recognizing Emotion Cause in Conversations".
SGLang is a fast serving framework for large language models and vision language models.
Environments for LLM Reinforcement Learning
Cosmos-Reason1 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
Curated list of datasets and tools for post-training.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
The ultimate RAG for your monorepo. Query, understand, and edit multi-language codebases with the power of AI and knowledge graphs.
TextGrad: Automatic "Differentiation" via Text — using large language models to backpropagate textual gradients. Published in Nature.
Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch.
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
⚡ TabPFN: Foundation Model for Tabular Data ⚡
Lightweight coding agent that runs in your terminal
Web Crawling and RAG Capabilities for AI Agents and AI Coding Assistants
CodeRAG is an AI-powered tool for real-time codebase querying and augmentation using OpenAI and vector search.
This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline.
Build resilient language agents as graphs.
A multi-choice benchmark to evaluate LLM performance on PyChrono API usage
This is an official implementation of "DeformableTST: Transformer for Time Series Forecasting without Over-reliance on Patching" (NeurIPS 2024)
Multivariate Time Series Transformer, public version