View lhoestq's full-sized avatar

Organizations

@huggingface


The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,230 407 Updated Oct 27, 2025

Training Model Behavior in Agentic Systems

Python 651 45 Updated Nov 9, 2025

Apache DataFusion SQL Query Engine

Rust 7,999 1,748 Updated Nov 10, 2025

Apache DataFusion Python Bindings

Python 519 133 Updated Nov 8, 2025

Training LLMs to reason and analyze data with notebooks

Python 49 5 Updated Sep 10, 2025

Train LLM on Hugging Face infra

Python 66 9 Updated Sep 10, 2025

parquet file parser for javascript

JavaScript 703 31 Updated Nov 4, 2025

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

TypeScript 39,209 2,374 Updated Nov 10, 2025

A VSCode extension to use Hugging Face Inference Providers in Copilot Chat

TypeScript 39 17 Updated Oct 30, 2025

Synthetic Online Conversations

Python 3 Updated Aug 17, 2025

Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

TypeScript 4,059 207 Updated Oct 28, 2025

PySpark custom data source for Hugging Face Datasets

Python 19 6 Updated Aug 12, 2025

Low-level communication with Reachy Mini motors

Rust 39 6 Updated Nov 5, 2025

Apache OpenDAL: One Layer, All Storage.

Rust 4,561 656 Updated Nov 9, 2025

PyTorch media decoding and encoding

Python 799 67 Updated Nov 9, 2025

Set up your GitHub Actions workflow with ffmpeg

JavaScript 134 23 Updated Mar 22, 2024

Fast parquet command line tool with many functions, nailed it!

Rust 26 2 Updated Aug 10, 2025

A lightweight, local-first, and 🆓 experiment tracking library from Hugging Face 🤗

Python 1,068 66 Updated Nov 7, 2025

Metadata extraction and validation in scientific papers

Python 11 3 Updated Oct 7, 2025

Build, enrich, and transform datasets using AI models with no code

TypeScript 1,559 131 Updated Oct 23, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,157 7,522 Updated Nov 6, 2025

The official repository of Mozilla's Firefox web browser.

JavaScript 10,436 656 Updated Nov 10, 2025

Efficient BM25 with DuckDB 🦆

Python 58 2 Updated Dec 20, 2024

Craft conversational datasets (JSONL format with rich metadata) using LLMs. Specify parameters manually or use a creative brief for LLM-generated arguments with automatic topic/scenario variation. …

Python 9 1 Updated Apr 17, 2025

Plug-and-play, zero-shot document processing pipelines.

Python 112 8 Updated Nov 8, 2025

ParquetToHuggingFace processes raw audio data, converts it into Parquet files, and uploads them to Hugging Face. The README explains how to set up the environment, configure paths, and run the scri…

Python 8 2 Updated May 16, 2025

OSX hfjobs menubar app

Swift 5 Updated Jul 21, 2025

MCP server for Hugging Face dataset viewer

Python 30 12 Updated Apr 25, 2025

A 15TB Collection of Physics Simulation Datasets

Jupyter Notebook 1,152 96 Updated Nov 5, 2025