Stars
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data generation.
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
Code for the paper "CiT: Curation in Training for Effective Vision-Language Data".
Self-Supervised Speech Pre-training and Representation Learning Toolkit
The official website of the SUPERB Benchmark
Examples for pre-training a retrieval-extraction-based language model
edX: XBlock for LectureScape, a video player driven by data and statistics, written by Juho Kim, and converted to an XBlock by Peter Githaiga, under our co-supervision
edX: An XBlock to recommend resources to other students, written by Daniel Li, under my supervision