University of Texas at Austin, Austin, TX

Starred repositories
An open-source AI agent that brings the power of Gemini directly into your terminal.
For our EMNLP 2020 paper “Are ‘Undocumented Workers’ the Same as ‘Illegal Aliens’? Disentangling Denotation and Connotation in Vector Spaces”.
Intrinsic evaluation of pre-trained word embeddings using a large word-association dataset: SWOW (Small World of Words)
A GitHub repository containing the LWOW project.
[CoNLL'21] MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models
Sparse and discrete interpretability tool for neural networks
Stanford NLP Python library for understanding and improving PyTorch models via interventions
Stanford NLP Python library for Representation Finetuning (ReFT)
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitch Fix, etc. Blog: mlengineer.io.
Learning to Describe Unknown Phrases with Local and Global Contexts
Interpretable Word Sense Representations via Definition Generation
Simple, unified interface to multiple Generative AI providers
Code for evaluating models of compositional sentence semantics.
Data and scripts for the shared task "Task 1: Paraphrase and Semantic Similarity in Twitter (PIT)" at SemEval 2015.
NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.
ACL 2024 - Linguistically Conditioned Semantic Textual Similarity
Data, codebook, and models to automatically detect storytelling.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A set of media framing annotations, along with scripts for obtaining the corresponding news articles
[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity
Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
Utilities intended for use with Llama models.
Agentic components of the Llama Stack APIs
Code and documentation to train Stanford's Alpaca models, and generate the data.
A Heterogeneous Benchmark for Information Retrieval. Easy to use: evaluate your models across 15+ diverse IR datasets.
Resources & scripts for the paper "MTEB: Massive Text Embedding Benchmark"