HTML 1 Updated Sep 2, 2023

A large-scale open data lake for the science of science research.

Jupyter Notebook 97 13 Updated Jun 2, 2025

TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/understand tweets such as sentiment analysis, emoji prediction,…

Python 365 35 Updated Apr 2, 2025
Jupyter Notebook 339 31 Updated Jan 3, 2024

String-to-String Algorithms for Natural Language Processing

Jupyter Notebook 561 31 Updated Jul 26, 2024

Guess gender from first name in Python 2 and 3

Python 138 32 Updated May 20, 2025

How random is the review outcome? A systematic study of the impact of external factors on eLife peer review

Python 2 Updated May 6, 2021

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Python 1,250 151 Updated Jul 24, 2025

Contextualised Topic Coherence Metrics: A new way to evaluate neural topic models.

Python 8 1 Updated Oct 15, 2023

Minimal keyword extraction with BERT

Python 4,061 376 Updated Oct 23, 2025

State-of-the-Art Text Embeddings

Python 17,925 2,717 Updated Nov 21, 2025

Code and data to produce the main results of the paper "Scientific Prizes and the Extraordinary Growth of Scientific Topics".

MATLAB 2 Updated Aug 17, 2021

Systematic dataset of Covid-19 policy, from Oxford University

775 431 Updated Jun 22, 2023

Back end for producing indicators and loading them into the COVIDcast API.

Python 12 16 Updated Nov 21, 2025

BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)

Python 601 54 Updated Jul 22, 2024

A Python wrapper around the topic modeling functions of MALLET.

Jupyter Notebook 105 17 Updated Nov 1, 2024

Tracking Emotional Compositions of Online Discourse Before and After the COVID-19 Outbreak

2 Updated Feb 9, 2022

Public repo for SEIR Austin code base

R 4 4 Updated Feb 3, 2022
Jupyter Notebook 11 2 Updated Feb 20, 2021

A repository of data on coronavirus cases and deaths in the U.S.

6,987 3,424 Updated Apr 2, 2024
JavaScript 17 1 Updated Feb 1, 2024

Flexible calculation of moral foundation scores from textual input data based on word embedding methods.

Python 46 13 Updated Mar 22, 2023

Free and Open Source, Distributed, RESTful Search Engine

Java 75,554 25,646 Updated Nov 30, 2025

Top2Vec learns jointly embedded topic, document and word vectors.

Python 3,104 377 Updated Nov 14, 2024

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Python 7,214 867 Updated Nov 11, 2025

Interpretable data visualizations for understanding how texts differ at the word level

Python 284 30 Updated Feb 12, 2025

A large-scale COVID-19 specific geotagged global tweets dataset. Associated paper: https://doi.org/10.1016/j.asoc.2022.109603

6 Updated Mar 29, 2023

The repository contains an ongoing collection of tweet IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.

Python 723 302 Updated Feb 22, 2023

A collection of Jupyter notebooks, each walking you through a common example of bibliometric analysis using scholarly data from the OpenAlex API.

Jupyter Notebook 124 31 Updated Apr 25, 2024