View DawnyWu's full-sized avatar

Python 49 3 Updated Oct 25, 2025

This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies

161 11 Updated Oct 31, 2025

Data and software for building the ACL Anthology.

Python 617 366 Updated Nov 27, 2025

How do you train retrievers to find inspirations? [ACL 2025]

6 Updated Aug 18, 2025

MultiCite code and data. Models are available on Huggingface.

Python 32 5 Updated May 10, 2022
Python 5 1 Updated Sep 23, 2025

Python PDF parser for scientific publications: content and figures

Python 444 67 Updated Mar 21, 2024

This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…

Jupyter Notebook 15,416 2,011 Updated Oct 30, 2025

This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 23,225 2,654 Updated Oct 30, 2025

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

Go 40,856 3,652 Updated Nov 29, 2025

Distribute and run LLMs with a single file.

C 23,436 1,242 Updated Nov 24, 2025

Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / Cloudflare IUAM)

Python 12,057 1,304 Updated Jul 5, 2025

Code base for ICLR 2024 "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature".

Python 356 64 Updated Sep 13, 2025

[ICML 2024] Binoculars: Zero-Shot Detection of LLM-Generated Text

Python 323 48 Updated May 14, 2024

Deploy headless browsers in Docker. Run on our cloud or bring your own. Free for non-commercial uses.

TypeScript 11,922 921 Updated Nov 28, 2025

A knowledge graph unifying computational and experimental data for MOFs

Jupyter Notebook 28 10 Updated Nov 14, 2025

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)

Python 42 25 Updated Nov 25, 2025

Robin: A multi-agent system for automating scientific discovery

Python 263 35 Updated Nov 24, 2025

Python client for GROBID Web services

Python 378 81 Updated Nov 19, 2025

Knowledge Base is important [Accepted in NeurIPS 2024]

Python 11 2 Updated Nov 1, 2025

Curated resources for discovering, reading, and working with arXiv papers

358 13 Updated Jun 4, 2025

Memory for AI Agents in 6 lines of code

Python 9,414 868 Updated Nov 28, 2025
Python 191 36 Updated Nov 25, 2025

Evaluation dataset for AI systems intended to benchmark capabilities foundational to scientific research in biology

Python 92 11 Updated Sep 27, 2025

LitQA Eval: A difficult set of scientific questions that require context of full-text research papers to answer

Python 43 5 Updated Dec 18, 2024

Pyzotero: a Python client for the Zotero API

Python 1,138 122 Updated Nov 24, 2025

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.

Rust 10,584 496 Updated Nov 28, 2025

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

Rust 14,074 826 Updated Nov 26, 2025
Python 31 2 Updated Oct 30, 2023

Data and tools for generating and inspecting OLMo pre-training data.

Python 1,351 158 Updated Nov 5, 2025