@betatim betatim commented Sep 23, 2025

Downloading the treebank dataset from NLTK can be flaky, and we probably do not need more than 10000 words to establish whether the stemmer works. This PR hardcodes a random subset of the words instead.

Fixes #7221
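
For illustration, a minimal sketch of how a reproducible subset could be drawn once and then pasted into the test file as a literal. This is not the code from this PR: the helper name `pick_word_subset`, the seed, the subset size, and the stand-in vocabulary are all assumptions made for the example.

```python
import random


def pick_word_subset(words, k, seed=0):
    """Return a reproducible, sorted sample of up to k distinct words.

    Hypothetical helper for illustration; not part of cuML or NLTK.
    """
    # De-duplicate and sort first so the sample does not depend on input order.
    unique_words = sorted(set(words))
    rng = random.Random(seed)  # fixed seed makes the draw repeatable
    return sorted(rng.sample(unique_words, min(k, len(unique_words))))


# Stand-in vocabulary (assumption). In practice this would come from a
# one-off local load of the corpus, done outside of CI.
vocabulary = [
    "running", "flies", "happily", "cats", "relational",
    "caresses", "ponies", "agreed", "running",
]

subset = pick_word_subset(vocabulary, k=5, seed=42)

# Print a literal that can be pasted into the test module, so the test
# itself never needs to download anything.
print(f"WORDS = {subset!r}")
```

Because the list is hardcoded after this one-off step, the test no longer depends on network access at run time.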

@betatim betatim requested a review from a team as a code owner September 23, 2025 15:26
@betatim betatim requested a review from divyegala September 23, 2025 15:26
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Sep 23, 2025
@betatim betatim added the non-breaking Non-breaking change label Sep 23, 2025
@betatim betatim added the improvement Improvement / enhancement to an existing function label Sep 23, 2025

betatim commented Sep 24, 2025

/merge

@rapids-bot rapids-bot bot merged commit 1f2607c into rapidsai:branch-25.10 Sep 24, 2025
101 checks passed

Development

Successfully merging this pull request may close these issues.

[CI] NLTK treebank corpus download failure in stemmer tests
