Stars
Word lists for a variety of languages, each containing words that are distinctive to that language.
Tools for extracting parallel corpora from article titles across languages in Wikipedia
Takes a large list of names, checks which names do not have a Wikipedia entry, and outputs the resulting set.
python-timbl, originally developed by Sander Canisius, is a Python extension module wrapping the full TiMBL C++ programming interface. With this module, all functionality exposed through the C++ in…
A simple and fast discriminative sequence labeling toolkit (http://wapiti.limsi.fr)
Extensions to MaltParser so that you can plug in classifiers from Weka easily.
Migrate issue tracker issues from Google Code to GitHub
Python module for interacting with Wikipedia's API with an interest in multilingual capabilities
A tool for translating Wikipedia articles into other languages