Starred repositories
Mathematical derivation and pure Python code implementation of machine learning algorithms.
A PyTorch re-implementation of "Real-time Scene Text Detection with Differentiable Binarization"
A PyTorch-based OCR algorithm library, including PSENet, PAN, DBNet, SAST, and CRNN
CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using TensorFlow and Keras
Generate text images for training deep learning OCR models
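Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, MNN, and TNN inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); the total model size is only 4.7M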
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
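Online data on Lianjia second-hand home sales and rentals, data from the existing-home transaction service platform, and a detailed data analysis tutorial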
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus
Language identification and transliteration tool for code-mixed Indian language data.
Neural Machine Translation with Attention (PyTorch)
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
Including text classifier, language model, pre-trained model, multi-label classifier, text generator, dialogue, etc.
Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch
An implementation of attention-based neural machine translation using PyTorch
The project aims to add a state-of-the-art transliteration module for cross-transliteration among all Indian languages, including English.
Xlit-Crowd: Hindi-English Transliteration Corpus
Tutorial on English-to-Hindi transliteration using a Seq2Seq architecture in TensorFlow
A simple tool to convert Roman script to Indic (Devanagari) script. Most keyboards are English, so writing in Indic script is difficult. It is easy to write Hindi in roman script this give…
An unsupervised stemmer for natural language processing tasks on the Hinglish language (Hindi + English words)
A Hindi-English Dataset for Text Normalization