TesseraPy

TesseraPy provides a minimal wrapper for HP/Google's Tesseract OCR library. It uses CFFI to call the library directly, instead of running the tesseract executable as other Python wrappers do.

Currently (v0.1.4), it is extremely minimal: it has one class, tesseract, with a single method, get_text(), which takes a PIL image as an argument and returns the text in the image.

For example:

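import TesseraPy
from PIL import Image

# Load the image to be read with PIL.
i = Image.open("test.png")
# Create the wrapper object and extract the text from the image.
t = TesseraPy.tesseract()
txt = t.get_text(i)
print(txt)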

Installation

pip install git+https://github.com/owainkenwayucl/TesseraPy

To install a specific version, e.g. the first release, v0.1.4:

pip install git+https://github.com/owainkenwayucl/TesseraPy@v0.1.4

Configuration

You can control where TesseraPy looks for the various parts of Tesseract, and which encoding and language it uses, with the following environment variables:

TESSERACT_LIBRARY - path to the Tesseract library - default: /lib64/libtesseract.so.5.3.4
TESSERACT_DATA - path to Tesseract data files - default: /usr/share/tesseract/tessdata/
TESSERACT_ENCODING - encoding - default: utf-8
TESSERACT_LANGUAGE - language - default: eng

The defaults are those of the development platform - AlmaLinux 10.
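
On other platforms the library and data paths will probably differ, so the variables above need to be set. As a minimal sketch, they can be set from Python before TesseraPy is used. The configure() helper below is hypothetical and not part of TesseraPy, and the paths are examples only, so adjust them to match your system. The README does not say whether the variables are read at import time or when the tesseract object is created, so this sketch assumes that setting them before the import is the safe order.

```python
import os


def configure(library=None, data=None, encoding=None, language=None):
    """Hypothetical helper: set only the TESSERACT_* variables that are given."""
    settings = {
        "TESSERACT_LIBRARY": library,
        "TESSERACT_DATA": data,
        "TESSERACT_ENCODING": encoding,
        "TESSERACT_LANGUAGE": language,
    }
    for name, value in settings.items():
        if value is not None:
            os.environ[name] = value


# Example paths only (an assumption, not taken from TesseraPy): adjust for your system.
configure(
    library="/usr/lib/x86_64-linux-gnu/libtesseract.so.5",
    data="/usr/share/tesseract-ocr/5/tessdata/",
)

# Import and use TesseraPy only after the environment has been set:
# import TesseraPy
# t = TesseraPy.tesseract()
```

Variables that are not passed to configure() are left untouched, so the defaults listed above still apply to them.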
