These are the official Python bindings to https://github.com/openvenues/libpostal, a fast statistical parser/normalizer for street addresses anywhere in the world.
from postal.expand import expand_address
expand_address('Quatre vingt douze Ave des Champs-Élysées')
from postal.parser import parse_address
parse_address('The Book Club 100-106 Leonard St, Shoreditch, London, Greater London, EC2A 4RH, United Kingdom')
Before using the Python bindings, you must install the libpostal C library. Make sure you have the following prerequisites:
On Ubuntu/Debian
sudo apt-get install curl autoconf automake libtool python-dev pkg-config
On CentOS/RHEL
sudo yum install curl autoconf automake libtool python-devel pkgconfig
On macOS
brew install curl autoconf automake libtool pkg-config
Installing libpostal
If you're using an M1 Mac, add --disable-sse2 to the ./configure command below. This results in poorer performance, but the build will succeed.
git clone https://github.com/openvenues/libpostal
cd libpostal
./bootstrap.sh
./configure --datadir=[...some dir with a few GB of space...]
make
sudo make install
# On Linux it's probably a good idea to run
sudo ldconfig
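A quick way to see whether the system loader can locate the freshly installed library is to ask Python's standard ctypes.util.find_library. This is only a sketch: the helper name locate_libpostal is made up for illustration, and a None result does not necessarily mean the install failed, since find_library searches platform-specific locations and may miss a non-standard prefix.

```python
from ctypes.util import find_library


def locate_libpostal():
    """Return the name or path the loader resolves for libpostal, or None.

    locate_libpostal is a hypothetical helper, not part of pypostal.
    """
    # find_library expects the bare name, without the "lib" prefix or suffix.
    return find_library("postal")


if __name__ == "__main__":
    found = locate_libpostal()
    if found is None:
        print("libpostal not found on the default search path")
    else:
        print("libpostal found:", found)
```

If this prints "not found" on Linux right after make install, running sudo ldconfig and trying again is a reasonable first step.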
To install the Python library, just run:
pip install postal
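Once pip finishes, a small smoke test can confirm that the extension imports and returns something usable. The sketch below assumes parse_address returns a list of (value, label) pairs; components_by_label is a hypothetical helper written for this example, not part of the postal package, and the address string is just sample input.

```python
def components_by_label(parsed):
    """Group (value, label) pairs into a dict mapping label -> list of values.

    Assumes `parsed` is an iterable of 2-tuples, as parse_address is
    expected to return. A label can occur more than once, hence the lists.
    """
    grouped = {}
    for value, label in parsed:
        grouped.setdefault(label, []).append(value)
    return grouped


if __name__ == "__main__":
    try:
        from postal.parser import parse_address
    except ImportError as exc:
        # Usually means pip install postal failed or libpostal is missing.
        print("postal is not importable:", exc)
    else:
        parsed = parse_address("100-106 Leonard St, London, EC2A 4RH, United Kingdom")
        print(components_by_label(parsed))
```

Grouping into lists rather than single values avoids silently dropping a component when the parser emits the same label twice.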
Installing libpostal on Windows
Install MSYS2 and launch a shell using the MSYS2 MinGW 64-bit start menu option, not the usual MSYS2 MSYS option.
This is important because we don't want our libpostal.dll to link to msys-2.0.dll (Python seems to hang if you load this DLL).
Then:
pacman -S autoconf automake curl git make libtool gcc mingw-w64-x86_64-gcc
git clone https://github.com/openvenues/libpostal
cd libpostal
cp -rf windows/* ./
./bootstrap.sh
./configure --datadir=[...some dir with a few GB of space...]
make
make install
mkdir headers && cp -r /usr/include/libpostal/ headers/
Now start a command prompt which has access to the Microsoft toolchain. This can be done by e.g. installing the Windows 10 SDK and then running the x64 Native Tools Command Prompt.
Assuming your MSYS and Python are installed in some standard locations, you can use this command prompt to build and install the Python library like so:
lib.exe /def:libpostal.def /out:postal.lib /machine:x64
pip install postal --global-option=build_ext --global-option="-I[...libpostal checkout...]\headers" --global-option="-L[...libpostal checkout...]"
copy src\.libs\libpostal-1.dll "C:\Python36\Lib\site-packages\postal\libpostal.dll"
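The C:\Python36 path in the copy command is only an example; the actual site-packages directory depends on your Python install. As a rough sketch, the standard sysconfig module can report where the running interpreter keeps platform-specific packages. The helper name libpostal_dll_destination is hypothetical, and this assumes postal was installed into the same interpreter that runs the script.

```python
import sysconfig
from pathlib import Path


def libpostal_dll_destination():
    """Return the path where libpostal.dll would sit next to the postal package.

    Hypothetical helper for illustration; it only builds the path and does
    not check that the postal package or the DLL actually exists.
    """
    # "platlib" is the site-packages directory for compiled extensions.
    platlib = Path(sysconfig.get_paths()["platlib"])
    return platlib / "postal" / "libpostal.dll"


if __name__ == "__main__":
    print(libpostal_dll_destination())
```

The printed path can then be used as the destination of the copy command in place of the hard-coded one.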
pypostal supports Python 2.7+ and Python 3.4+. These bindings are written using the Python C API and thus support CPython only. Since libpostal is a standalone C library, support for PyPy is still possible with a CFFI wrapper, but is not a goal for this repo.
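To check an interpreter against the compatibility statement above before attempting a build, something like the following could be used. It is a minimal sketch: is_supported is a hypothetical function that simply encodes "CPython, 2.7+ or 3.4+" as stated here, and it is not a check performed by pypostal itself.

```python
import platform
import sys


def is_supported(implementation, version):
    """Return True for CPython 2.7+ (within 2.x) or CPython 3.4+.

    `implementation` is a string such as "CPython" or "PyPy";
    `version` is a tuple like sys.version_info or (3, 8).
    """
    if implementation != "CPython":
        return False
    major_minor = tuple(version[:2])
    if major_minor[0] == 2:
        return major_minor >= (2, 7)
    return major_minor >= (3, 4)


if __name__ == "__main__":
    impl = platform.python_implementation()
    print(impl, sys.version_info[:2], "supported:", is_supported(impl, sys.version_info))
```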
Make sure you have nose installed, then run:
python setup.py build_ext --inplace
nosetests postal/tests
The build_ext --inplace business is needed so the C extensions build in the source checkout directory and are accessible/importable by the Python modules.