Requirements:
Docker compose environment (based on pycsw) for development and testing with CKAN Open Data portals. [1]
Tip
It can be easily tested with a CKAN-type Open Data portal deployment: mjanez/ckan-docker. [2]
Available components:
- pycsw: The pycsw app. An OARec and OGC CSW server implementation written in Python.
- ckan2pycsw: Software to achieve interoperability with open data portals based on CKAN. To do this, ckan2pycsw reads data from an instance using the CKAN API, generates INSPIRE ISO-19115/ISO-19139 [3] metadata using pygeometa, or another custom schema, and populates a pycsw instance that exposes the metadata using CSW and OAI-PMH.
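As a rough illustration of the first step (reading datasets through the CKAN API), the sketch below pages through the CKAN Action API `package_search` endpoint using only the standard library. It is not the code ckan2pycsw itself uses; the function names and the page size are assumptions made for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def package_search_url(ckan_url: str, start: int = 0, rows: int = 100) -> str:
    """Build a CKAN Action API package_search URL for one page of datasets."""
    query = urlencode({"start": start, "rows": rows})
    return f"{ckan_url.rstrip('/')}/api/3/action/package_search?{query}"


def extract_datasets(payload: dict) -> list:
    """Return the dataset dicts from a package_search response body."""
    if not payload.get("success"):
        raise RuntimeError("CKAN API call was not successful")
    return payload["result"]["results"]


def iter_datasets(ckan_url: str, rows: int = 100):
    """Yield every dataset of the portal, one page at a time (needs network)."""
    start = 0
    while True:
        with urlopen(package_search_url(ckan_url, start, rows)) as response:
            page = extract_datasets(json.load(response))
        if not page:
            break
        yield from page
        start += rows
```

Calling `iter_datasets` with the value of CKAN_URL would then walk the whole catalogue; each yielded dict is the raw CKAN dataset that a schema template could render.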
Copy the .env.example template and configure by changing the .env file. Change PYCSW_URL and CKAN_URL, as well as the published port PYCSW_PORT, if needed.
cp .env.example .env
Select the CKAN Schema (PYCSW_CKAN_SCHEMA), and the pycsw output schema (PYCSW_OUTPUT_SCHEMA):
- Default:
  PYCSW_CKAN_SCHEMA=iso19139_geodcatap
  PYCSW_OUTPUT_SCHEMA=iso19139_inspire
  ...
  SSL_UNVERIFIED_MODE=True
- Available:
  - CKAN metadata schema (PYCSW_CKAN_SCHEMA):
    - iso19139_geodcatap, default: [WIP] Schema based on GeoDCAT-AP custom dataset schema.
    - iso19139_base: [WIP] Base schema.
  - pycsw metadata schema (PYCSW_OUTPUT_SCHEMA):
    - iso19139_inspire, default: Customised schema based on ISO 19139 INSPIRE metadata schema. [4]
    - iso19139: Standard pycsw schema based on ISO 19139.
- Change SSL_UNVERIFIED_MODE to avoid SSL errors when using a self-signed certificate in CKAN development.
  - Default:
    SSL_UNVERIFIED_MODE=True
Warning
Enabling SSL_UNVERIFIED_MODE can expose your application to security risks by allowing unverified SSL certificates. Use this setting only in a trusted development environment and never in production.
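To make the risk concrete, here is a minimal sketch of what an "unverified" mode typically means in Python: certificate and hostname checks are switched off on the TLS context. How ckan2pycsw actually applies the setting is not shown in this document; the helper names, the accepted truthy spellings, and the fallback used when the variable is unset are assumptions.

```python
import os
import ssl


def build_ssl_context(unverified: bool) -> ssl.SSLContext:
    """Return a TLS context; skip certificate checks only when asked to."""
    context = ssl.create_default_context()
    if unverified:
        # Order matters: check_hostname must be disabled before CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def env_flag(name: str, default: str = "False") -> bool:
    """Interpret a variable such as SSL_UNVERIFIED_MODE=True (assumed spellings)."""
    return os.environ.get(name, default).strip().lower() in {"1", "true", "yes"}


# The context could then be passed to urllib.request.urlopen(url, context=context).
context = build_ssl_context(env_flag("SSL_UNVERIFIED_MODE"))
```

With verification disabled, any certificate is accepted, which is why the setting belongs only in a trusted development environment.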
To deploy the environment, docker compose will build the latest source in the repo.
To deploy in about 5 minutes without building from source, use the stable image (ghcr.io/mjanez/ckan-pycsw:main) with docker-compose.ghcr.yml
git clone https://github.com/mjanez/ckan-pycsw
cd ckan-pycsw
docker compose up --build
# GitHub registry main image
docker compose -f docker-compose.ghcr.yml up --build
# Or detached mode
docker compose up -d --build
Tip
Deploy the dev (multistage build) docker-compose.dev.yml with:
docker compose -f docker-compose.dev.yml up --build
If needed, to build a specific container simply run:
docker build -t target_name xxxx/
Requirements:
Python >= 3.9
Dependencies:
python3 -m pip install --user pipx
python3 -m pipx ensurepath --force
# You will need to open a new terminal or re-login for the PATH changes to take effect.
pipx install pdm
pdm install --no-self --group prod
Configuration:
PYCSW_URL=http://localhost:8000 envsubst < ckan-pycsw/conf/pycsw.conf.template > pycsw.conf
# Or update pycsw.conf vars manually
vi pycsw.conf
Generate database and add:
rm -f cite.db
# Remember to create and update the .env vars. Next, add the .env variables to the environment:
bash doc/scripts/00_ennvars.sh
Run ckan2pycsw:
PYCSW_CONFIG=pycsw.conf pdm run python3 ckan2pycsw/ckan2pycsw.py
User-defined metadata schemas can be added, both for CKAN metadata input (ckan2pycsw/schemas/ckan/*) and for pycsw output schemas (ckan2pycsw/schemas/pygeometa/*).
You can customise and extend the metadata schemas that serve as templates, so that as many metadata elements as possible are imported from a custom CKAN schema, e.g. one based on a custom schema from ckanext-scheming.
- Create a new folder in schemas/ckan/ with the name intended for the schema, e.g. iso19139_spain.
- Create the main.j2 with the Jinja template to render the metadata. Examples in: schemas/ckan/iso19139_geodcatap
- Add all needed mappings (.yaml) to a new folder in ckan2pycsw/mappings/, e.g. iso19139_spain
- Update ckan2pycsw/mappings/ckan-pycsw_assigments.yaml to include the pycsw and ckan schema mapping, e.g.
  iso19139_geodcatap: ckan_geodcatap
  iso19139_base: ckan_base
  iso19139_inspire: inspire
  ...
  iso19139_spain: iso19139_spain
- Modify .env to select the new PYCSW_CKAN_SCHEMA:
  PYCSW_CKAN_SCHEMA=iso19139_spain
  PYCSW_OUTPUT_SCHEMA=iso19139
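The assignments file is a flat list of `schema: value` pairs. The sketch below shows one way such a file could be resolved to a mappings folder. It assumes that each value names a folder under ckan2pycsw/mappings/, which the steps above suggest but do not state outright, and it uses a tiny hand-rolled parser (hypothetical, not part of the project) to stay dependency-free; a real implementation would more likely use a YAML library.

```python
def parse_flat_yaml(text: str) -> dict:
    """Parse flat 'key: value' lines; comments and blank lines are skipped."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        mapping[key.strip()] = value.strip()
    return mapping


# Inline copy of the example entries from ckan-pycsw_assigments.yaml.
ASSIGNMENTS = parse_flat_yaml("""
iso19139_geodcatap: ckan_geodcatap
iso19139_base: ckan_base
iso19139_inspire: inspire
iso19139_spain: iso19139_spain
""")


def mappings_folder(schema: str) -> str:
    """Return the mappings folder assumed to be assigned to a schema name."""
    try:
        return f"ckan2pycsw/mappings/{ASSIGNMENTS[schema]}"
    except KeyError:
        raise ValueError(f"No mapping assigned to schema {schema!r}") from None
```

Failing loudly on an unknown schema name makes a typo in PYCSW_CKAN_SCHEMA visible at start-up rather than as missing metadata elements later.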
New metadata schemas can be extended or added to convert elements extracted from CKAN into standard metadata profiles that can be exposed in the pycsw CSW Catalogue.
- Create a new folder in schemas/pygeometa/ with the name intended for the schema, e.g. iso19139_spain.
- Add a __init__.py file with the extended pygeometa schema class, e.g.
  import ast
  import logging
  import os
  from typing import Union

  from lxml import etree
  from owslib.iso import CI_OnlineResource, CI_ResponsibleParty, MD_Metadata
  from pygeometa.schemas.base import BaseOutputSchema
  from model.template import render_j2_template

  LOGGER = logging.getLogger(__name__)
  THISDIR = os.path.dirname(os.path.realpath(__file__))

  class ISO19139_spainOutputSchema(BaseOutputSchema):
      """ISO 19139 - Spain output schema"""

      def __init__(self):
          """
          Initialize object

          :returns: pygeometa.schemas.base.BaseOutputSchema
          """
          super().__init__('iso19139_spain', 'xml', THISDIR)
      ...
- Create the main.j2 with the Jinja template to render the metadata. Macros can be added for more specific templates, for example iso19139_inspire-regulation.j2 or contact.j2. More examples in: schemas/pygeometa/iso19139_inspire
- Add the Python class and the schema identifier to ckan2pycsw.py, e.g.
  from schemas.pygeometa.iso19139_inspire import ISO19139_inspireOutputSchema
  # The new class is imported from its own schema folder
  from schemas.pygeometa.iso19139_spain import ISO19139_spainOutputSchema
  ...
  OUPUT_SCHEMA = {
      'iso19139_inspire': ISO19139_inspireOutputSchema,
      'iso19139': ISO19139OutputSchema,
      'iso19139_spain': ISO19139_spainOutputSchema
  }
- Add all mappings (.yaml) to a new folder in ckan2pycsw/mappings/, e.g. iso19139_spain
- Update ckan2pycsw/mappings/ckan-pycsw_assigments.yaml to include the pycsw and ckan schema mapping, e.g.
  iso19139_geodcatap: ckan_geodcatap
  iso19139_base: ckan_base
  iso19139_inspire: inspire
  ...
  iso19139_spain: iso19139_spain
- Modify .env to select the new PYCSW_OUTPUT_SCHEMA:
  PYCSW_CKAN_SCHEMA=iso19139_geodcatap
  PYCSW_OUTPUT_SCHEMA=iso19139_spain
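The dictionary edited in ckan2pycsw.py acts as a registry: the value of PYCSW_OUTPUT_SCHEMA selects which class renders the record. The following self-contained sketch shows that lookup pattern. `BaseOutputSchema` here is a stand-in so the example runs without pygeometa, and `get_output_schema` is a hypothetical helper, not a function from the project.

```python
class BaseOutputSchema:
    """Stand-in for pygeometa's base class so the sketch runs on its own."""

    def __init__(self, name: str, outputformat: str, root_dir: str):
        self.name = name
        self.outputformat = outputformat
        self.root_dir = root_dir


class ISO19139OutputSchema(BaseOutputSchema):
    def __init__(self):
        super().__init__("iso19139", "xml", ".")


class ISO19139_spainOutputSchema(BaseOutputSchema):
    def __init__(self):
        super().__init__("iso19139_spain", "xml", ".")


# Same shape as the dictionary in ckan2pycsw.py (identifier spelled as in the source).
OUPUT_SCHEMA = {
    "iso19139": ISO19139OutputSchema,
    "iso19139_spain": ISO19139_spainOutputSchema,
}


def get_output_schema(identifier: str) -> BaseOutputSchema:
    """Instantiate the schema class registered under the given identifier."""
    try:
        return OUPUT_SCHEMA[identifier]()
    except KeyError:
        known = ", ".join(sorted(OUPUT_SCHEMA))
        raise ValueError(f"Unknown output schema {identifier!r}; known: {known}") from None


schema = get_output_schema("iso19139_spain")
print(schema.name, schema.outputformat)
```

A schema that is present on disk but missing from the dictionary cannot be selected, which is why the registration step is needed in addition to creating the folder.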
Perform a GetRecords request and return all:
{PYCSW_URL}?request=GetRecords&service=CSW&version=3.0.0&typeNames=gmd:MD_Metadata&outputSchema=http://www.isotc211.org/2005/gmd&elementSetName=full
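The same request can be assembled programmatically. This sketch builds the key-value query shown above with the standard library; the function name is hypothetical, and `maxRecords` and `startPosition` are standard CSW GetRecords parameters added here for paging that do not appear in the original URL.

```python
from urllib.parse import urlencode


def get_records_url(pycsw_url: str, max_records: int = 10, start_position: int = 1) -> str:
    """Build a CSW 3.0.0 GetRecords KVP request for full ISO 19139 records."""
    params = {
        "request": "GetRecords",
        "service": "CSW",
        "version": "3.0.0",
        "typeNames": "gmd:MD_Metadata",
        "outputSchema": "http://www.isotc211.org/2005/gmd",
        "elementSetName": "full",
        "maxRecords": max_records,
        "startPosition": start_position,
    }
    # urlencode percent-encodes ':' and '/', which the server decodes again.
    return f"{pycsw_url}?{urlencode(params)}"


print(get_records_url("http://localhost:8000"))
```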
- The ckan-pycsw logs will be created in the /log folder.
- Metadata records in XML format (ISO 19139) are stored in the /metadata folder.
Note
The GetRecords operation allows clients to discover resources (datasets). The response is an XML document and the output schema can be specified.
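Because the response is XML in the ISO 19139 (gmd) schema, the record identifiers can be pulled out with the standard library. The sketch below searches for gmd:MD_Metadata elements wherever they occur, so it makes no assumption about the wrapper elements of the response; the sample document and its `results` root are made up for the example.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def record_identifiers(xml_text: str) -> list:
    """Return the fileIdentifier of every gmd:MD_Metadata element in the document."""
    root = ET.fromstring(xml_text)
    identifiers = []
    # iter() also matches the root, so a single stored record works too.
    for record in root.iter(f"{{{NAMESPACES['gmd']}}}MD_Metadata"):
        node = record.find("gmd:fileIdentifier/gco:CharacterString", NAMESPACES)
        if node is not None and node.text:
            identifiers.append(node.text.strip())
    return identifiers


SAMPLE = """<results xmlns:gmd="http://www.isotc211.org/2005/gmd"
                     xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:MD_Metadata>
    <gmd:fileIdentifier>
      <gco:CharacterString>dataset-001</gco:CharacterString>
    </gmd:fileIdentifier>
  </gmd:MD_Metadata>
</results>"""

print(record_identifiers(SAMPLE))  # ['dataset-001']
```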
- Build and run container.
- Attach Visual Studio Code to container.
- Start debugging the ckan2pycsw.py Python file (Debug the currently active Python file) in the container.

- Update the previously created .env file in the root of the ckan-ogc repo and move it to: /ckan2pycsw
- Open ckan2pycsw.py.
- Start debugging the ckan2pycsw.py Python file (Debug the currently active Python file).
Note
By default, the Python extension looks for and loads a file named .env in the current workspace folder. For more information, see the Python debugger and environment variables documentation.
List of containers:
| Repository | Type | Docker tag | Size | Notes |
|---|---|---|---|---|
| python 3.11 | base image | python/python:3.11-slim-bullseye | 45.57 MB | - |
| Repository | Type | Docker tag | Size | Notes |
|---|---|---|---|---|
| mjanez/ckan-pycsw | custom image | mjanez/ckan-pycsw:latest | 175 MB | Dev & Test latest version. |
| mjanez/ckan-pycsw | custom image | mjanez/ckan-pycsw:main | 175 MB | Stable version. |
Note
The GHCR and Dev Dockerfiles use the main images as their base.
| Ports | Container |
|---|---|
| 0.0.0.0:8000->8000/tcp | pycsw |
| 0.0.0.0:5678->5678/tcp | ckan-pycsw debug (debugpy) |
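After the containers start, a quick way to confirm that the published ports are reachable is a plain TCP connection attempt. This is a minimal sketch; the helper name is hypothetical, and `localhost` assumes the stack runs on the same machine with the default ports from the table above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Default published ports from the table above.
    for name, port in {"pycsw": 8000, "debugpy": 5678}.items():
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"{name} ({port}): {state}")
```

An open port only shows that something is listening; a GetRecords request is still needed to confirm that pycsw itself is answering.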
Footnotes
1. Extends the @frafra coat2pycsw package.
2. A custom installation of Docker Compose with specific extensions for spatial data and GeoDCAT-AP/INSPIRE metadata profiles.
3. INSPIRE dataset and service metadata based on ISO/TS 19139:2007.
4. Compliance of the pycsw output schema (iso19139_inspire) with INSPIRE ISO 19139 is a work in progress (WIP). Validation of datasets and dataset series is complete and conforms to the INSPIRE reference validator for datasets and dataset series (Conformance Class 1, 2, 2b and 2c). In contrast, spatial data services still fail in one dimension [WIP].