pygeoapi kubernetes implementation - step 3 - deploying OGC process service
Table of Contents
Introduction
pygeoapi kubernetes series introduction
Deploying OGC process service
In this new chapter, we will develop and deploy two OGC process services.
Reference documentation:
- https://medium.com/geobeyond/create-ogc-processes-in-pygeoapi-11c0f7d3be61
- https://docs.pygeoapi.io/en/latest/publishing/ogcapi-processes.html#publishing-processes-via-ogc-api-processes
- https://docs.pygeoapi.io/en/0.12.0/data-publishing/ogcapi-processes.html
- https://dive.pygeoapi.io/publishing/ogcapi-processes/
Basic Concept
Pygeoapi is built on a plugin-oriented architecture that makes it possible to integrate existing Python code and expose it as a Process service. The code can be a standalone Python package; it only needs to provide one or more “interfaces” implementing the BaseProcessor class from pygeoapi. This is what we will develop in the remainder of this chapter.
Creation of the python package
The complete source code is available in this GitHub repository.
Here, we will use the example of two services:
- validate the format of a GeoJSON file
- validate the geometries of a GeoJSON file
This package relies on pydantic-geojson and shapely.
Initial code
The initial code is a standard Python package:
The setup.py file defines the package installation process and its dependencies.
setup.py
from setuptools import find_packages, setup

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

with open("requirements.txt", "r", encoding="utf-8") as fh:
    requirements = fh.read().splitlines()

setup(
    name='GeodataValidator',
    version='0.1.0',
    author='OpenGeoShift',
    author_email='contact@opengeoshift.com',
    license='UNLICENSED',
    description='GeodataValidator is a Python package to validate geospatial data',
    long_description=long_description,
    long_description_content_type="text/markdown",
    packages=find_packages(),
    test_suite='tests',
    python_requires='>=3.6',
    install_requires=requirements,
    tests_require=['pytest'],
    classifiers=[
        'Development Status :: 3 - Alpha',
        'Intended Audience :: Developers',
        'Topic :: Software Development :: Libraries :: Python Modules',
        'License :: OSI Approved :: MIT License',
        'Programming Language :: Python :: 3.6',
        'Operating System :: OS Independent',
    ]
)
The main.py file contains the two main functions that power the two services. These functions rely on the common.geojson_utils module.
main.py
import logging

from GeodataValidator.common.geojson_utils import GeoJsonUtils

logging.basicConfig(level=logging.INFO)
LOGGER = logging.getLogger(__name__)

gjutils = GeoJsonUtils()


def validate_geojson_format(geojson: dict) -> bool:
    """
    Validate geojson structure
    """
    LOGGER.info('Validating GeoJSON Format...')
    return gjutils.geojson_isvalid(geojson)


def validate_geojson_geometry(geojson: dict) -> bool:
    """
    Validate geojson geometry
    """
    LOGGER.info('Validating GeoJSON Geometry...')
    return gjutils.validate_geojson_geometry(geojson)


if __name__ == "__main__":
    geojson = {"features":[{"geometry":{"coordinates":[[[9.4001,4.1678],[9.4001,4.1562],[9.4117,4.1562],[9.4117,4.1677],[9.4001,4.1678]]],"type":"Polygon"},"properties":{},"type":"Feature"}],"type":"FeatureCollection"}

    # Log exactly one outcome per check (error or success, never both)
    geojson_validation_result = validate_geojson_format(geojson)
    if not geojson_validation_result:
        LOGGER.error("☒ Invalid GeoJSON Format")
    else:
        LOGGER.info("☑ Valid Geojson Format")

    geometry_validation_result = validate_geojson_geometry(geojson)
    if not geometry_validation_result:
        LOGGER.error("☒ Invalid GeoJSON Geometry")
    else:
        LOGGER.info("☑ Valid Geojson Geometry")
common.geojson_utils
import logging

from pydantic_geojson import FeatureCollectionModel
from pydantic import ValidationError
from shapely.geometry import shape
from shapely.validation import explain_validity

LOGGER = logging.getLogger(__name__)


class GeoJsonUtils:

    def geojson_isvalid(self, geojson: dict) -> bool:
        try:
            FeatureCollectionModel(**geojson)
            return True
        except ValidationError as e:
            LOGGER.error(e)
            return False

    def validate_geojson_geometry(self, geojson: dict) -> bool:
        all_valid = True
        for idx, feature in enumerate(geojson.get("features", [])):
            geom = feature.get("geometry")
            if geom is None:
                LOGGER.warning(f"Feature {idx}: missing geometry")
                all_valid = False
                continue
            shapely_geom = shape(geom)
            if not shapely_geom.is_valid:
                LOGGER.error(f"Feature {idx}: invalid geometry")
                LOGGER.error(f" Reason: {explain_validity(shapely_geom)}")
                all_valid = False
        return all_valid
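The two checks are independent: a document can be well-formed GeoJSON and still carry an invalid geometry. As a small illustration using shapely directly (the coordinates below are made up for this example and do not come from the repository), a "bowtie" polygon whose edges cross is structurally fine but fails the geometry check:

```python
from shapely.geometry import shape
from shapely.validation import explain_validity

# Illustrative geometries (assumptions for this sketch, not from the tutorial repo)
square = {"type": "Polygon", "coordinates": [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]]}
# Same four corners visited in a different order, so two edges cross each other
bowtie = {"type": "Polygon", "coordinates": [[[0, 0], [1, 1], [1, 0], [0, 1], [0, 0]]]}

for name, geometry in (("square", square), ("bowtie", bowtie)):
    geom = shape(geometry)
    # is_valid gives the boolean verdict, explain_validity the human-readable reason
    print(name, geom.is_valid, explain_validity(geom))
```

This is the same pair of shapely calls that validate_geojson_geometry applies to each feature.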
Pygeoapi interface
To allow pygeoapi to execute the validate_geojson_format and validate_geojson_geometry functions, we need to create an interface that subclasses the pygeoapi BaseProcessor and calls the package’s code:
pygeoapi_process_interface.geojson_format_validation.py
from pygeoapi.process.base import BaseProcessor, ProcessorExecuteError

from GeodataValidator import main

#: Process metadata and description
PROCESS_METADATA = {
    'version': '0.2.0',
    'id': 'geojson-format-validation',
    'title': {
        'en': 'geojson-format-validation',
        'fr': 'geojson-format-validation'
    },
    'description': {
        'en': 'Validate geojson format',
        'fr': 'Validation format geojson',
    },
    'jobControlOptions': ['sync-execute', 'async-execute'],
    'keywords': ['geojson', 'format', 'validation'],
    'links': [{
        'type': 'text/html',
        'rel': 'about',
        'title': 'information',
        'href': 'https://example.org/process',
        'hreflang': 'en-US'
    }],
    'inputs': {
        'geojson': {
            'title': 'Geojson',
            'description': 'Geojson',
            'schema': {
                'type': 'object',
                'contentMediaType': 'application/json'
            },
            'minOccurs': 1,
            'maxOccurs': 1,
            'keywords': ['geojson']
        }
    },
    'outputs': {
        'is_valid': {
            'title': 'Is geojson format valid',
            'description': 'Is the geojson format provided valid',
            'schema': {
                'type': 'boolean'
            }
        }
    },
    'example': {
        'inputs': {
            'geojson': {"type":"FeatureCollection","features":[{"type":"Feature","properties":{},"geometry":{"coordinates":[[[9.4001,4.1678],[9.4001,4.1562],[9.4117,4.1562],[9.4117,4.1677],[9.4001,4.1678]]],"type":"Polygon"}}]},
        }
    }
}


class GeoJsonFormatValidatorProcessor(BaseProcessor):
    """Processor example"""

    def __init__(self, processor_def):
        """
        Initialize object
        :param processor_def: provider definition
        :returns: pygeoapi.process.geojson_format_validation.GeoJsonFormatValidatorProcessor
        """
        super().__init__(processor_def, PROCESS_METADATA)
        self.supports_outputs = True

    def execute(self, data, outputs=None):
        mimetype = 'application/json'
        geojson = data.get('geojson')

        if geojson is None:
            raise ProcessorExecuteError('Cannot process without a geojson')

        try:
            is_valid = main.validate_geojson_format(geojson)
            outputs = {"is_valid": is_valid}
        except Exception:
            raise

        return mimetype, outputs

    def __repr__(self):
        return f'<GeoJsonFormatValidatorProcessor> {self.name}'
pygeoapi_process_interface.geojson_geometry_validation.py
from pygeoapi.process.base import BaseProcessor, ProcessorExecuteError

from GeodataValidator import main

#: Process metadata and description
PROCESS_METADATA = {
    'version': '0.2.0',
    'id': 'geojson-geometry-validation',
    'title': {
        'en': 'geojson-geometry-validation',
        'fr': 'geojson-geometry-validation'
    },
    'description': {
        'en': 'Validate geojson geometries',
        'fr': 'Validation geojson geometries',
    },
    'jobControlOptions': ['sync-execute', 'async-execute'],
    'keywords': ['geojson', 'geometry', 'validation'],
    'links': [{
        'type': 'text/html',
        'rel': 'about',
        'title': 'information',
        'href': 'https://example.org/process',
        'hreflang': 'en-US'
    }],
    'inputs': {
        'geojson': {
            'title': 'Geojson',
            'description': 'Geojson',
            'schema': {
                'type': 'object',
                'contentMediaType': 'application/json'
            },
            'minOccurs': 1,
            'maxOccurs': 1,
            'keywords': ['geojson']
        }
    },
    'outputs': {
        'is_valid': {
            'title': 'Is geojson geometry valid',
            'description': 'Is the geojson geometry provided valid',
            'schema': {
                'type': 'boolean'
            }
        }
    },
    'example': {
        'inputs': {
            'geojson': {"type":"FeatureCollection","features":[{"type":"Feature","properties":{},"geometry":{"coordinates":[[[9.4001,4.1678],[9.4001,4.1562],[9.4117,4.1562],[9.4117,4.1677],[9.4001,4.1678]]],"type":"Polygon"}}]},
        }
    }
}


class GeoJsonGeometryValidatorProcessor(BaseProcessor):
    """Processor example"""

    def __init__(self, processor_def):
        """
        Initialize object
        :param processor_def: provider definition
        :returns: pygeoapi.process.geojson_geometry_validation.GeoJsonGeometryValidatorProcessor
        """
        super().__init__(processor_def, PROCESS_METADATA)
        self.supports_outputs = True

    def execute(self, data, outputs=None):
        mimetype = 'application/json'
        geojson = data.get('geojson')

        if geojson is None:
            raise ProcessorExecuteError('Cannot process without a geojson')

        try:
            is_valid = main.validate_geojson_geometry(geojson)
            outputs = {"is_valid": is_valid}
        except Exception:
            raise

        return mimetype, outputs

    def __repr__(self):
        return f'<GeoJsonGeometryValidatorProcessor> {self.name}'
Package installation and service deployment configuration
Package installation
To make the package available during the deployment of pygeoapi, the ideal approach would be to publish it to an artifact repository manager. However, since the goal of this demo is to understand pygeoapi’s plugin architecture, we will simply use pip’s ability to install a package directly from a Git repository.
The package will be installed by creating a Docker image derived from the official geopython/pygeoapi image (pinned to version 0.21.0 in the Dockerfile below), in which Git is added along with the command to install the package itself.
In the pygeoapi directory, add a Dockerfile:
Dockerfile
FROM geopython/pygeoapi:0.21.0
RUN apt-get update \
&& apt-get install -y --no-install-recommends git \
&& rm -rf /var/lib/apt/lists/*
ARG CACHEBUST=1
RUN echo $CACHEBUST \
&& python3 -m pip install --no-cache-dir git+https://github.com/OpenGeoShift/tuto_pygeoapi_ogc_processes
Building the image
Update the Makefile to include a build step.
Makefile
IMAGE_NAME = ogs-pygeoapi
TAG ?= 1.0.0
FULL_IMAGE = $(IMAGE_NAME):$(TAG)
MINIKUBE_DOCKER = eval $$(minikube docker-env)

build:
	$(MINIKUBE_DOCKER) && docker build --build-arg CACHEBUST=$$(date +%s) -t $(FULL_IMAGE) -t $(IMAGE_NAME):latest ./pygeoapi

deploy:
	kubectl apply -k .

clean:
	kubectl delete -k .
In the Minikube context, MINIKUBE_DOCKER allows you to build the image using Minikube’s Docker daemon instead of the default local Docker.
CACHEBUST is a dummy build argument whose value changes on every build (the Makefile passes the current timestamp). Because the value differs each time, Docker cannot reuse its cached layer for the package installation step, so the Git package is reinstalled every time a new build is triggered. The earlier layers, such as the Git installation, are still served from the cache.
Then, modify the deployment.yaml file so that the container uses the new image.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pygeoapi
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pygeoapi
  template:
    metadata:
      labels:
        app: pygeoapi
    spec:
      containers:
        - name: pygeoapi
          image: ogs-pygeoapi:latest # <-- new image
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
          volumeMounts:
            - name: config-volume
              mountPath: /pygeoapi/local.config.yml
              subPath: local.config.yml
      volumes:
        - name: config-volume
          configMap:
            name: pygeoapi-config
Adding the plugin through the PyGeoAPI configuration file
Edit the local.config.yml file and add the following lines at the end of the file, under the resources section:
local.config.yml
# ....
    geojson-format-validation:
        type: process
        processor:
            name: GeodataValidator.pygeoapi_process_interface.geojson_format_validation.GeoJsonFormatValidatorProcessor

    geojson-geometry-validation:
        type: process
        processor:
            name: GeodataValidator.pygeoapi_process_interface.geojson_geometry_validation.GeoJsonGeometryValidatorProcessor
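The processor name is a dotted Python path: everything before the last dot is the module, and the final segment is the class. The sketch below shows how such a path can be resolved with importlib. It only illustrates the idea and is not pygeoapi's actual loader; the helper name resolve_dotted_path is hypothetical, and a standard-library class stands in for the processor so the snippet runs without pygeoapi installed.

```python
import importlib


def resolve_dotted_path(dotted_path: str):
    """Return the object named by 'package.module.ClassName'.

    Hypothetical helper for illustration; not part of pygeoapi.
    """
    # Split on the LAST dot: left side is the module, right side the attribute
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Not a dotted path: {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in example using the standard library instead of the real processor path
ordered_dict_cls = resolve_dotted_path("collections.OrderedDict")
print(ordered_dict_cls.__name__)
```

Whatever the exact loading mechanism, the import can only succeed if the GeodataValidator package is installed in the image, which is why the Dockerfile installs it with pip.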
Deploying service
From a command prompt (on Windows, with WSL enabled), open an Ubuntu shell:
wsl -d ubuntu
Building the image
$ cd /path-to-tuto-folder/
$ make build
eval $(minikube docker-env) && docker build -t ogs-pygeoapi:1.0.0 -t ogs-pygeoapi:latest ./pygeoapi
failed to fetch metadata: fork/exec /usr/local/lib/docker/cli-plugins/docker-buildx: no such file or directory
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
Install the buildx component to build images with BuildKit:
https://docs.docker.com/go/buildx/
Sending build context to Docker daemon 12.8kB
Step 1/2 : FROM geopython/pygeoapi:latest
---> 4b4e27875671
Step 2/2 : RUN apt-get update && apt-get install -y --no-install-recommends git && rm -rf /var/lib/apt/lists/* && python3 -m pip install git+https://github.com/OpenGeoShift/tuto_pygeoapi_ogc_processes
---> Using cache
---> d4057d561b87
Successfully built d4057d561b87
Successfully tagged ogs-pygeoapi:1.0.0
Successfully tagged ogs-pygeoapi:latest
Deploying
$ make clean # <-- delete existing service
$ make deploy
kubectl apply -k .
configmap/pygeoapi-config-295hc5kh4g created
service/pygeoapi created
deployment.apps/pygeoapi created
ingress.networking.k8s.io/pygeoapi-ingress created
In the WSL context, reactivate the tunnel if needed:
$ minikube tunnel
Test service
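One way to test the deployed processes is to call the OGC API Processes execution endpoint, which accepts a POST on /processes/{process-id}/execution with the inputs wrapped in an "inputs" object. The sketch below uses only the Python standard library. The base URL http://localhost is an assumption (it depends on your ingress and tunnel setup), and the helper names build_execute_request and execute_process are hypothetical.

```python
import json
import urllib.request

# Assumption: adjust to wherever your ingress exposes pygeoapi
BASE_URL = "http://localhost"

# Same sample polygon as in the process metadata example
SAMPLE_GEOJSON = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[9.4001, 4.1678], [9.4001, 4.1562], [9.4117, 4.1562],
                             [9.4117, 4.1677], [9.4001, 4.1678]]],
        },
    }],
}


def build_execute_request(base_url: str, process_id: str, geojson: dict) -> urllib.request.Request:
    """Build the POST request for a process execution (no network access)."""
    url = f"{base_url.rstrip('/')}/processes/{process_id}/execution"
    body = json.dumps({"inputs": {"geojson": geojson}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute_process(base_url: str, process_id: str, geojson: dict) -> dict:
    """Send the request to a running deployment and return the decoded JSON body."""
    request = build_execute_request(base_url, process_id, geojson)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request is safe to run anywhere; it only prints the target URL
request = build_execute_request(BASE_URL, "geojson-format-validation", SAMPLE_GEOJSON)
print(request.get_method(), request.full_url)
```

With the service deployed and the tunnel active, calling execute_process(BASE_URL, "geojson-format-validation", SAMPLE_GEOJSON) should return a JSON document carrying the is_valid output; the same call with "geojson-geometry-validation" exercises the second process.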
Conclusion
Any Python package can be converted into an OGC Process service using PyGeoAPI’s plugin architecture. To do this, you just need to create “interfaces” that implement PyGeoAPI’s BaseProcessor class.