A customized CKAN data catalog for the Climate-ecological Observatory for Arctic Tundra (COAT) project, developed by NINA (Norwegian Institute for Nature Research).
This repository contains a complete, production-ready deployment of CKAN tailored for managing ecological and climate monitoring data from Arctic tundra research. The portal enables researchers to publish, discover, and share datasets with full DOI citation support and spatial search capabilities.
- Multi-type Dataset Management - Supports datasets, state variables, protocols, and data management plans
- Dataset Versioning - Track and search different versions of datasets
- Spatial Search - Find datasets by geographic location and extent
- DOI Integration - Automatic DOI minting via DataCite for published datasets
- OAuth2 Authentication - Federated authentication via Dataporten
- Bulk Download - Download multiple resources as a single ZIP archive
- CSW Support - PyCSW integration for OGC Catalogue Service for the Web
- DCAT Metadata - Export COAT-compliant DCAT metadata for interoperability
flowchart TB
traefik["traefik"] --> ckan["CKAN"]
traefik --> pycsw["pycsw"]
traefik --> bulk["bulk-download"]
pycsw --> ckan
bulk --> ckan
ckan --> db[("PostgreSQL")]
ckan --> solr[("Solr")]
ckan --> redis[("Redis")]
ckan --> storage[("File Storage")]
| Extension | Purpose |
|---|---|
| ckanext-coat | Core COAT functionality (versioning, resource protection, naming) |
| ckanext-coatcustom | COAT-specific schemas, spatial search, DOI citations |
| ckanext-scheming | Customizable dataset schemas |
| ckanext-spatial | Spatial metadata and search |
| ckanext-doi | DOI minting and DataCite integration |
| ckanext-oauth2 | OAuth2 authentication (Dataporten) |
| ckanext-datasetversions | Dataset version management |
Operating System: GNU/Linux (tested on Debian-based distributions)
Dependencies:
- Docker 20.10+ with Compose V2
- Python 3.8+ (for scripts only)
- Git with submodule support
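The dependency list above can be checked before building. The following is a minimal sketch, not part of the repository: the function names `version_ge` and `check_prereqs` are hypothetical, and the version comparison assumes GNU `sort -V`, which is available on the Debian-based systems mentioned above.

```shell
#!/usr/bin/env bash
# Sketch: verify the listed prerequisites (Docker 20.10+, Compose V2, Python 3.8+, Git).

# Succeed when version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print one line per problem found; return non-zero if any check fails.
check_prereqs() {
    local status=0 tool docker_version python_version
    for tool in docker git python3; do
        command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; status=1; }
    done
    # Compose V2 ships as the "docker compose" subcommand.
    docker compose version >/dev/null 2>&1 || { echo "missing: docker compose (V2)"; status=1; }
    # The client version is enough here; ignore errors from an unreachable daemon.
    docker_version=$(docker version --format '{{.Client.Version}}' 2>/dev/null) || true
    version_ge "${docker_version:-0}" 20.10 || { echo "docker older than 20.10"; status=1; }
    python_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null) || true
    version_ge "${python_version:-0}" 3.8 || { echo "python3 older than 3.8"; status=1; }
    return "$status"
}

# Usage: check_prereqs && echo "all prerequisites found"
```

The checks only report problems; they do not install or change anything.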
# Clone the repository with submodules
git clone --recursive https://gitlab.com/nina-data/nina-ckan-coat.git
cd nina-ckan-coat
# Or if already cloned, fetch submodules
git submodule update --init --recursive
# Copy and configure environment
cp template.env .env
# Edit .env - at minimum set DOI_* test variables
# Build and run in development mode
docker compose --profile dev build
docker compose --profile dev run --rm ckan-dev

The development server supports:
- Hot reloading for CKAN extensions (via volume mounts)
- Interactive debugging with pdb
- DOI test mode enabled by default
- A fixed sysadmin user created automatically on startup when CKAN_ADMIN_PASSWORD is set in .env:
  - Username: admin
  - Password: value of CKAN_ADMIN_PASSWORD (default: administrator)
# Configure environment
cp template.env .env
# Edit .env and set:
# - DOI_PREFIX, DOI_ACCOUNT_NAME, DOI_ACCOUNT_PWD
# - CKAN_OAUTH2_* variables for authentication
# - CKAN_SITE_URL for your domain
# Build and start
docker compose --profile prod build
docker compose --profile prod up -d

| Variable | Description | Required |
|---|---|---|
| CKAN_SITE_URL | Public URL of the CKAN instance | Yes |
| CKAN_PORT | Port to expose CKAN (default: 5000) | No |
| POSTGRES_PASSWORD | PostgreSQL admin password | Yes |
| DOI_PREFIX | DataCite DOI prefix | Yes (prod) |
| DOI_ACCOUNT_NAME | DataCite account name | Yes (prod) |
| DOI_ACCOUNT_PWD | DataCite account password | Yes (prod) |
| DOI_TEST_MODE | Enable DOI test mode | No (default: true) |
| CKAN_OAUTH2_CLIENT_ID | OAuth2 client ID | Yes (prod) |
| CKAN_OAUTH2_CLIENT_SECRET | OAuth2 client secret | Yes (prod) |
| CKAN_MAX_UPLOAD_SIZE_MB | Max upload size (default: 1000) | No |
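Before starting the production profile, it can help to confirm that every variable marked as required in the table is actually set in .env. The sketch below is illustrative only: `missing_vars` is a hypothetical helper, and it assumes simple `NAME=value` lines (optionally quoted), not the full dotenv syntax.

```shell
#!/usr/bin/env bash
# Sketch: report required variables that are absent or empty in an env file.

# Usage: missing_vars FILE NAME...
# Prints each NAME that has no non-empty value in FILE, one per line.
missing_vars() {
    local env_file=$1 name
    shift
    for name in "$@"; do
        # Require NAME= followed by an optional quote and at least one value character.
        # Commented-out lines (starting with '#') do not match the ^ anchor.
        if ! grep -Eq "^${name}=[\"']?[^\"'[:space:]]" "$env_file"; then
            echo "$name"
        fi
    done
}

# Variables marked "Yes" or "Yes (prod)" in the table above.
required_prod="CKAN_SITE_URL POSTGRES_PASSWORD DOI_PREFIX DOI_ACCOUNT_NAME DOI_ACCOUNT_PWD CKAN_OAUTH2_CLIENT_ID CKAN_OAUTH2_CLIENT_SECRET"

if [ -f .env ]; then
    # Word splitting of $required_prod is intentional here.
    missing=$(missing_vars .env $required_prod)
    [ -z "$missing" ] || echo "Missing or empty in .env: $missing" >&2
fi
```

Running the check before `docker compose --profile prod up -d` surfaces a forgotten credential early rather than at first use.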
There are two test suites, both run inside Docker against a live ckan-test instance:
| Suite | File | Description |
|---|---|---|
| API integration | tests/test_api.py | CKAN API tests (package lifecycle, versioning, embargo, …) |
| E2E browser | tests/base.py | SeleniumBase browser automation |
docker compose --profile test build
docker compose --profile test run --rm test-api
docker compose --profile test run --rm test

Note: running docker compose --profile test down -v between runs clears the database for a clean state.
The ckan-test instance also creates the fixed sysadmin on startup (when CKAN_ADMIN_PASSWORD is set), so you can inspect test data in the web portal at http://127.0.0.1:5000 while tests are running.
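When inspecting the test instance by hand, a small readiness check avoids opening the portal before CKAN has finished starting. This is a sketch under assumptions: `wait_for_url` is a hypothetical helper, it assumes `curl` is installed on the host, and it polls CKAN's standard `status_show` action at the address given above.

```shell
#!/usr/bin/env bash
# Sketch: poll a URL until it answers successfully or the attempts run out.

# Usage: wait_for_url URL [TRIES] [DELAY_SECONDS]
# Returns 0 as soon as the URL responds with a successful HTTP status, 1 otherwise.
wait_for_url() {
    local url=$1 tries=${2:-30} delay=${3:-2} i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
        if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (assumes the ckan-test instance is exposed on 127.0.0.1:5000):
# wait_for_url http://127.0.0.1:5000/api/3/action/status_show && echo "CKAN is up"
```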
# Create a sysadmin user
docker compose exec ckan ckan -c /etc/ckan/production.ini sysadmin add USERNAME
# Rebuild search index
docker compose exec ckan ckan -c /etc/ckan/production.ini search-index rebuild
# Access PostgreSQL
docker compose exec db psql -U ckan
# View CKAN logs
docker compose logs -f ckan
# Access Solr admin
# Open http://localhost:8983/solr/ in browser

├── ckanext/                    # CKAN extensions (git submodules)
│   ├── ckanext-coat/           # Core COAT extension
│   ├── ckanext-coatcustom/     # COAT customizations and schemas
│   └── ckanext-*/              # Other extensions
├── custom/                     # Custom entrypoints and requirements
│   ├── coat-entrypoint.sh      # Production entrypoint
│   └── coat-entrypoint-dev.sh
├── services/                   # Supporting services
│   ├── bulk-download/          # ZIP download service
│   └── pycsw/                  # OGC CSW service
├── scripts/                    # Utility scripts
├── tests/                      # Integration tests
├── docker-compose.yml          # Service orchestration
├── Dockerfile                  # CKAN image build
└── template.env                # Environment template
The CKAN extensions are mounted as volumes in development mode, allowing live code changes:
# Edit extension code
vim ckanext/ckanext-coat/ckanext/coat/plugin.py
# Changes are reflected immediately (may need page refresh)

The CKAN ecosystem lacks a well-maintained OAuth2 extension. The history of our fork:
- conwetlab/ckanext-oauth2: the original extension (unmaintained)
- FedericOldani submitted a Flask/Python 3 conversion (conwetlab#42), never merged
- We (COATnor) forked conwetlab, cherry-picked FedericOldani's work, added Feide /userinfo endpoint support and fixed setup.py for out-of-tree builds
- In-For-Disaster-Analytics forked COATnor and added CKAN 2.11 support
- We merged the changes back and improved pyproject.toml
We are now collaborating on joint maintenance (issue #25) and submitted the dependency cleanup upstream (PR #26).
For ckanext-scheming, we use a COATnor fork with one extra commit on top of upstream v3.1.0: it skips schema validation when the __parent flag is set during dataset creation. This is needed because COAT's versioning system creates lightweight parent datasets that don't conform to the strict custom schemas. See ckan/ckanext-scheming#331 for the upstream discussion.
For ckanext-spatial, we maintain a COATnor fork that adds proper dependency declarations to pyproject.toml. The change has been submitted upstream as ckan/ckanext-spatial#352.
aptivate/ckanext-datasetversions is mostly unmaintained (see issue #17). We maintain a NINAnor fork with the following non-upstreamed changes:
- Flask blueprint migration (from Pylons)
- __parent parameter for parent dataset creation
- Create parent datasets with the same type as the child
- Move versions list to a reusable snippet
- Support for private dataset versions
- CKAN 2.10 compatibility
These should be submitted as PRs to aptivate (we have collaborator access).
CKAN itself does not declare runtime dependencies in pyproject.toml (only setuptools in install_requires), requiring a dummy package workaround to feed its pinned requirements into the resolver. We commented on ckan/ckan#8382 proposing to add dependencies to pyproject.toml and use uv export for pinning.