# Flood Risk Mapping Pipeline

Flood risk mapping powered by the Copernicus DEM: automated geospatial analysis with Docker and DVC.

A containerized Python pipeline for flood exposure mapping using open Copernicus DEM data. It identifies flood-prone areas via water-level thresholding, generates vector polygons, calculates statistics, and integrates DVC for cloud-based data versioning.
- **Automated Flood Mapping** – process elevation data to identify flood-prone areas
- **Configurable Water Levels** – set custom thresholds for different scenarios
- **Coastline Buffering** – optional coastal zone filtering for accurate analysis
- **Statistical Reports** – generate area summaries and flood extent metrics
- **Visualization** – create publication-ready flood maps
- **Docker Support** – reproducible containerized deployment
- **Cloud Storage** – DVC integration with AWS S3 for data versioning
- **CI/CD Pipeline** – automated testing and deployment with GitHub Actions
- **YAML Configuration** – easy multi-region setup with config files
The goal is to demonstrate a basic flood-modeling workflow: load elevation data, apply water-level thresholds, extract flooded regions as polygons, and compute summary statistics.
- DEM provider: Copernicus DEM GLO-30 (30 m resolution, global elevation data)
- Raw DEM files are stored in `data/raw/`
The flood mapping pipeline consists of the following steps:
- Load DEM → read elevation raster data
- Thresholding → identify cells where elevation ≤ water level
- Flood Mask → generate a binary mask of flooded areas
- Polygonization → convert the raster mask to vector polygons
- Summary Stats → calculate total flooded area (km²)
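The thresholding and summary steps above can be sketched with NumPy alone (a toy stand-in: the real pipeline reads a GeoTIFF with rasterio and polygonizes the mask with `rasterio.features`; the synthetic DEM and cell size here are illustrative):

```python
import numpy as np

# Synthetic stand-in for a DEM tile (elevations in metres).
rng = np.random.default_rng(42)
dem = rng.uniform(-2.0, 10.0, size=(100, 100))

water_level = 2.0    # metres above reference (from the YAML config)
cell_size_m = 30.0   # Copernicus GLO-30 cell size

# Thresholding: a cell floods where elevation <= water level.
flood_mask = dem <= water_level

# Summary stats: total flooded area in km².
flooded_area_km2 = flood_mask.sum() * cell_size_m**2 / 1e6
print(f"{flood_mask.sum()} flooded cells, about {flooded_area_km2:.3f} km²")
```

On real data the boolean mask would be written back as a single-band raster and handed to the polygonization step.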
- Python 3.12+
- rasterio – raster data I/O and processing
- GeoPandas – vector geometry operations
- NumPy – array manipulation
- PyYAML – configuration management
- DVC – data version control with an S3 backend
- Docker – containerized deployment
- uv – fast Python package manager
- pytest – testing framework
- GitHub Actions – CI/CD pipeline
```bash
# Clone the repository
git clone <repository-url>
cd deltares_floodriskmapping

# Install dependencies with uv
uv sync --extra dev

# Activate virtual environment
source .venv/bin/activate
```

All pipeline parameters are managed through YAML configuration files in the `configs/` directory. Edit or create config files to customize:
- Config metadata (name, description)
- Input data paths (DEM file, coastline shapefile)
- Water level threshold (meters)
- Coastline buffer distance (meters, or null to disable)
- Output file names and formats
- Visualization settings (DPI, colormaps, figure size, output filenames)
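These settings can be read with PyYAML; a minimal sketch, with an inline string standing in for a config file on disk:

```python
import yaml

# Inline stand-in for a config file; the project would open the file
# and pass its contents to yaml.safe_load.
cfg_text = """
info:
  name: delft
pipeline:
  water_level: 2.0
  coast_buffer_dist_m: 5000.0
  metric_crs: 3857
"""
cfg = yaml.safe_load(cfg_text)
water_level = cfg["pipeline"]["water_level"]  # 2.0
```

Using `yaml.safe_load` (rather than `yaml.load`) avoids executing arbitrary YAML tags from the config file.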
Example configuration:

```yaml
# Config identification
info:
  name: "delft"
  description: "Flood risk mapping for Delft"

# Data directories
data:
  raw_dir: "data/raw"
  inter_dir: "data/inter"  # Intermediate files (e.g., coastline buffers)
  processed_dir: "data/processed"
  dem_file: "dem_delft.tif"
  coastline_file: "ne_10m_coastline/ne_10m_coastline.shp"

# Pipeline parameters
pipeline:
  water_level: 2.0             # meters above reference
  coast_buffer_dist_m: 5000.0  # buffer distance in meters
  metric_crs: 3857             # EPSG code for area calculations

# Output files
output:
  flood_mask_raster: "flood_mask_delft.tif"
  flood_polygons_vector: "flood_polygons_delft.gpkg"
  summary_report: "flood_summary_delft.txt"

# Visualization
visualization:
  flood_map_output: "flood_map_delft.png"
  debug_layers_output: "debug_layers_delft.png"
  dpi: 200
  figsize: [8, 6]
```

```bash
# Run with a specific config
python -m src.pipeline -c configs/config_delft.yaml

# Run and track/push outputs to DVC S3 (default behavior)
python -m src.pipeline -c configs/config_delft.yaml

# Skip DVC tracking/pushing
python -m src.pipeline -c configs/config_delft.yaml --no-push-data
```

This will:
- Load the DEM from the path specified in the config
- Generate a flood mask at the configured water level
- Apply the coastline buffer (if enabled) and save it to `data/inter/`
- Save results to `data/processed/`:
  - Flood mask raster (`.tif`)
  - Flood polygons vector (`.gpkg`)
  - Summary report (`.txt`)
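The optional coastline buffer step amounts to buffering the coastline geometry in a metric CRS; a toy sketch with shapely (the coordinates here are made up, and the real pipeline buffers the Natural Earth shapefile via GeoPandas):

```python
from shapely.geometry import LineString, Point

# Toy coastline in a metric CRS (coordinates in metres).
coast = LineString([(0.0, 0.0), (10_000.0, 0.0)])
buffer_dist_m = 5_000.0  # corresponds to coast_buffer_dist_m in the config

# Everything within 5 km of the coastline is kept for flood analysis.
coastal_zone = coast.buffer(buffer_dist_m)

print(coastal_zone.contains(Point(5_000.0, 2_000.0)))  # inside the buffer
```

Buffering must happen in a projected (metric) CRS; buffering lat/lon degrees directly would give a distorted zone, which is why the config carries a `metric_crs` EPSG code.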
With a custom configuration file:

```bash
python -m src.viz --config configs/my_config.yaml
```

Generates flood visualization maps using settings from the config file. Output filenames are specified in the config's `visualization` section.
Run the test suite:

```bash
pytest tests/
```

Both scripts support command-line arguments:
```bash
# Show help
python -m src.pipeline --help
python -m src.viz --help

# Use a custom config
python -m src.pipeline -c configs/config_delft.yaml
python -m src.viz -c configs/config_nice.yaml
```

Managing multiple configurations:
- Store different configs in the `configs/` directory
- Each config can have unique settings for different regions or scenarios
- Output files are automatically named based on the `info.name` field
- This prevents output conflicts when running multiple configurations
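The naming convention can be expressed as a small helper (illustrative; `output_names` is not necessarily how the project implements it, but the filename patterns match the example config):

```python
def output_names(config_name: str) -> dict[str, str]:
    """Derive output filenames from a config's info.name field."""
    return {
        "flood_mask_raster": f"flood_mask_{config_name}.tif",
        "flood_polygons_vector": f"flood_polygons_{config_name}.gpkg",
        "summary_report": f"flood_summary_{config_name}.txt",
    }

print(output_names("delft")["flood_mask_raster"])  # flood_mask_delft.tif
```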
Large data files (DEMs, flood outputs) are managed with DVC (Data Version Control) and stored on AWS S3. This keeps the Git repository lightweight while enabling version control and team collaboration for datasets.
```bash
# Pull data from S3 (first time, or to sync)
dvc pull

# Run pipeline (automatically tracks and pushes results to S3)
python -m src.pipeline -c configs/config_delft.yaml

# Run without DVC tracking/pushing
python -m src.pipeline -c configs/config_delft.yaml --no-push-data

# Manually track and push new files
dvc add data/processed/new_output.tif
dvc push
git add data/processed/new_output.tif.dvc
git commit -m "Add new output"
```
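One plausible way the pipeline's automatic tracking could be implemented is by shelling out to the DVC CLI; a sketch under that assumption (`dvc_track_and_push` is illustrative, not the project's actual code):

```python
import subprocess

def dvc_track_and_push(paths: list[str], push: bool = True) -> None:
    """Track output files with DVC and push them to the configured remote.

    push=False mirrors the pipeline's --no-push-data flag: outputs are
    written to disk but neither tracked nor uploaded.
    """
    if not push or not paths:
        return
    subprocess.run(["dvc", "add", *paths], check=True)
    subprocess.run(["dvc", "push"], check=True)

# Example (would require a DVC-initialized repo):
# dvc_track_and_push(["data/processed/flood_mask_delft.tif"])
```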
Setup:

1. Install DVC with S3 support:

   ```bash
   uv sync --extra s3
   ```

2. Configure AWS credentials. Create a `.env` file in the project root:

   ```bash
   cp .env.example .env
   ```

   Edit `.env` with your credentials:

   ```bash
   AWS_ACCESS_KEY_ID=your_access_key
   AWS_SECRET_ACCESS_KEY=your_secret_key
   AWS_DEFAULT_REGION=eu-north-1
   ```

   Or use the AWS CLI:

   ```bash
   aws configure
   ```

3. Pull data from S3:

   ```bash
   dvc pull
   ```
For detailed setup, costs, troubleshooting, and best practices, see `docs/cloud.md`.
Run the pipeline in a container without installing dependencies locally:
```bash
# Build image
docker compose build floodmap

# Run with default config
docker compose up floodmap

# Run with a different config
docker compose run --rm floodmap python -m src.pipeline -c /app/configs/config_nice.yaml

# Skip DVC tracking/pushing
docker compose run --rm floodmap python -m src.pipeline -c /app/configs/config_delft.yaml --no-push-data
```

Prerequisites for DVC/S3 support:

- Create a `.env` file with AWS credentials (see the Cloud Data Storage section)
- DVC and Git directories are automatically mounted
For detailed Docker setup, deployment options, and troubleshooting, see `docs/docker.md`.
```
.
├── configs/                  # Configuration files for different regions
│   ├── config_delft.yaml
│   └── config_nice.yaml
├── data/
│   ├── raw/                  # Input DEM files and coastline shapefiles
│   ├── inter/                # Intermediate files (coastline buffer masks)
│   └── processed/            # Output flood masks, polygons, and visualizations
├── notebooks/                # Jupyter notebooks for exploration
├── src/
│   ├── config.py             # Configuration loader
│   ├── pipeline.py           # Main flood mapping pipeline
│   ├── load_data.py          # Data loading utilities
│   ├── coastline_buffer.py   # Coastline processing
│   └── viz.py                # Visualization scripts
├── tests/                    # Unit tests
└── pyproject.toml            # Project dependencies
```
- Cloud Storage Setup (AWS S3/DVC)
- Docker Deployment Guide
- CI/CD Pipeline
Contributions are welcome! Please feel free to submit a Pull Request.
This is a toy project for educational purposes.
Built for geospatial flood risk analysis