
Flood Risk Mapping


🌊 Flood risk mapping pipeline powered by Copernicus DEM | Automated geospatial analysis with Docker & DVC

A containerized Python pipeline for flood exposure mapping using open Copernicus DEM data. Identifies flood-prone areas via water-level thresholding, generates vector polygons, calculates statistics, and integrates DVC for cloud-based data versioning.

✨ Features

  • πŸ—ΊοΈ Automated Flood Mapping – Process elevation data to identify flood-prone areas
  • 🌊 Configurable Water Levels – Set custom thresholds for different scenarios
  • πŸ–οΈ Coastline Buffering – Optional coastal zone filtering for accurate analysis
  • πŸ“Š Statistical Reports – Generate area summaries and flood extent metrics
  • 🎨 Visualization – Create publication-ready flood maps
  • 🐳 Docker Support – Reproducible containerized deployment
  • ☁️ Cloud Storage – DVC integration with AWS S3 for data versioning
  • βš™οΈ CI/CD Pipeline – Automated testing and deployment with GitHub Actions
  • πŸ“ YAML Configuration – Easy multi-region setup with config files

Goal

Demonstrate a basic flood modeling workflow: load elevation data, apply water-level thresholds, extract flooded regions as polygons, and compute summary statistics.

Data Sources

  • DEM Provider: Copernicus DEM GLO-30 (30m resolution global elevation data)
  • Raw DEM files are stored in data/raw/

Pipeline

The flood mapping pipeline consists of the following steps:

  1. Load DEM – Read elevation raster data
  2. Thresholding – Identify cells where elevation ≤ water level
  3. Flood Mask – Generate binary mask of flooded areas
  4. Polygonization – Convert raster mask to vector polygons
  5. Summary Stats – Calculate total flooded area (km²)

Technology Stack

  • Python 3.12+
  • rasterio – Raster data I/O and processing
  • GeoPandas – Vector geometry operations
  • NumPy – Array manipulation
  • PyYAML – Configuration management
  • DVC – Data version control with S3 backend
  • Docker – Containerized deployment
  • uv – Fast Python package manager
  • pytest – Testing framework
  • GitHub Actions – CI/CD pipeline

Installation

# Clone the repository
git clone <repository-url>
cd deltares_floodriskmapping

# Install dependencies with uv
uv sync --extra dev

# Activate virtual environment
source .venv/bin/activate

How to Run

Configuration

All pipeline parameters are managed through YAML configuration files in the configs/ directory. Edit or create config files to customize:

  • Config metadata (name, description)
  • Input data paths (DEM file, coastline shapefile)
  • Water level threshold (meters)
  • Coastline buffer distance (meters, or null to disable)
  • Output file names and formats
  • Visualization settings (DPI, colormaps, figure size, output filenames)

Example configuration:

# Config identification
info:
  name: "delft"
  description: "Flood risk mapping for Delft"

# Data directories
data:
  raw_dir: "data/raw"
  inter_dir: "data/inter"        # Intermediate files (e.g., coastline buffers)
  processed_dir: "data/processed"
  dem_file: "dem_delft.tif"
  coastline_file: "ne_10m_coastline/ne_10m_coastline.shp"

# Pipeline parameters
pipeline:
  water_level: 2.0  # meters above reference
  coast_buffer_dist_m: 5000.0  # buffer distance in meters
  metric_crs: 3857  # EPSG code for area calculations

# Output files
output:
  flood_mask_raster: "flood_mask_delft.tif"
  flood_polygons_vector: "flood_polygons_delft.gpkg"
  summary_report: "flood_summary_delft.txt"

# Visualization
visualization:
  flood_map_output: "flood_map_delft.png"
  debug_layers_output: "debug_layers_delft.png"
  dpi: 200
  figsize: [8, 6]
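
As a rough illustration of how such a file can be consumed with PyYAML (the repository's src/config.py loader may differ; the config is parsed from an inline string here so the snippet is self-contained):

```python
import yaml

raw_config = """
info:
  name: "delft"
data:
  raw_dir: "data/raw"
  dem_file: "dem_delft.tif"
pipeline:
  water_level: 2.0
  metric_crs: 3857
"""
cfg = yaml.safe_load(raw_config)  # nested dicts mirroring the YAML structure

water_level = cfg["pipeline"]["water_level"]
dem_path = f"{cfg['data']['raw_dir']}/{cfg['data']['dem_file']}"
print(water_level, dem_path)  # 2.0 data/raw/dem_delft.tif
```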

Run the flood mapping pipeline:

# Run with specific config
python -m src.pipeline -c configs/config_delft.yaml

# Run and track/push outputs to DVC S3 (default behavior)
python -m src.pipeline -c configs/config_delft.yaml

# Skip DVC tracking/pushing
python -m src.pipeline -c configs/config_delft.yaml --no-push-data

This will:

  • Load the DEM from the path specified in config
  • Generate flood mask at the configured water level
  • Apply coastline buffer (if enabled) and save to data/inter/
  • Save results to data/processed/:
    • Flood mask raster (.tif)
    • Flood polygons vector (.gpkg)
    • Summary report (.txt)

Run the visualization:

python -m src.viz -c configs/config_delft.yaml

With a custom configuration file:

python -m src.viz --config configs/my_config.yaml

Generates flood visualization maps using settings from the config file. Output filenames are specified in the config's visualization section.
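
The optional coastline buffer mentioned above can be sketched with GeoPandas on a toy coastline in a metric CRS. The helper in src/coastline_buffer.py may work differently; names here are illustrative:

```python
import geopandas as gpd
from shapely.geometry import LineString

# Toy 10 km coastline in a metric CRS so buffer distances are in meters
coast = gpd.GeoDataFrame(
    geometry=[LineString([(0, 0), (10_000, 0)])], crs="EPSG:3857"
)

# Matches coast_buffer_dist_m in the example config
coastal_zone = coast.buffer(5000.0)  # GeoSeries of buffer polygons

# Flood polygons would then be clipped to this zone before the summary stats,
# e.g. gpd.clip(flood_gdf, coastal_zone)  (flood_gdf is hypothetical here)
print(f"{coastal_zone.area.iloc[0] / 1e6:.0f} km² coastal zone")
```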

Run tests:

pytest tests/
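
A unit test for the thresholding step might look like this (illustrative only; the actual contents of tests/ are not shown here):

```python
import numpy as np

def make_flood_mask(dem: np.ndarray, water_level: float) -> np.ndarray:
    """Binary mask: 1 where elevation <= water_level, else 0."""
    return (dem <= water_level).astype(np.uint8)

def test_flood_mask_thresholding():
    dem = np.array([[0.0, 2.0], [2.1, 5.0]])
    mask = make_flood_mask(dem, water_level=2.0)
    # Cells at exactly the water level count as flooded
    assert mask.tolist() == [[1, 1], [0, 0]]
```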

CLI Options

Both scripts support command-line arguments:

# Show help
python -m src.pipeline --help
python -m src.viz --help

# Use custom config
python -m src.pipeline -c configs/config_delft.yaml
python -m src.viz -c configs/config_nice.yaml
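
Both entry points share this small CLI surface; a comparable argparse setup (an assumption for illustration, not the repository's actual code) could look like:

```python
import argparse

parser = argparse.ArgumentParser(description="Flood risk mapping pipeline")
parser.add_argument("-c", "--config", default="configs/config_delft.yaml",
                    help="path to a YAML config file")
parser.add_argument("--no-push-data", action="store_true",
                    help="skip DVC tracking and pushing of outputs")

# Parsed from a list here for illustration; a real script would use sys.argv
args = parser.parse_args(["-c", "configs/config_nice.yaml"])
print(args.config, args.no_push_data)  # configs/config_nice.yaml False
```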

Managing Multiple Configurations:

  • Store different configs in the configs/ directory
  • Each config can have unique settings for different regions or scenarios
  • Output files are automatically named based on the info.name field
  • This prevents output conflicts when running multiple configurations

Cloud Data Storage

Large data files (DEMs, flood outputs) are managed with DVC (Data Version Control) and stored on AWS S3. This keeps the Git repository lightweight while enabling version control and team collaboration for datasets.

Quick Start with DVC

# Pull data from S3 (first time or to sync)
dvc pull

# Run pipeline (automatically tracks and pushes results to S3)
python -m src.pipeline -c configs/config_delft.yaml

# Run without DVC tracking/pushing
python -m src.pipeline -c configs/config_delft.yaml --no-push-data

# Manually track and push new files
dvc add data/processed/new_output.tif
dvc push
git add data/processed/new_output.tif.dvc
git commit -m "Add new output"

Setup (One-Time)

  1. Install DVC with S3 support:

    uv sync --extra s3
  2. Configure AWS credentials:

    Create a .env file in the project root:

    cp .env.example .env

    Edit .env with your credentials:

    AWS_ACCESS_KEY_ID=your_access_key
    AWS_SECRET_ACCESS_KEY=your_secret_key
    AWS_DEFAULT_REGION=eu-north-1

    Or use AWS CLI:

    aws configure
  3. Pull data from S3:

    dvc pull

For detailed setup, costs, troubleshooting, and best practices, see docs/cloud.md.

Docker Deployment

Run the pipeline in a container without installing dependencies locally:

# Build image
docker compose build floodmap

# Run with default config
docker compose up floodmap

# Run with different config
docker compose run --rm floodmap python -m src.pipeline -c /app/configs/config_nice.yaml

# Skip DVC tracking/pushing
docker compose run --rm floodmap python -m src.pipeline -c /app/configs/config_delft.yaml --no-push-data

Prerequisites for DVC/S3 support:

  1. Create .env file with AWS credentials (see Cloud Data Storage section)
  2. DVC and Git directories are automatically mounted

For detailed Docker setup, deployment options, and troubleshooting, see docs/docker.md.

Project Structure

.
├── configs/              # Configuration files for different regions
│   ├── config_delft.yaml
│   └── config_nice.yaml
├── data/
│   ├── raw/              # Input DEM files and coastline shapefiles
│   ├── inter/            # Intermediate files (coastline buffer masks)
│   └── processed/        # Output flood masks, polygons, and visualizations
├── notebooks/            # Jupyter notebooks for exploration
├── src/
│   ├── config.py         # Configuration loader
│   ├── pipeline.py       # Main flood mapping pipeline
│   ├── load_data.py      # Data loading utilities
│   ├── coastline_buffer.py  # Coastline processing
│   └── viz.py            # Visualization scripts
├── tests/                # Unit tests
└── pyproject.toml        # Project dependencies

Documentation

  • docs/cloud.md – DVC/S3 setup, costs, troubleshooting, and best practices
  • docs/docker.md – Docker setup, deployment options, and troubleshooting

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This is a toy project for educational purposes.


Built with 🌊 for geospatial flood risk analysis
