This project builds a reproducible workflow for mapping grassland production intensity (GPI) from Sentinel-2 imagery and field-calibrated environmental measures. The model is calibrated with 2025 field surveys, in which sampled zones were assigned observer-based management classes and measured for soil resistance, soil moisture, vegetation height, and plant richness. These field measurements are used to derive a three-class training dataset with k-nearest neighbors (KNN). A random forest classifier is then trained on pixels extracted from the labeled sampled-zone polygons and applied to the full raster stack to produce a pixel-level GPI map.
Once calibrated, the saved model can be applied to other years without collecting new field data, provided that matching Sentinel-2 predictor rasters are available for the target year.

- Check `config.R` for the calibration year, prediction year, image dates, class order, and expected predictor bands.
- To rebuild the calibration model and make the configured prediction map, run `source("scripts/run_calibration_and_prediction.R")`.
- To apply the existing calibration model to a new prediction year, update `prediction_year` and `prediction_image_date`, add those rasters, then run `source("scripts/run_prediction_only.R")`.

The project uses two spatial units:
- `zone`: sampled training polygons identified by `polygon_id`
- `field`: the reporting layer used to summarize the final pixel map

Sampled zones carry field measurements and observer labels. Those labels are converted into a weighted KNN-derived GPI target. The labeled zones are then used as training masks, so all predictor pixels inside each zone inherit the zone label. The trained random forest is applied pixel by pixel to the prediction-year raster stack, and the resulting raster is summarized back to fields for polygon-based reporting.
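
The step that turns observer labels into a KNN-derived target can be sketched in base R as follows. This is only an illustration under stated assumptions, not the project's script: the toy data, the class names `low`/`medium`/`high`, the inverse-distance weighting, and the helper `loo_weighted_knn()` are all hypothetical.

```r
# Hypothetical sketch: leave-one-out, distance-weighted KNN over standardized
# field variables. Data, class names, and weighting scheme are assumptions.
loo_weighted_knn <- function(x, labels, k = 3) {
  z <- scale(x)                 # standardize each field variable
  d <- as.matrix(dist(z))       # pairwise Euclidean distances between zones
  diag(d) <- Inf                # leave-one-out: a zone never votes for itself
  pred <- vapply(seq_len(nrow(z)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]        # the k nearest other zones
    w <- 1 / (d[i, nn] + 1e-9)             # inverse-distance weights
    votes <- tapply(w, labels[nn], sum)    # weighted vote per class
    votes[is.na(votes)] <- 0               # classes with no neighbors get 0
    names(votes)[which.max(votes)]
  }, character(1))
  factor(pred, levels = levels(labels), ordered = is.ordered(labels))
}

# Toy sampled zones with two field variables and an observer label.
zones <- data.frame(
  polygon_id      = 1:9,
  soil_resistance = c(1.0, 1.1, 0.9, 2.0, 2.1, 1.9, 3.0, 3.1, 2.9),
  veg_height      = c(10, 11, 9, 20, 21, 19, 30, 31, 29),
  observer_label  = factor(rep(c("low", "medium", "high"), each = 3),
                           levels = c("low", "medium", "high"), ordered = TRUE)
)

zones$gpi_knn <- loo_weighted_knn(
  zones[, c("soil_resistance", "veg_height")],
  zones$observer_label,
  k = 2
)

# Cross-tabulate observer labels against the KNN-derived target.
print(table(observer = zones$observer_label, knn = zones$gpi_knn))
```

Excluding each zone from its own neighborhood means its derived class reflects how similar zones were labeled rather than its own label.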

| Step | Script | Role | Main outputs |
|---|---|---|---|
| 00 | `scripts/00_export_remote_sensing_from_gee.js` | Documents the raster export used by the R workflow. Run only when rasters need to be rebuilt. | `<band>_<date>_mosaic.tif` |
| 01 | `scripts/01_build_anchor_training_data.R` | Joins sampled-zone geometry, field observations, plant richness, and calibration raster summaries into the anchor training table. | `anchor_zone_training_data_<calibration_year>.csv` |
| 02 | `scripts/02_validate_environmental_relationships.R` | Checks whether `s2rep` has interpretable relationships with measured ecological variables. | validation summary CSV and figure |
| 03 | `scripts/03_define_candidate_gpi_classes.R` | Defines the supervised target with leave-one-out KNN over standardized field variables. | `candidate_gpi_training_data_<calibration_year>.csv`, KNN summary, boxplot |
| 04 | `scripts/04_train_gpi_classifier.R` | Extracts labeled pixels from sampled zones, tunes and fits the random forest with grouped cross-validation, and saves diagnostics plus the selected model. | model RDS and validation tables |
| 05 | `scripts/05_build_prediction_stack.R` | Builds the prediction-year predictor raster stack used directly by the pixel classifier. | `pixel_predictor_stack_<prediction_year>.tif` |
| 06 | `scripts/06_predict_pixel_map.R` | Applies the saved calibration model pixel by pixel, summarizes the classified raster back to fields, and writes the final map products. | `pixel_gpi_map_<prediction_year>.tif`, `field_gpi_pixel_summary_<prediction_year>.csv`, `field_gpi_map_<prediction_year>.gpkg`, PNG previews |

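
Step 04 relies on grouped cross-validation. Because every pixel in a zone inherits the same label, splitting pixels at random would place near-duplicate pixels from one zone in both training and test folds. A minimal base-R sketch of fold assignment by `polygon_id` is shown below; the helper `make_group_folds()`, the toy pixel table, and the number of folds are assumptions rather than the project's code.

```r
# Hypothetical sketch: assign cross-validation folds by zone so that all
# pixels from one polygon_id land in the same fold.
make_group_folds <- function(group, n_folds = 3, seed = 1) {
  set.seed(seed)
  ids <- unique(as.character(group))
  # Spread zones as evenly as possible across folds, in random order.
  fold_of_id <- sample(rep(seq_len(n_folds), length.out = length(ids)))
  names(fold_of_id) <- ids
  unname(fold_of_id[as.character(group)])
}

# Toy pixel table: six zones with different numbers of pixels.
pixels <- data.frame(
  polygon_id = rep(c("A", "B", "C", "D", "E", "F"), times = c(4, 3, 5, 2, 4, 3)),
  ndvi       = seq(0.2, 0.8, length.out = 21)
)
pixels$fold <- make_group_folds(pixels$polygon_id, n_folds = 3)

# Each zone should appear in exactly one fold.
folds_per_zone <- tapply(pixels$fold, pixels$polygon_id,
                         function(f) length(unique(f)))
stopifnot(all(folds_per_zone == 1))
```

The `caret` package provides `groupKFold()` for the same purpose; whether the project's script uses it is not stated here.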

    GPI_Project/
    ├── config.R
    ├── README.md
    ├── data/
    │   ├── raw/                # sampled field inputs
    │   ├── spatial/            # sampled-zone and full-field geometry
    │   └── processed/
    │       ├── models/         # fitted random forest model
    │       ├── predictions/    # prediction stacks and field summaries
    │       ├── rasters/        # Sentinel-2 predictor stack
    │       ├── spatial/        # pixel maps and field geopackages
    │       ├── training/       # anchor and candidate training tables
    │       └── validation/     # diagnostics and model evaluation tables
    ├── figures/                # validation and map preview figures
    └── scripts/

- `data/raw/environmental_field_data_<calibration_year>.csv`: sampled-zone field measurements such as soil moisture, soil resistance, vegetation height, and the observer label.
- `data/raw/plant_diversity_plots_<calibration_year>.csv`: plot-level plant richness observations that are summarized to `polygon_id`.
- `data/spatial/sampled_zone_geometry.gpkg`: the sampled training polygons used for raster extraction and joins to field observations.
- `data/spatial/field_geometry.gpkg`: the geometry of all fields in the study area.

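
Summarizing plot-level richness to `polygon_id` and joining it to the zone-level measurements could look roughly like this. The sketch uses base R and toy data. The column names other than `polygon_id`, and the choice of the mean as the summary statistic, are assumptions.

```r
# Hypothetical toy inputs; only polygon_id is a column name from the project.
plots <- data.frame(
  polygon_id = c(1, 1, 2, 2, 2, 3),
  richness   = c(12, 14, 8, 10, 9, 20)
)
field_obs <- data.frame(
  polygon_id    = c(1, 2, 3),
  soil_moisture = c(31.5, 42.0, 25.3),
  veg_height    = c(18, 35, 12)
)

# Collapse plot-level richness to one row per zone (mean is an assumption).
richness_by_zone <- aggregate(richness ~ polygon_id, data = plots, FUN = mean)
names(richness_by_zone)[2] <- "mean_richness"

# Keep every sampled zone, even if it has no richness plots.
anchor <- merge(field_obs, richness_by_zone, by = "polygon_id", all.x = TRUE)
print(anchor)
```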
`data/processed/rasters/` stores the predictor stack expected by calibration and prediction:

- `s2rep_<date>_mosaic.tif`
- `ndvi_<date>_mosaic.tif`
- `ndwi_<date>_mosaic.tif`
- `savi_<date>_mosaic.tif`
- `evi_<date>_mosaic.tif`
- `msi_<date>_mosaic.tif`
- `ndmi_<date>_mosaic.tif`
- `mndwi_<date>_mosaic.tif`

The calibration rasters use `calibration_image_date`. Annual maps use `prediction_image_date`. Keep the band names and index formulas consistent between calibration and prediction years.
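
Before running a prediction year, it can help to confirm that every expected raster exists. The sketch below builds the file names from the `<band>_<date>_mosaic.tif` pattern; the helper `expected_raster_paths()` is hypothetical, and the date string and its format are placeholders, since the actual date format is set in `config.R`.

```r
# Hypothetical helper: build the raster file names expected for one image date.
expected_raster_paths <- function(bands, image_date,
                                  raster_dir = "data/processed/rasters") {
  file.path(raster_dir, paste0(bands, "_", image_date, "_mosaic.tif"))
}

# Band prefixes follow the raster list in this README; the date is a placeholder.
bands <- c("s2rep", "ndvi", "ndwi", "savi", "evi", "msi", "ndmi", "mndwi")
paths <- expected_raster_paths(bands, image_date = "2025-06-15")

# Report any rasters that are not on disk yet.
missing_paths <- paths[!file.exists(paths)]
if (length(missing_paths) > 0) {
  message("Missing rasters:\n", paste(missing_paths, collapse = "\n"))
}
```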
Training outputs:
- `data/processed/training/anchor_zone_training_data_<calibration_year>.csv`
- `data/processed/training/candidate_gpi_training_data_<calibration_year>.csv`

Validation outputs:
- environmental validation summary and figure
- KNN method summary and class counts
- random forest tuning, confusion matrix, class accuracy, variable importance, model comparison, and selected-model metadata
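
As a rough illustration of what the confusion matrix and class accuracy tables contain, the base-R sketch below computes them from toy observed and predicted labels. The labels, the class names, and the choice of metrics are assumptions, not the project's actual diagnostics.

```r
# Hypothetical held-out labels; class names are placeholders.
gpi_levels <- c("low", "medium", "high")
observed <- factor(
  c("low", "low", "low", "medium", "medium", "medium",
    "high", "high", "high", "high"),
  levels = gpi_levels
)
predicted <- factor(
  c("low", "low", "medium", "medium", "medium", "high",
    "high", "high", "high", "medium"),
  levels = gpi_levels
)

# Rows are observed classes, columns are predicted classes.
confusion <- table(observed = observed, predicted = predicted)

# Share of all pixels classified correctly.
overall_accuracy <- sum(diag(confusion)) / sum(confusion)

# Per-class recall: correct predictions divided by observed pixels per class.
class_recall <- diag(confusion) / rowSums(confusion)

print(confusion)
print(round(class_recall, 2))
```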
Model output:
- `data/processed/models/gpi_best_model_<calibration_year>.rds`

Prediction and map outputs:
- `data/processed/predictions/pixel_predictor_stack_<prediction_year>.tif`
- `data/processed/predictions/field_gpi_pixel_summary_<prediction_year>.csv`
- `data/processed/spatial/pixel_gpi_map_<prediction_year>.tif`
- `data/processed/spatial/field_gpi_map_<prediction_year>.gpkg`
- `figures/pixel_gpi_map_<prediction_year>.png`
- `figures/field_gpi_map_<prediction_year>.png`

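
The field-level summary reduces many classified pixels to one row per field. One plausible reduction is the majority class plus its pixel share, sketched below in base R. The toy data, the column names, and the majority rule are assumptions, since the summary statistics written by the project are not specified here.

```r
# Hypothetical classified pixels, each tagged with the field it falls in.
pixel_classes <- data.frame(
  field_id = c("f1", "f1", "f1", "f2", "f2", "f2", "f2"),
  gpi      = factor(c("low", "low", "high", "medium", "high", "high", "high"),
                    levels = c("low", "medium", "high"))
)

# Pixel counts per field and class, and the same as row-wise shares.
counts <- table(pixel_classes$field_id, pixel_classes$gpi)
shares <- prop.table(counts, margin = 1)

# One row per field: pixel count, majority class, and its share.
field_summary <- data.frame(
  field_id       = rownames(counts),
  n_pixels       = as.vector(rowSums(counts)),
  majority_class = colnames(counts)[apply(counts, 1, which.max)],
  majority_share = as.vector(apply(shares, 1, max)),
  row.names      = NULL
)
print(field_summary)
```

Keeping the majority share alongside the class makes it easy to flag fields where the pixel map is mixed.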
`config.R` centralizes the values that must stay consistent across scripts:

- `calibration_year` and `calibration_image_date`
- `prediction_year` and `prediction_image_date`
- expected predictor bands
- model predictor bands
- ordered three-class GPI levels
- `knn_k`
- canonical input and output paths

Update `prediction_year` and `prediction_image_date` for annual mapping. Update `predictor_bands` only when the raster stack changes.
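
For orientation, a configuration file with these settings might look like the sketch below. Apart from the 2025 calibration year and the band prefixes listed in this README, every value is a placeholder. The names `model_predictor_bands`, `gpi_levels`, and `raster_dir` are hypothetical and may differ from the real `config.R`.

```r
# Hypothetical config.R sketch. Values are placeholders, not project settings.
calibration_year       <- 2025
calibration_image_date <- "2025-06-15"   # placeholder date and format
prediction_year        <- 2026           # placeholder
prediction_image_date  <- "2026-06-15"   # placeholder date and format

# Bands expected in the raster stack, and the subset the model uses.
predictor_bands <- c("s2rep", "ndvi", "ndwi", "savi",
                     "evi", "msi", "ndmi", "mndwi")
model_predictor_bands <- predictor_bands  # assumption: model uses every band

# Ordered three-class GPI levels (placeholder class names).
gpi_levels <- c("low", "medium", "high")

# Number of neighbors for the KNN step (placeholder value).
knn_k <- 5

# One canonical path as an example (hypothetical variable name).
raster_dir <- file.path("data", "processed", "rasters")

# Basic consistency checks that fail fast on a bad edit.
stopifnot(
  all(model_predictor_bands %in% predictor_bands),
  length(gpi_levels) == 3,
  knn_k >= 1
)
```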
Required R packages:

- `tidyverse`
- `sf`
- `terra`
- `exactextractr`
- `janitor`
- `mgcv`
- `broom`
- `patchwork`
- `caret`
- `randomForest`
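
A quick way to see which of these packages are missing before running the workflow is sketched below. It only reports missing packages and does not install anything; the variable names are illustrative.

```r
# Packages listed in this README.
required_packages <- c("tidyverse", "sf", "terra", "exactextractr", "janitor",
                       "mgcv", "broom", "patchwork", "caret", "randomForest")

# TRUE for each package that can be loaded from the current library paths.
installed <- vapply(required_packages, requireNamespace,
                    logical(1), quietly = TRUE)

missing_packages <- required_packages[!installed]
if (length(missing_packages) > 0) {
  message("Install before running: ",
          paste(missing_packages, collapse = ", "))
}
```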