Hi,
I’ve been exploring the coastline extraction pipeline and running the workflow to better understand how NDWI-based masks are currently used for label generation.
Given my experience working with Sentinel-2 and Cartosat-3 imagery, I've noticed that label quality can be significantly affected by cloud cover, shadows, and sensor artifacts, especially in high-resolution datasets.
I also saw discussions around integrating the PlanetLabs Usable Data Mask (UDM), which seems like a promising way to improve label quality before model retraining.
Problem
Currently:
- NDWI-generated labels may include noisy or invalid regions (cloud, shadow, sensor artifacts)
- These errors propagate into the training data
- This can degrade model performance
Proposed Contribution
I’d like to explore improving label quality through structured UDM integration:
1. UDM Integration Strategy
- Apply UDM filtering before NDWI computation
- Compare with post-NDWI filtering
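2. Label Quality Analysis
- Visual comparison of labels generated with and without UDM filtering
- Focus on noise reduction and boundary clarity
3. Optional Evaluation
- Compare segmentation outputs trained on the original and the UDM-filtered labels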
4. Pipeline Integration
- Add a modular preprocessing step for UDM filtering
- Keep it compatible with existing workflow
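To make item 1 concrete, here is a minimal sketch of the two orderings I have in mind. It is not taken from the existing pipeline: the function names, the `IGNORE` value of 255, and the fixed threshold of 0.0 are placeholders of mine, and I assume the UDM has already been decoded into a boolean `usable` array (reading and decoding the UDM file is left out).

```python
import numpy as np

IGNORE = 255  # hypothetical "no label" value for pixels that should be skipped


def ndwi(green, nir):
    """NDWI = (green - nir) / (green + nir); NaN where the sum is zero."""
    green = np.asarray(green, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    total = green + nir
    out = np.full(green.shape, np.nan)
    np.divide(green - nir, total, out=out, where=total != 0)
    return out


def labels_mask_first(green, nir, usable, threshold=0.0):
    """Option A: blank out unusable pixels, then compute NDWI and threshold."""
    g = np.where(usable, green, np.nan)
    n = np.where(usable, nir, np.nan)
    index = ndwi(g, n)
    labels = np.full(index.shape, IGNORE, dtype=np.uint8)
    valid = ~np.isnan(index)
    labels[valid] = index[valid] > threshold  # 1 = water, 0 = land
    return labels


def labels_mask_last(green, nir, usable, threshold=0.0):
    """Option B: compute NDWI on every pixel, then drop unusable ones."""
    index = ndwi(green, nir)
    labels = np.full(index.shape, IGNORE, dtype=np.uint8)
    valid = np.asarray(usable, dtype=bool) & ~np.isnan(index)
    labels[valid] = index[valid] > threshold  # 1 = water, 0 = land
    return labels
```

With a fixed threshold the two options give the same label for every pixel, since NDWI is computed per pixel. They can differ once the threshold (or any normalization) is estimated from image statistics, because masking first keeps cloud and shadow pixels out of those statistics. That difference is what I would want to measure in the comparison.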
Background
I have worked with multi-band satellite imagery (Sentinel and Cartosat), where handling cloud/shadow artifacts was critical for reliable segmentation.
I’m also planning to apply for GSoC 2026 with this organization and would love to start contributing early and align my work with ongoing priorities.
Request for Feedback
- Does this direction align with current priorities?
- Any preferred approach for integrating UDM?
- Suggested datasets or evaluation strategies?
I’d be happy to begin with a small implementation and iterate based on feedback.
Thanks!