EarthBridge: A Solution for the 4th Multi-modal Aerial View Image Challenge (MAVIC-T) — Translation Track
Chen, Zhenyuan, Guanyuan Shen, and Feng Zhang. "EarthBridge: A Solution for 4th Multi-Modal Aerial View Image Challenge Translation Track." Accepted to the 22nd IEEE CVPR Workshop on Perception Beyond the Visible Spectrum (PBVS 2026). Preprint: arXiv, March 6, 2026. https://doi.org/10.48550/arXiv.2603.06753
This repository is a preview release of the EarthBridge codebase. It contains the DBIM and CUT baselines used in our competition solution, together with the associated training, inference, and evaluation code.
This preview includes only the flagship experiment configurations, i.e., the configuration in which each method achieved its best results in our experiments. Due to time constraints, comprehensive baselines, hyperparameter tuning, and scaling studies are not included; we leave these for community exploration.
| Task | Flagship Method | Script |
|---|---|---|
| sar2eo (SAR → EO) | DBIM | scripts/DBIM_Pixel_Medium-0216/train_sar2eo.sh |
| sar2rgb (SAR → RGB) | DBIM | scripts/DBIM_Pixel_Medium-0216/train_sar2rgb.sh |
| rgb2ir (RGB → IR) | DBIM | scripts/DBIM_Pixel_Medium-0216/train_rgb2ir.sh |
| sar2ir (SAR → IR) | CUT | scripts/CUT_Scaled-0218/train_sar2ir.sh |
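The task-to-script mapping above can be wrapped in a small helper for batch launching. This is an illustrative sketch, not part of the released code; the helper name `flagship_script` is hypothetical, and the paths are taken verbatim from the table.

```shell
# Sketch: resolve the flagship launcher for a given task.
# Paths come from the flagship table above; run from the project root.
flagship_script() {
  case "$1" in
    sar2eo)  echo "scripts/DBIM_Pixel_Medium-0216/train_sar2eo.sh" ;;
    sar2rgb) echo "scripts/DBIM_Pixel_Medium-0216/train_sar2rgb.sh" ;;
    rgb2ir)  echo "scripts/DBIM_Pixel_Medium-0216/train_rgb2ir.sh" ;;
    sar2ir)  echo "scripts/CUT_Scaled-0218/train_sar2ir.sh" ;;
    *)       echo "unknown task: $1" >&2; return 1 ;;
  esac
}

# e.g. launch one task:  bash "$(flagship_script sar2eo)"
```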
| Baseline | Reference | Description |
|---|---|---|
| DBIM | ICLR 2025 | Diffusion Bridge Implicit Models |
| CUT | ECCV 2020 | Contrastive Unpaired Translation |
- Install from requirements.txt (recommended)

```shell
conda create -n rsgen python=3.12
conda activate rsgen
# We use PyTorch 2.8.0, torchaudio 2.8.0, and torchvision 0.23.0 from https://download.pytorch.org/whl/cu126.
# Other versions should mostly work if installed following https://pytorch.org/get-started/previous-versions/
pip install torch==2.8.0+cu126 torchaudio==2.8.0+cu126 torchvision==0.23.0+cu126 --index-url https://download.pytorch.org/whl/cu126
# install other packages
pip install -r requirements.txt
pip install swanlab
```

- Install from environment.yaml

```shell
conda env create -f environment.yaml
conda activate rsgen
```

If you clone the repo to a custom location, set PROJECT_ROOT to your project directory. Scripts will then resolve paths relative to it.

```shell
# Option 1: Source paths.env (auto-detects project root from file location)
source paths.env

# Option 2: Set manually before running scripts
export PROJECT_ROOT=/path/to/EarthBridge-Preview
```

| Directory | Purpose |
|---|---|
| datasets/ | BiliSakura/MACIV-T-2025-Structure-Refined: manifests/, {task}/train/{input,target}/, val/{task}/input/, test/{task}/. See docs/dataset.md. |
| models/ | Pre-trained model weights. |
| src/models/ | Model implementations: unet_dbim, cut_model. |
| examples/ | Trainer and sample scripts for dbim, cut. |
| scripts/ | Flagship training launchers for DBIM and CUT experiments. |
| ckpt/ | Checkpoints and SwanLab logs from training runs. |
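As an illustration of the layout above, a script can resolve these directories once PROJECT_ROOT is set. This is only a sketch; the canonical path resolution lives in paths.env, and the fallback to the current directory is an assumption of this example.

```shell
# Sketch: resolve repo directories relative to PROJECT_ROOT.
# Falling back to the current directory when PROJECT_ROOT is unset is an
# assumption of this sketch; paths.env does the canonical resolution.
export PROJECT_ROOT="${PROJECT_ROOT:-$PWD}"
DATA_DIR="$PROJECT_ROOT/datasets"
MODELS_DIR="$PROJECT_ROOT/models"
CKPT_DIR="$PROJECT_ROOT/ckpt"
echo "data: $DATA_DIR"
```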
Some scripts use pre-trained MaRS encoders for representation alignment or validation-set creation. Please download them from HuggingFace (BiliSakura) into your local models/ folder:
| Model | HuggingFace ID | Local path |
|---|---|---|
| MaRS-Base-RGB | BiliSakura/MaRS-Base-RGB | models/BiliSakura/MaRS-Base-RGB |
| MaRS-Base-SAR | BiliSakura/MaRS-Base-SAR | models/BiliSakura/MaRS-Base-SAR |
```shell
# From project root
mkdir -p models/BiliSakura
huggingface-cli download BiliSakura/MaRS-Base-RGB --local-dir models/BiliSakura/MaRS-Base-RGB
huggingface-cli download BiliSakura/MaRS-Base-SAR --local-dir models/BiliSakura/MaRS-Base-SAR
```

Training scripts support SwanLab for experiment tracking. Install it with pip install swanlab.
- SwanLab logs (public resources)
Enable SwanLab — Add --log_with swanlab to any training command:
```shell
--log_with swanlab
```

Log location — SwanLab logs are stored under ./ckpt/swanlog.
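As a sketch, enabling SwanLab for the sar2eo flagship run could look like the following. The script path comes from the flagship table above; whether the launcher forwards extra flags to the underlying training command is an assumption of this example.

```shell
# Hypothetical: build the sar2eo launch command with SwanLab enabled.
# Flag forwarding through the launcher script is assumed, not confirmed.
TRAIN_CMD="bash scripts/DBIM_Pixel_Medium-0216/train_sar2eo.sh --log_with swanlab"
echo "$TRAIN_CMD"
```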
See docs/quick_start.md for detailed training, inference, and evaluation instructions.