End-to-End Multi-Modal Image Registration CNN

Project Introduction

This project can be used to create pixel-level aligned multi-modal image pairs for training multi-modal models.

The project uses an end-to-end multi-modal image registration CNN. A Siamese convolutional network estimates the similarity transformation parameters between two modalities (e.g., infrared and visible light): rotation angle θ, scaling factor s, and translation vector (Δx, Δy). Applying the estimated transformation aligns the two images.
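For reference, a similarity transformation with these four parameters can be written as a 2x3 affine matrix. The sketch below is only an illustration, not the project's code: it assumes θ is given in degrees and that rotation and scaling are about the origin (the project may instead rotate about the image center), and the helper names `similarity_matrix` and `apply_to_point` are hypothetical.

```python
import math


def similarity_matrix(theta_deg, s, dx, dy):
    """Return the 2x3 matrix [[s*cos, -s*sin, dx], [s*sin, s*cos, dy]].

    Assumption: theta is in degrees and the rotation is about the origin.
    """
    t = math.radians(theta_deg)
    c, sn = s * math.cos(t), s * math.sin(t)
    return [[c, -sn, dx], [sn, c, dy]]


def apply_to_point(m, x, y):
    """Map the point (x, y) through the 2x3 matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Rotate (1, 0) by 90 degrees, scale by 2, then shift by (10, 5).
m = similarity_matrix(90.0, 2.0, 10.0, 5.0)
x, y = apply_to_point(m, 1.0, 0.0)
print(round(x, 6), round(y, 6))  # prints 10.0 7.0
```

A matrix in this 2x3 layout is what common image-warping routines expect, so the same four numbers are enough to resample one modality onto the other.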

Model Architecture

The model adopts a Siamese network architecture with shared weights. The specific structure is shown in the architecture figure (CNN.png in the repository).
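To make "shared weights" concrete, here is a minimal sketch of a Siamese regressor. It is not the project's actual network: the use of PyTorch, the layer sizes, the class name `SiameseRegistrationNet`, the single-channel inputs, and the output order (θ, s, Δx, Δy) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class SiameseRegistrationNet(nn.Module):
    """Hypothetical sketch: one shared encoder plus a regression head."""

    def __init__(self, in_channels: int = 1, feat_dim: int = 64):
        super().__init__()
        # A single encoder instance is applied to both inputs; this is
        # what "shared weights" means in a Siamese network.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # (N, feat_dim, 1, 1)
            nn.Flatten(),             # (N, feat_dim)
        )
        # The head regresses four numbers from the concatenated features.
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, 4),
        )

    def forward(self, vis, ir):
        f_vis = self.encoder(vis)
        f_ir = self.encoder(ir)
        return self.head(torch.cat([f_vis, f_ir], dim=1))


model = SiameseRegistrationNet()
with torch.no_grad():
    out = model(torch.zeros(2, 1, 64, 64), torch.zeros(2, 1, 64, 64))
print(out.shape)  # torch.Size([2, 4])
```

Because both branches reuse the same encoder, both modalities must be fed with the same number of channels; the sketch assumes each image is converted to grayscale first.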

Usage

Train the Model

python train.py --vis_dir PATH_TO_VIS --ir_dir PATH_TO_IR --json_path PATH_TO_JSON --model_dir MODEL_SAVE_DIR

Main parameters:

--vis_dir: Directory of visible light images
--ir_dir: Directory of infrared images
--json_path: Path to the transformation parameters JSON file
--model_dir: Directory to save the model
--train_percentage: Percentage of data used for training
--batch_size: Batch size
--num_epochs: Number of training epochs
--learning_rate: Learning rate
--resume: Resume training from a checkpoint (optional)

You can also omit parameters to use the default values defined in parse_args().
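As a rough picture of how such defaults might be declared, the sketch below rebuilds the training flags listed above with `argparse`. The flag names come from this README, but every default value is a placeholder, `--train_percentage` is assumed to be a fraction in [0, 1], and `--resume` is assumed to take a checkpoint path; check `parse_args()` in train.py for the real definitions.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical reconstruction of the training CLI (placeholder defaults)."""
    p = argparse.ArgumentParser(description="Train the registration CNN")
    p.add_argument("--vis_dir", default="data/vis",
                   help="Directory of visible light images")
    p.add_argument("--ir_dir", default="data/ir",
                   help="Directory of infrared images")
    p.add_argument("--json_path", default="data/params.json",
                   help="Path to the transformation parameters JSON file")
    p.add_argument("--model_dir", default="models",
                   help="Directory to save the model")
    p.add_argument("--train_percentage", type=float, default=0.8,
                   help="Share of data used for training (assumed fraction)")
    p.add_argument("--batch_size", type=int, default=16)
    p.add_argument("--num_epochs", type=int, default=50)
    p.add_argument("--learning_rate", type=float, default=1e-3)
    p.add_argument("--resume", default=None,
                   help="Checkpoint to resume from (assumed to be a path)")
    return p.parse_args(argv)


# Only --batch_size is given; everything else falls back to its default.
args = parse_args(["--batch_size", "8"])
print(args.batch_size, args.num_epochs, args.resume)
```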

Model Inference

python inference.py --vis_dir PATH_TO_VIS --ir_dir PATH_TO_IR --model_path PATH_TO_MODEL --output_dir OUTPUT_DIR

Main parameters:

--vis_dir: Directory of visible light images
--ir_dir: Directory of infrared images
--model_path: Path to the trained model
--output_dir: Directory for output results
--image_id: Specific image ID to process (if not specified, all images will be processed)
--fusion_mode: Fusion mode (average, weighted, false_color, layered)

You can also omit parameters to use the default values defined in parse_args().
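The README does not define the fusion modes, so the following is only a guess at what three of them could mean for two already aligned grayscale images. The function name `fuse`, the `ir_weight` parameter, and the false-color channel layout are assumptions; `layered` is left out because its meaning is not clear from the text.

```python
import numpy as np


def fuse(vis_gray, ir_gray, mode="average", ir_weight=0.6):
    """Fuse two aligned grayscale uint8 images of the same shape (sketch)."""
    vis = vis_gray.astype(np.float32)
    ir = ir_gray.astype(np.float32)
    if mode == "average":
        out = (vis + ir) / 2.0
    elif mode == "weighted":
        # Assumed definition: simple alpha blend favoring the infrared image.
        out = ir_weight * ir + (1.0 - ir_weight) * vis
    elif mode == "false_color":
        # Assumed channel layout: (R, G, B) = (IR, visible, visible).
        out = np.stack([ir, vis, vis], axis=-1)
    else:
        raise ValueError(f"mode not covered by this sketch: {mode}")
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


vis = np.full((2, 2), 100, dtype=np.uint8)
ir = np.full((2, 2), 200, dtype=np.uint8)
print(fuse(vis, ir, "average")[0, 0])      # 150
print(fuse(vis, ir, "false_color").shape)  # (2, 2, 3)
```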

Inference Results

After inference, the system will generate the following files for each processed image:

Original infrared image: ir_original.jpg
Aligned infrared image: ir_aligned.jpg
Visible light image: vis.jpg
Fused image: fused.jpg
Visualization result: fusion.jpg (includes all images and transformation parameters)
Parameter JSON file: params.json (includes predicted transformation parameters and ground truth)
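Since params.json holds both the predicted parameters and the ground truth, a quick per-parameter error check is straightforward. The key names (`predicted`, `ground_truth`, `theta`, `scale`, `dx`, `dy`) and the numbers below are made up for illustration; inspect a real params.json for the actual schema.

```python
import json

# Hypothetical params.json content; real key names and values may differ.
example = """
{
  "predicted":    {"theta": 4.8, "scale": 1.02, "dx": 11.5, "dy": -3.0},
  "ground_truth": {"theta": 5.0, "scale": 1.00, "dx": 12.0, "dy": -2.5}
}
"""


def parameter_errors(params):
    """Absolute error of each predicted parameter against the ground truth."""
    pred, gt = params["predicted"], params["ground_truth"]
    return {name: abs(pred[name] - gt[name]) for name in gt}


errors = parameter_errors(json.loads(example))
for name, err in errors.items():
    print(f"{name}: {err:.3f}")
```

To run it on real output, replace `json.loads(example)` with `json.load(open(path))` for the params.json of interest and adjust the keys as needed.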
