Use a conda environment for a clean installation
$ conda create --name molseg python=3.8.0
$ conda activate molseg
$ conda install pip
$ python3 -m pip install -U pip
$ pip install -r requirements.txt
Data preparation
Ground truth images used for training are in RGB format; image masks should be in black-and-white format. Images and their masks must have identical filenames, placed under the imgs and masks folders respectively.
All images should be square; place them on a square canvas if necessary. The model works well for images under 600×600 pixels.
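The two requirements above (identically named masks, square images) can be checked and fixed with a small helper. A minimal sketch using Pillow: the `imgs`/`masks` folder names come from this README, while `find_unmatched` and `pad_to_square` are illustrative helpers, not part of this repo.

```python
from pathlib import Path
from PIL import Image

def find_unmatched(imgs_dir="imgs", masks_dir="masks"):
    """Return image filename stems that have no identically named mask."""
    img_stems = {p.stem for p in Path(imgs_dir).iterdir() if p.is_file()}
    mask_stems = {p.stem for p in Path(masks_dir).iterdir() if p.is_file()}
    return sorted(img_stems - mask_stems)

def pad_to_square(img, fill=(255, 255, 255)):
    """Center an RGB image on a square white canvas so width == height."""
    w, h = img.size
    side = max(w, h)
    canvas = Image.new("RGB", (side, side), fill)
    canvas.paste(img, ((side - w) // 2, (side - h) // 2))
    return canvas
```

Run `find_unmatched()` before training to catch missing masks, and pass any non-square image through `pad_to_square` before adding it to `imgs/`.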
Mechanistic molecular ground truth data and image segmentation masks can be found on Zenodo.
Model Training
Run the training script (scripts/train.sh) or train.py directly.
$ sbatch scripts/train.sh
The best checkpoint is saved to MODEL.pth.
A pretrained checkpoint, checkpoint.pth, is available on Hugging Face. To use it, download it to your root directory and pass -m checkpoint.pth.
Prediction
After training your model and saving it to MODEL.pth, you can easily test the output masks on your images via the CLI.
To predict a single image and save it:
$ python predict.py -i image.jpg -o output.jpg
To predict multiple images and show them without saving:
$ python predict.py -i image1.jpg image2.jpg --viz --no-save
For batch predictions, use scripts/predict.sh after setting up the bash environment. Feel free to change the input and output directories to suit your task.
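The exact contents of scripts/predict.sh may vary by setup; an illustrative Python equivalent builds one predict.py invocation per image using the documented `-m`/`-i`/`-o` flags. The helper `build_predict_commands` and the `_OUT.png` naming convention (which matches the expected output described in this README) are assumptions for illustration.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def build_predict_commands(input_dir, output_dir, model="MODEL.pth"):
    """Build one predict.py command per image file in input_dir."""
    cmds = []
    for img in sorted(Path(input_dir).iterdir()):
        if img.suffix.lower() in IMAGE_EXTS:
            out = Path(output_dir) / f"{img.stem}_OUT.png"
            cmds.append(["python", "predict.py", "-m", model,
                         "-i", str(img), "-o", str(out)])
    return cmds
```

Each command list can then be executed with `subprocess.run(cmd, check=True)` after creating the output directory.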
To use the model directly for arrow removal, we provide predict_and_postprocess.py, which uses the default functions and checkpoints:
- Clone or download this repo and make sure predict_and_postprocess.py is in your working directory.
- Install the dependencies from requirements.txt.
- Prepare your input: create a folder (e.g. test/) and drop in your images (.jpg, .png, etc.).
- Optionally, create an output folder; by default, masks are written to output/<input>/.
- Run the script:
$ python predict_and_postprocess.py \
    --model MODEL.pth \
    --input test/
| Flag | Description | Default |
|---|---|---|
| `-m, --model` | Path to your trained `.pth` model file | `MODEL.pth` |
| `-i, --input` | **Required.** Input image file or folder | — |
| `-o, --output` | Output directory for masks (`output/<input_name>/` if omitted) | `output/<input>/` |
| `-n, --no-save` | Don’t save mask images | `false` |
| `-v, --viz` | Pop up each image+mask for visual inspection | `false` |
| `-t, --mask-threshold` | Binarization cutoff (0–1) | `0.5` |
| `-s, --scale` | Scale factor for resizing input | `0.5` |
| `--bilinear` | Use bilinear upsampling | `true` |
| `-c, --classes` | Number of classes | `1` |
Expected output
- Mask images go to:

      output/<input_folder_name>/
      ├─ img1_OUT.png
      ├─ img2_OUT.png
      └─ ...

- Post-processed images (white-painted, single-channel) go to:

      processed/<input_folder_name>/
      ├─ img1.png
      ├─ img2.png
      └─ ...
We collected 296 reaction mechanism images from the textbook Named Reactions, 4th edition (Li, 2009).
Each image is named after its reaction. The images were processed with this model and parsed by RxnScribe (Qian, 2023).
The parsed dataset contains information such as predicted molecular identities, positions, and reaction conditions.
Find the images and parsed dataset.
| Dess-Martin periodinane oxidation | Corresponding object masks |
|---|---|
| ![]() | ![]() |
This architecture is mainly used for noise removal in chemical reaction mechanism images. To remove the noise segmented out of the original image, use process.py to overlay the image mask onto the original image.
imgs_path = "ver_mech/"
masks_path = "mechrxn_arrowmask/"
processed_path = "mechrxn_processed/"
imgs_path is the original image folder; masks_path holds the masks obtained with U-Net; processed_path can be renamed to suit your own interest.
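The overlay step that process.py performs can be sketched as follows: pixels covered by the predicted mask are painted white in the original image, yielding the single-channel, white-painted output described above. This is an illustrative sketch, not the repo's actual process.py; the helper name `white_paint` and the binarization threshold are assumptions.

```python
import numpy as np
from PIL import Image

def white_paint(img_path, mask_path, threshold=128):
    """Paint the masked region (e.g. arrows) white in the original image.

    Returns a single-channel (grayscale) PIL image.
    """
    img = np.array(Image.open(img_path).convert("L"))
    mask = np.array(Image.open(mask_path).convert("L"))
    img[mask >= threshold] = 255  # masked pixels become white
    return Image.fromarray(img)
```

Applied over `imgs_path` and `masks_path`, the results would be saved under `processed_path`.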

Note that the dataset still contains errors, even though it performs better with arrow-removal preprocessing. This dataset does not aim to serve as a benchmark, but rather as a centralized, unified collection of reactions to benefit future research in both chemistry and computer vision.
- The original U-Net paper: U-Net: Convolutional Networks for Biomedical Image Segmentation
- The model is based on Milesial/Pytorch-UNet
- Molecular and reaction information extraction uses models from thomas0809/Molscribe and thomas0809/RxnScribe

