CTPLab/GILEA


Code Repository

Official PyTorch implementation for the following manuscript:

GILEA: In silico phenome profiling and editing using GAN Inversion, Computers in Biology and Medicine 2024.
Jiqing Wu and Viktor H. Koelzer.

Modeling heterogeneous disease states by data-driven methods has great potential to advance biomedical research. However, a comprehensive analysis of phenotypic heterogeneity is often challenged by the complex nature of biomedical datasets and emerging imaging methodologies. Here, we propose a novel GAN Inversion-enabled Latent Eigenvalue Analysis (GILEA) framework and apply it to in silico phenome profiling and editing. We show the performance of GILEA using cellular imaging datasets stained with the multiplexed fluorescence Cell Painting protocol. The quantitative results of GILEA can be biologically supported by editing of the latent representations and simulation of dynamic phenotype transitions between physiological and pathological states. In conclusion, GILEA represents a new and broadly applicable approach to the quantitative and interpretable analysis of biomedical image data.


The overall model illustration of the proposed GILEA approach.

Demo

vero.mp4

The cellular transition of mock cells for the VERO dataset

hrce.mp4

The cellular transition of mock cells for the HRCE dataset

huvec.mp4

The cellular transition of healthy cells for the HUVEC dataset

skin_nv.mp4

The skin lesion transition of nv samples for the HAM10000 dataset
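Conceptually, transitions like those in the demo videos can be rendered by linearly interpolating between two latent codes and decoding each intermediate code with the pre-trained generator. A minimal sketch, assuming hypothetical 512-dimensional codes (the repo's actual editing pipeline may differ):

```python
import numpy as np

def interpolate_latents(w_src, w_dst, n_steps=8):
    """Linearly interpolate between two latent codes.

    Decoding each intermediate code with a pre-trained StyleGAN
    generator would render a smooth phenotype transition between
    two states, e.g. physiological -> pathological.
    """
    alphas = np.linspace(0.0, 1.0, n_steps)
    return np.stack([(1 - a) * w_src + a * w_dst for a in alphas])

# Hypothetical 512-dim latent codes for two cell states.
rng = np.random.default_rng(0)
w_healthy = rng.standard_normal(512)
w_diseased = rng.standard_normal(512)

path = interpolate_latents(w_healthy, w_diseased, n_steps=8)
print(path.shape)  # (8, 512); endpoints equal the inputs
```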

Prerequisites

This implementation mainly depends on two backends: Restyle and WILDS. Once the dependencies for Restyle and WILDS are installed, the environment for this code is correctly configured.

Alternatively, to create the conda environment for this repo, see GILEA.yml, which lists all the packages the repo requires.
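Assuming the environment name declared inside GILEA.yml is GILEA (adjust if the file specifies a different name), the environment can be created with:

```shell
# Create the environment from the provided spec and activate it.
conda env create -f GILEA.yml
conda activate GILEA
```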

Data Preparation

To run the experiments investigated in the paper,

  1. Download RxRx19a (or b) to a local folder via the following links

    Metadata: https://storage.googleapis.com/rxrx/RxRx19a/RxRx19a-metadata.zip

    Imagedata: https://storage.googleapis.com/rxrx/RxRx19a/RxRx19a-images.zip

  2. Download HAM10000 to a local folder via the following link

    https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T

Taking HAM10000 and HRCE as examples, we preprocess the datasets by running

    python Dataset/prep_ham10k.py --root /path/to/HAM10000
    python Dataset/prep_rxrx19.py --root /path/to/RxRx19 --rxrx_name rxrx19a --rxrx_cell HRCE
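The preprocessing scripts above are repo-specific. As an illustration only, a typical pipeline for multiplexed fluorescence tiles center-crops fixed-size patches and min-max scales each channel independently; the shapes and helper names below are our assumptions, not the repo's actual code:

```python
import numpy as np

def center_crop(img, size):
    """Center-crop an H x W x C image to size x size."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

def normalize_channels(img):
    """Min-max scale each channel to [0, 1] independently."""
    img = img.astype(np.float32)
    mins = img.min(axis=(0, 1), keepdims=True)
    maxs = img.max(axis=(0, 1), keepdims=True)
    return (img - mins) / np.maximum(maxs - mins, 1e-8)

# Hypothetical 5-channel fluorescence tile, 16-bit intensity range.
tile = np.random.default_rng(1).integers(0, 65535, size=(256, 256, 5))
patch = normalize_channels(center_crop(tile, 128))
print(patch.shape)  # (128, 128, 5), values in [0, 1]
```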

Next, we convert the processed data to the formats required by StyleGAN2/3 by running

    python Dataset/prep_style2.py   /path/to/processed_HRCE --out /path/to/style2_HRCE --size 128
    python Dataset/prep_style3.py  --source=/path/to/processed_HRCE --dest=/path/to/style3_HRCE_128x128.zip --resolution=128x128

Pre-training StyleGAN2 and StyleGAN3

Here we pre-train StyleGAN2/3 using two widely used repositories:

1. https://github.com/rosinality/stylegan2-pytorch.git
2. https://github.com/NVlabs/stylegan3.git

By modifying the implementations with respect to the number of image channels, we make the code compatible with both HAM10000 and RxRx19. Please see the implementations of StyleGAN2 and StyleGAN3 for more details and modify the related files in both of the above repositories accordingly.
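One common way to carry out such a channel edit is to tile the pre-trained RGB filters of the first convolution across the new input channels and rescale them so the expected activation magnitude is preserved. A hedged numpy sketch (the function name and rescaling choice are ours, not the repositories'):

```python
import numpy as np

def adapt_first_conv(weight_rgb, in_ch):
    """Expand a (out, 3, k, k) conv weight to in_ch input channels.

    Tiles the RGB filters across the new channels and rescales by
    3 / in_ch, a common initialization trick when adapting an
    RGB network to multiplexed fluorescence inputs.
    """
    out_ch, _, kh, kw = weight_rgb.shape
    reps = int(np.ceil(in_ch / 3))
    tiled = np.tile(weight_rgb, (1, reps, 1, 1))[:, :in_ch]
    return tiled * (3.0 / in_ch)

# Hypothetical first-layer weight: 64 filters, 3x3 kernels.
w_rgb = np.random.default_rng(2).standard_normal((64, 3, 3, 3))
w_5ch = adapt_first_conv(w_rgb, in_ch=5)
print(w_5ch.shape)  # (64, 5, 3, 3)
```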

Assuming the current working directory is ../stylegan2-pytorch/, run the following commands to train StyleGAN2:

    python -m torch.distributed.launch --nproc_per_node=1 --master_port=012 train.py \
        --batch 8 --iter 800000 --size 128  --check_save /path/to/HRCE_style2_decoder  /path/to/style2_HRCE

    python -m torch.distributed.launch --nproc_per_node=1 --master_port=345 train.py \
        --batch 8 --iter 100000 --size 128  --check_save /path/to/HAM10000_style2_decoder /path/to/style2_HAM10000

Assuming the current working directory is ../stylegan3/, run the following commands to train StyleGAN3:

    python train.py --outdir=/path/to/HRCE_style3_decoder --cfg=stylegan3-t --data=/path/to/style3_HRCE_128x128.zip --gpus=1 --batch=32 --gamma=0.5 --kimg=5000

    python train.py --outdir=/path/to/HAM10000_style3_decoder --cfg=stylegan3-t --data=/path/to/the/style3_HAM10000_128x128.zip --gpus=1 --batch=32 --gamma=0.5 --kimg=800

GAN inversion

After obtaining robust decoders (the pre-trained StyleGAN2/3 generators), we run GAN inversion (autoencoder training) with the following repositories:

1. https://github.com/yuval-alaluf/restyle-encoder.git
2. https://github.com/yuval-alaluf/stylegan3-editing.git

After integrating the above channel-related changes (StyleGAN2 and StyleGAN3) into the two repositories, respectively, we launch the GAN inversion training.

Assuming the current working directory is ../restyle-encoder/, we run the following commands to train StyleGAN2_pSp:

    python scripts/train_restyle_psp.py \
        --dataset_type=ham10k \
        --encoder_type=ResNetBackboneEncoder \
        --exp_dir=/path/to/HAM10000_style2_recon/ \
        --max_steps=100000 \
        --workers=8 \
        --batch_size=8 \
        --test_batch_size=8 \
        --test_workers=8 \
        --val_interval=5000 \
        --save_interval=10000 \
        --start_from_latent_avg \
        --lpips_lambda=0 \
        --l2_lambda=10 \
        --moco_lambda=0.5 \
        --w_norm_lambda=0 \
        --input_nc=6 \
        --n_iters_per_batch=1 \
        --output_size=128 \
        --train_decoder=False \
        --stylegan_weights=/path/to/HAM10000_style2_decoder/100000.pt
    
    # Train the DNA channel
    python scripts/train_restyle_psp.py \
        --dataset_type=rxrx19a_VERO \
        --encoder_type=ResNetBackboneEncoder \
        --exp_dir=/path/to/VERO1_style2_recon/ \
        --max_steps=800000 \
        --workers=8 \
        --batch_size=8 \
        --test_batch_size=8 \
        --test_workers=8 \
        --val_interval=5000 \
        --save_interval=10000 \
        --image_interval=10000 \
        --start_from_latent_avg \
        --lpips_lambda=0 \
        --l2_lambda=50 \
        --moco_lambda=0.5 \
        --w_norm_lambda=0 \
        --input_nc=2 \
        --input_ch=1 \
        --n_iters_per_batch=1 \
        --output_size=128 \
        --train_decoder=False \
        --stylegan_weights=/path/to/VERO_style2_decoder/790000.pt

Latent Eigenvalue Analysis

With the above reconstruction models in hand, we run the analysis as follows:

    # Compute the necessary stats
    python -m main \
        --seed=0 \
        --task=stat \
        --decoder=style2 \
        --encoder=psp \
        --data_name=rxrx19a \
        --data_cell=VERO \
        --data_splt=strat \
        --n_eval=8 \
        --n_iter=800000 \
        --data_path=/path/to/RxRx19/ \
        --ckpt_path=/path/to/ckpt/ \
        --save_path=/path/to/save/ \
        --stat_dec --stat_res --stat_eig=scm --stat_top=10

    # Output the baseline plot
    python -m main \
        --seed=0 \
        --task=baseline \
        --decoder=style2 \
        --encoder=psp \
        --data_name=rxrx19a \
        --data_cell=VERO \
        --data_splt=strat \
        --n_eval=8 \
        --n_iter=800000 \
        --data_path=/path/to/RxRx19 \
        --ckpt_path=/path/to/ckpt/ \
        --save_path=/path/to/save/ \
        --stat_dec --stat_res --stat_eig=scm --stat_top=10
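The --stat_eig=scm and --stat_top=10 flags suggest the analysis ranks the top eigenvalues of the sample covariance matrix (SCM) of the latent codes. A minimal numpy sketch of that statistic, illustrative only (the repo's implementation may differ):

```python
import numpy as np

def top_scm_eigenvalues(latents, k=10):
    """Top-k eigenvalues of the sample covariance matrix (SCM)
    of a set of latent codes, shape (n_samples, dim)."""
    centered = latents - latents.mean(axis=0, keepdims=True)
    scm = centered.T @ centered / (len(latents) - 1)
    eigvals = np.linalg.eigvalsh(scm)  # returned in ascending order
    return eigvals[::-1][:k]           # largest first

# Hypothetical latent codes: 200 samples, 64-dim.
codes = np.random.default_rng(3).standard_normal((200, 64))
top10 = top_scm_eigenvalues(codes, k=10)
print(top10.shape)  # (10,), sorted in non-increasing order
```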

Acknowledgment

This repository is built upon the Restyle-encoder and StyleGAN3-editing projects. We would like to thank all the authors who contributed to those projects, as well as all the authors who contributed to the HAM10000 and RxRx19 datasets.

License

The license of this repository is specified in LICENSE-GILEA.

The license of HAM10000 is specified in License-CC-BY-NC-SA 4.0.

The license of RxRx19 is specified in License-CC-BY.
