This study utilizes different machine learning and deep learning models to provide a proof of concept that organismic information at a macroscopic length scale may be de-encrypted from a snapshot of only a few dozen cells.
- Do NOT modify the name of this repository. If you really need to rename the repository, remember to update the list of strings in utils.py.
- At least 150 GB of disk space is required.
- Do NOT run a script with a larger number before ensuring that all scripts with smaller numbers have been executed, i.e., always execute scripts in sequential order.
- Create the dataset on an SSD or using an OS like Linux with aggressive caching to ensure reasonable processing times.
- The entire project uses glob for file searches. If you need to save images in the project folder, please use file formats other than .tif or .tiff.
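To illustrate the glob note above, the following minimal sketch shows how a recursive search collects every .tif/.tiff file under a folder, including images that are not part of the dataset. The folder layout, the file names, and the helper `find_tiffs` are hypothetical and are not taken from the project's code.

```python
import glob
import os
import tempfile


def find_tiffs(root: str) -> list[str]:
    """Return root-relative paths of all .tif/.tiff files found recursively."""
    paths = []
    for pattern in ("*.tif", "*.tiff"):
        # "**" with recursive=True matches zero or more nested directories.
        paths += glob.glob(os.path.join(root, "**", pattern), recursive=True)
    return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in paths)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "dataset"))
    os.makedirs(os.path.join(tmp, "notes"))
    # Hypothetical dataset image.
    open(os.path.join(tmp, "dataset", "fish_1.tif"), "w").close()
    # Stray images saved by the user inside the project folder.
    open(os.path.join(tmp, "notes", "figure.tiff"), "w").close()
    open(os.path.join(tmp, "notes", "figure.png"), "w").close()

    found = find_tiffs(tmp)
    print(found)
```

In this sketch the stray `figure.tiff` is returned alongside the dataset image, while `figure.png` is not, which is why saving extra images in a non-TIFF format is the safer choice.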
- CPU: i9-13900KS
- GPU: RTX-4090 with 24GB VRAM
- RAM: 64 GB DDR5
- OS: Ubuntu 22.04
- Python Version: 3.10
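Before starting, it may help to compare your machine with the disk-space requirement and the tested Python version listed above. The sketch below is an assumption-laden convenience check, not part of the project: the helper `preflight` is hypothetical, and "150 GB" is interpreted as decimal gigabytes.

```python
import shutil
import sys

REQUIRED_FREE_GB = 150   # disk-space requirement stated above (assumed decimal GB)
TESTED_PYTHON = (3, 10)  # Python version the project was tested with


def preflight(path: str = ".") -> dict:
    """Report free disk space at `path` and whether Python matches 3.10."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "free_gb": round(free_gb, 1),
        "enough_disk": free_gb >= REQUIRED_FREE_GB,
        "python_matches": sys.version_info[:2] == TESTED_PYTHON,
    }


report = preflight()
print(report)
```

Pass the folder where you plan to store the data as `path` so that the check measures the correct drive.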
The data is available from DOI:10.6019/S-BIAD1654
- Follow the instructions here to install Miniforge.
- Create a new environment and activate it:

      mamba create -n AI_SEC python=3.10
      conda activate AI_SEC
- Install the required packages (Tested on 2024-06-03):

      mamba install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
      pip install scikit-image==0.22.0 scikit-learn==1.4.0
      pip install -U colorama toml tomlkit matplotlib tqdm rich seaborn imagecodecs
      mamba install pandas pyimagej openjdk=8 imgaug=0.4.0
      pip install albumentations==1.3.1
      pip install grad-cam==1.4.8
      pip install umap-learn==0.5.6
      mamba install numpy=1.23.0
      mamba install mkl==2024.0
- Download Fiji (ImageJ) and update its absolute path in db_path_plan.toml (see the db_path_plan.toml section below).
- Install the Fiji (ImageJ) plugin Find Focused Slices:
  - Download the plugin file Find_focused_slices.class.
  - Place it into the plugins folder of your Fiji (ImageJ) directory:

        📂 Fiji.app/
        ├── 📂 ...
        ├── 📂 plugins/
        │   ├── 📂 ...
        │   ├── 📄 Find_focused_slices.class (◀️ Place the file here)
        │   └── 📄 ...
        └── ...
- Download and unzip our data, then organize it into the following structure:

      📂 AISEC.data/
      ├── 📂 {Data}_Processed/
      ├── 📂 {Dataset}_DL/
      ├── 📂 {Dataset}_ML/
      ├── 📂 {Model}_BFSeg/
      ├── 📂 {Model}_Cellpose/
      ├── 📂 {Model}_DL/
      ├── 📂 {Results}_Advanced/ (◀️ manually create the empty folder)
      ├── 📂 {Results}_DL/ (◀️ manually create the empty folder)
      └── 📂 {Results}_ML/ (◀️ manually create the empty folder)

  Note: To avoid encountering a FileNotFoundError, manually create the {Results}_Advanced/, {Results}_DL/, and {Results}_ML/ directories after unzipping.
- Update the absolute path in db_path_plan.toml (see the db_path_plan.toml section below).
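The three empty results folders mentioned in the steps above can also be created with a few lines of Python rather than by hand. This is a minimal sketch: the helper `create_result_dirs` is hypothetical, and the demo runs in a temporary directory that you would replace with the real location of your unzipped data.

```python
import tempfile
from pathlib import Path

# Folder names taken from the directory structure shown above.
RESULT_DIRS = ["{Results}_Advanced", "{Results}_DL", "{Results}_ML"]


def create_result_dirs(data_root: Path) -> list[Path]:
    """Create the empty results folders under data_root if they are missing."""
    created = []
    for name in RESULT_DIRS:
        folder = data_root / name
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(folder)
    return created


# Demo in a temporary directory; point this at your own data directory instead.
with tempfile.TemporaryDirectory() as tmp:
    dirs = create_result_dirs(Path(tmp) / "AISEC.data")
    names = [d.name for d in dirs]
    all_exist = all(d.is_dir() for d in dirs)
    print(names, all_exist)
```

Because `exist_ok=True` is set, running the helper again on an existing data directory leaves any folders that are already present untouched.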
Tools/ : The scripts are for debugging. Please do not run them unless necessary; use at your own risk.
dev/ : The files are under development or already deprecated, so they may not work. Please do not run them unless necessary; use at your own risk.
modules/ : All the utility functions and classes.
script_data/ : Converts .lif files into commonly used image formats, splits into train/validation/test sets, applies K-Means, etc.
script_bfseg/ : ResNet18-UNet automated segmentation for obtaining the "trunk surface area". Inspired by usuyama/pytorch-unet.
script_ml/ : From Cellpose segmentation to simple features, machine learning results, and UMAP. Inspired by ccsyan/labeling-cells-using-slic.
script_dl/ : Everything related to deep learning, including CAM analysis.
script_adv/ : Advanced results, such as "critical SEC number" and "feature-subtracted images", etc.
** For detailed instructions on each part, please click on the corresponding directory (folder) listed in the section above.
The TOML file manages the linkage between data and code, allowing them to be stored separately. This setup enables you to place the data anywhere on your system.
It is parsed as a dictionary within the program:
- The left side (key) represents the variable used in the code.
- The right side (value) represents the name of the corresponding data directory.
root = "" # absolute path of the data directory on this computer
data_processed = "{Data}_Processed"
dataset_cropped_v3 = "{Dataset}_DL"
dataset_ml = "{Dataset}_ML"
model_bfseg = "{Model}_BFSeg"
model_cellpose = "{Model}_Cellpose"
model_history = "{Model}_DL"
model_prediction = "{Results}_DL"
result_ml = "{Results}_ML"
result_adv = "{Results}_Advanced"
fiji_local = "" # absolute path of "Fiji (ImageJ)" on this computer
- The root key should be assigned to the absolute path of the data directory on your system.
- The fiji_local key should be assigned to the absolute path of Fiji (ImageJ) on your system.
- The names of the data directories are adjustable. If you change them, remember to update the corresponding names in the TOML file.
Note: Key-value pairs that appear in the TOML file but are not mentioned here do not affect your experience with this project.
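As an illustration of the key/value mapping described in this section, the sketch below parses a shortened TOML snippet into a dictionary and joins each directory name onto root. How the project itself combines these values is an assumption here: the helper `resolve_dirs` and both absolute paths are hypothetical, and the example falls back to the `toml` package on Python 3.10, where the standard-library `tomllib` is not available.

```python
from pathlib import Path

try:
    import tomllib as toml_reader  # standard library on Python 3.11+
except ModuleNotFoundError:
    import toml as toml_reader     # the `toml` package installed during setup

# Shortened example; both absolute paths are hypothetical placeholders.
EXAMPLE = '''
root = "/data/AISEC.data"
data_processed = "{Data}_Processed"
result_ml = "{Results}_ML"
fiji_local = "/opt/Fiji.app"
'''


def resolve_dirs(config: dict) -> dict:
    """Join each directory name onto root; root and fiji_local pass through."""
    root = Path(config["root"])
    resolved = {}
    for key, value in config.items():
        if key in ("root", "fiji_local"):
            resolved[key] = Path(value)   # already absolute paths
        else:
            resolved[key] = root / value  # directory name relative to root
    return resolved


config = toml_reader.loads(EXAMPLE)
paths = resolve_dirs(config)
print(paths["data_processed"].as_posix())
```

Under this assumed scheme, renaming a data directory only requires changing the corresponding value in the TOML file, while the keys used by the code stay the same.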