# Deepweed

Automated duckweed segmentation, tracking, and growth analysis pipeline using deep learning. This tool processes time-lapse microscopy data from petri dish and microfluidics experiments to segment individual duckweed fronds, track them across frames, infer lineage (budding events), and generate publication-ready figures.

Developed at the Oliveira Lab.
## Contents

- Overview
- Repository Structure
- Requirements
- Installation
- Data Setup
- Usage
- Pipeline Overview
- Outputs
- Troubleshooting
## Overview

Deepweed provides an end-to-end computer vision pipeline:
- Segmentation - A U-Net deep learning model identifies individual duckweed fronds in each frame.
- Tracking - Bayesian tracking (btrack) assigns persistent IDs across frames.
- Track Stitching - Fragmented tracks from the same plant are merged.
- Lineage Inference - Parent-child (budding) relationships are detected.
- Cluster Analysis - Fronds are grouped into spatial clusters (e.g., individual petri dishes or microfluidic wells).
- Visualization - Publication-ready figures: population dynamics, growth curves, lineage timelines, trajectory plots, and animated GIFs.
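The per-frame flow of the first stages can be sketched as follows. This is an illustrative outline only; `segment` and `detections_from_mask` are placeholder stand-ins, not the actual Deepweed API:

```python
import numpy as np

def segment(frame, threshold=0.2):
    # Stand-in for U-Net inference: treat the frame as a probability
    # map and threshold it into a binary mask.
    return (frame > threshold).astype(np.uint8)

def detections_from_mask(mask):
    # Stand-in for connected-component instance extraction: here we
    # just report all foreground pixels as one "detection" record.
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return []
    return [{"centroid": (xs.mean(), ys.mean()), "area": int(len(xs))}]

# Toy frames; in the real pipeline these come from images or video
frames = [np.random.rand(64, 64) for _ in range(3)]

all_detections = []
for t, frame in enumerate(frames):
    mask = segment(frame)
    for det in detections_from_mask(mask):
        det["frame"] = t
        all_detections.append(det)

print(len(all_detections))  # one detection record per toy frame: 3
```

The tracker then links such per-frame detection records into persistent tracks.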
Two dataset modes are supported:
| Feature | Petri Dish | Microfluidics |
|---|---|---|
| Input | JPEG/PNG image sequence | AVI video |
| Model | 3-class boundary U-Net (bg/body/boundary) | 1-class binary U-Net |
| Clustering | KMeans (6 known dishes) | Distance-based hierarchical |
| Time resolution | 5 min/frame | 20 min/frame |
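The two clustering strategies differ in how the number of groups is chosen: fixed (petri dishes) versus derived from a distance cutoff (microfluidics). A toy sketch on synthetic frond centroids, using scipy for both for self-containment (the pipeline itself uses scikit-learn's KMeans for petri dishes):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
# Toy centroids: two tight groups of fronds, 100 px apart
centroids = np.vstack([
    rng.normal(loc=(100, 100), scale=5, size=(10, 2)),
    rng.normal(loc=(200, 100), scale=5, size=(10, 2)),
])

# Petri-dish style: fixed cluster count (analogue of N_CLUSTERS)
_, kmeans_labels = kmeans2(centroids, 2, seed=0)

# Microfluidics style: hierarchical clustering cut at a distance
# threshold (analogue of CLUSTER_DISTANCE)
Z = linkage(centroids, method="single")
hier_labels = fcluster(Z, t=40, criterion="distance")

print(len(set(hier_labels)))  # 2 groups recovered from the cutoff
```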
## Repository Structure

```
Deepweed/
├── README.md
├── paper_analysis/
│   ├── Duckweed_Paper_Analysis.ipynb        # Main analysis notebook (run this)
│   ├── unet_model_class.py                  # U-Net model architecture (PyTorch)
│   ├── microfluidics_segmentation_fixed.py  # Standalone segmentation module
│   ├── cell_config.json                     # btrack Kalman filter & hypothesis config
│   ├── _apply_pipeline.py                   # Dev utility (notebook patch script)
│   └── data_model/
│       └── model/
│           ├── best_instance_unet.pt                # Petri dish model weights (Git LFS)
│           └── best_unet_microfluidics_boundary.pt  # Microfluidics model weights (Git LFS)
```
## Requirements

- Python 3.10 – 3.12 (tested on 3.12.7)
- Git LFS (for downloading model weights)
- ~2 GB disk space for model weights
- NVIDIA GPU recommended but not required (CPU inference is supported)
| Package | Purpose |
|---|---|
| `torch`, `torchvision` | Deep learning framework and U-Net inference |
| `opencv-python` (`cv2`) | Image/video I/O and preprocessing |
| `numpy` | Array operations |
| `pandas` | Tabular data handling |
| `scikit-image` (`skimage`) | Connected component labeling, region properties |
| `scikit-learn` | KMeans clustering |
| `matplotlib` | Plotting and figure generation |
| `btrack` | Bayesian multi-object tracking (>= 0.7) |
| `albumentations` | Image augmentation / preprocessing transforms |
| `scipy` | Hierarchical clustering, signal filtering |
| `networkx` | Lineage graph construction |
| `imageio` | GIF generation |
| `tqdm` | Progress bars |
| `Pillow` (`PIL`) | Image utilities |
| `ipykernel` | Jupyter notebook kernel |
## Installation

Install Git LFS (required to download model weights):

```bash
# macOS
brew install git-lfs

# Ubuntu/Debian
sudo apt-get install git-lfs

# Windows: Git for Windows includes LFS,
# or download from https://git-lfs.com
```

Then initialize Git LFS:

```bash
git lfs install
```
### Option 1: conda

```bash
# 1. Clone the repository (includes model weights via Git LFS)
git clone https://github.com/SamOliveiraLab/Deepweed.git
cd Deepweed

# 2. Create a conda environment
conda create -n deepweed python=3.12 -y
conda activate deepweed

# 3. Install PyTorch (CPU version — see GPU Support below for CUDA)
conda install pytorch torchvision cpuonly -c pytorch -y

# 4. Install remaining dependencies
pip install opencv-python numpy pandas scikit-image scikit-learn \
    matplotlib btrack albumentations scipy networkx imageio \
    tqdm Pillow ipykernel

# 5. Register the Jupyter kernel
python -m ipykernel install --user --name deepweed --display-name "Deepweed"

# 6. Verify model weights were downloaded (real files, not LFS pointers)
ls -lh paper_analysis/data_model/model/
# Expected: two .pt files, each several MB to hundreds of MB
```

### Option 2: venv + pip

```bash
# 1. Clone the repository
git clone https://github.com/SamOliveiraLab/Deepweed.git
cd Deepweed

# 2. Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate

# 3. Install PyTorch (CPU version — see GPU Support below for CUDA)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

# 4. Install remaining dependencies
pip install opencv-python numpy pandas scikit-image scikit-learn \
    matplotlib btrack albumentations scipy networkx imageio \
    tqdm Pillow ipykernel

# 5. Register the Jupyter kernel
python -m ipykernel install --user --name deepweed --display-name "Deepweed"
```

### GPU Support

If you have an NVIDIA GPU with CUDA, replace the PyTorch installation step:

```bash
# For CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

# For CUDA 12.1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# For CUDA 12.4
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
```

Or with conda:

```bash
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia -y
```

Check the PyTorch installation page for the latest CUDA options.
## Data Setup

The imaging data used in this study is publicly available on Zenodo:
Download the dataset and extract it to a location of your choice (referred to as BASE_PATH in the notebook configuration).
The pipeline expects your imaging data organized as follows:
```
<BASE_PATH>/
├── data_model/
│   ├── data/
│   │   ├── petri_dish/              # JPEG/PNG image sequence (petri dish mode)
│   │   │   ├── frame_0000.jpeg
│   │   │   ├── frame_0001.jpeg
│   │   │   └── ...
│   │   └── microfluidics/           # AVI video file (microfluidics mode)
│   │       └── duckweed_25_0504_multiple.avi
│   └── model/                       # Pretrained weights (included in repo via Git LFS)
│       ├── best_instance_unet.pt
│       └── best_unet_microfluidics_boundary.pt
├── cell_config.json                 # btrack config (included in repo)
└── paper_analysis_output/           # Created automatically
    ├── petri_dish/
    └── microfluidics/
```
Note: The model weights are included in this repository via Git LFS. The imaging data (images and video) must be downloaded separately from Zenodo using the link above.
## Usage

```bash
cd Deepweed

# Activate your environment
conda activate deepweed   # or: source .venv/bin/activate

# Launch Jupyter
jupyter notebook paper_analysis/Duckweed_Paper_Analysis.ipynb
```

All parameters are set in Cell 4 (Section 1.2) of the notebook. Edit these before running:
```python
# Choose your dataset type
DATASET_TYPE = "petri_dish"   # or "microfluidics"

# Set the base path to your data
BASE_PATH = "/path/to/your/data"

# Optionally limit frames for testing
FRAME_LIMIT = 50   # Set to None for all frames
```

Key parameters by dataset:
| Parameter | Petri Dish | Microfluidics | Description |
|---|---|---|---|
| `THRESHOLD` | 0.2 | 0.2 | Segmentation confidence threshold |
| `MIN_AREA` | 30 | 3 | Minimum frond area (pixels) |
| `CLUSTER_DISTANCE` | 150 | 40 | Spatial clustering distance (pixels) |
| `N_CLUSTERS` | 6 | None (auto) | Number of clusters |
| `MINUTES_PER_FRAME` | 5 | 20 | Time between frames (minutes) |
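Since timestamps are derived from frame indices, converting frame numbers to wall-clock time is a simple scaling by `MINUTES_PER_FRAME` (a sketch, using the petri dish value):

```python
import numpy as np

MINUTES_PER_FRAME = 5  # petri dish; use 20 for microfluidics

frames = np.arange(0, 4)
minutes = frames * MINUTES_PER_FRAME
hours = minutes / 60.0

print(list(minutes))  # [0, 5, 10, 15]
```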
Run the notebook cells in order. The pipeline follows this sequence:
- Part 1 - Setup: imports, configuration, model loading
- Part 2 - Segmentation demo on a single image (sanity check)
- Part 3 - Full pipeline: segmentation + tracking + lineage inference on all frames
- Part 4 - Cluster analysis: group fronds, select regions of interest
- Part 5 - Generate visualizations (population plots, growth curves, timelines)
- Part 6 - Lineage and generation analysis
- Part 7 - Statistical analysis
- Part 8 - Cross-dataset comparison
- Part 9 - Summary of outputs
Checkpointing: After Part 3, you can save a checkpoint (Cell 24) and reload it later (Cell 8) to skip reprocessing.
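Conceptually, the checkpoint is just serialized intermediate state from Part 3. A minimal sketch of the idea (file name and state structure here are illustrative, not the notebook's actual format):

```python
import pickle
from pathlib import Path

def save_checkpoint(state, path):
    # Serialize tracking results so Part 3 need not be rerun
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)

# Illustrative state: tracks plus progress marker
state = {"tracks": [{"id": 1, "areas": [30, 34, 39]}], "frames_done": 50}

ckpt = Path("checkpoint_demo.pkl")
save_checkpoint(state, ckpt)
restored = load_checkpoint(ckpt)
ckpt.unlink()  # clean up the demo file

print(restored == state)  # True
```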
## Pipeline Overview

```
Input (images/video)
         │
         ▼
┌─────────────────┐
│   U-Net Model   │  Semantic segmentation
│   (PyTorch)     │  → binary or 3-class mask
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Connected     │  Instance segmentation
│   Components    │  → individual frond masks
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     btrack      │  Bayesian tracking with
│ (Kalman + HMM)  │  Kalman filter
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Track Stitching │  Merge fragmented tracks
│   + Lineage     │  Detect budding events
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Cluster      │  Group by spatial location
│    Analysis     │  (KMeans / hierarchical)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Visualization  │  Figures, plots, GIFs,
│  & Statistics   │  CSV exports
└─────────────────┘
```
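The connected-components step (semantic mask to individual frond masks) can be sketched as: threshold the model's probability map at `THRESHOLD`, label connected regions, and drop components smaller than `MIN_AREA`. This sketch uses `scipy.ndimage` for self-containment; the pipeline itself uses scikit-image's labeling per the requirements table:

```python
import numpy as np
from scipy import ndimage

THRESHOLD = 0.2
MIN_AREA = 3

# Toy U-Net output: a probability map with two bright blobs
prob = np.zeros((20, 20))
prob[2:6, 2:6] = 0.9      # 16-pixel blob (kept)
prob[10:11, 10:12] = 0.9  # 2-pixel blob (dropped: area < MIN_AREA)

mask = prob > THRESHOLD
labels, n = ndimage.label(mask)  # label connected foreground regions

# Filter out components smaller than MIN_AREA
areas = np.bincount(labels.ravel())
keep = np.flatnonzero(areas >= MIN_AREA)
keep = keep[keep != 0]  # ignore background (label 0)

print(n, len(keep))  # 2 components found, 1 survives the area filter
```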
## Outputs

The pipeline generates the following in `OUTPUT_DIR`:
| Output | Description |
|---|---|
| `unified_tracking_data.csv` | All detections with track IDs, areas, centroids, and time |
| `lineage_data.csv` | Parent-child relationships from budding events |
| `cluster_*_population_smoothed.csv` | Filtered population counts per cluster |
| `segmentation_overlay.mp4` | Video with colored instance masks and bounding boxes |
| `cluster_timeline_*.gif` | Animated GIF of cluster evolution |
| `population_plot.png` | Smoothed population count over time |
| `area_over_time.png` | Individual growth trajectories |
| `trajectories.png` | Movement paths of tracked fronds |
| `doubling_time.png` | Doubling time analysis |
| `generation_distribution.png` | Size distribution by generation |
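As a sketch of what the doubling-time analysis computes: for exponential growth N(t) = N0 · 2^(t/Td), the doubling time Td follows from a log-linear fit of population counts against time (toy data below, not results from the paper):

```python
import numpy as np

MINUTES_PER_FRAME = 5

# Toy smoothed population counts doubling every 12 frames (= 60 min)
frames = np.arange(0, 48)
counts = 10 * 2 ** (frames / 12)

t_minutes = frames * MINUTES_PER_FRAME
# Fit ln(N) = slope * t + intercept, then Td = ln(2) / slope
slope, intercept = np.polyfit(t_minutes, np.log(counts), 1)
doubling_time_min = np.log(2) / slope

print(round(doubling_time_min))  # 60
```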
## Troubleshooting

### Model weights are Git LFS pointers

If `.pt` files are tiny (~130 bytes) and contain text like `version https://git-lfs.github.com/spec/v1`, Git LFS did not download them:

```bash
git lfs pull
```

### Import errors in the notebook

The notebook expects to be run from the `paper_analysis/` directory. Either:

- Open the notebook directly from `paper_analysis/`, or
- Add the path in the notebook before imports:

```python
import sys
sys.path.insert(0, '/path/to/Deepweed/paper_analysis')
```
### btrack errors

Make sure you installed btrack >= 0.7:

```bash
pip install btrack
```

### GPU not detected

Verify your PyTorch sees the GPU:

```python
import torch
print(torch.cuda.is_available())      # Should print True
print(torch.cuda.get_device_name(0))
```

If `False`, reinstall PyTorch with the correct CUDA version (see GPU Support).
The pipeline works on CPU — it will just be slower.
### Video reading errors (microfluidics)

Ensure your OpenCV build supports the video codec:

```bash
pip install opencv-python-headless   # or: pip install opencv-python
```

For AVI files with specific codecs, you may need system-level codec libraries (e.g., FFmpeg):

```bash
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg
```

### Out of memory or slow processing

- Set `FRAME_LIMIT` to a smaller number in the configuration cell to process fewer frames.
- Use CPU instead of GPU if GPU memory is limited (the notebook auto-detects and falls back to CPU).
## License

See the LICENSE file for details.

## Citation

If you use this tool in your research, please cite the associated paper. (Citation details to be added upon publication.)