Combinatorial Modeling of Perturbation Analysis with Structured Subgraphs
COMPASS is a prototype perturbation prediction model that adapts the GPS (General, Powerful, Scalable) graph transformer architecture to predict transcriptional changes following single-gene CRISPRa perturbations. The model constructs a heterogeneous gene regulatory network from OmniPath interaction databases and applies GPS-based encoding to predict delta expression profiles for the Norman et al. (2019) perturbation screen.
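As a rough illustration of the prediction target, the sketch below computes a delta expression profile. It assumes delta expression means the per-gene mean of perturbed cells minus the per-gene mean of control cells. That definition, the function name `delta_expression`, and the toy values are assumptions for illustration and are not taken from the COMPASS code.

```python
from statistics import fmean


def delta_expression(control_cells, perturbed_cells):
    """Return per-gene mean(perturbed) minus per-gene mean(control).

    Both arguments are lists of cells, each cell a list of expression
    values in the same gene order. This definition is an assumption.
    """
    if not control_cells or not perturbed_cells:
        raise ValueError("both groups need at least one cell")
    # zip(*cells) transposes cell-major rows into gene-major columns.
    control_mean = [fmean(gene) for gene in zip(*control_cells)]
    perturbed_mean = [fmean(gene) for gene in zip(*perturbed_cells)]
    return [p - c for p, c in zip(perturbed_mean, control_mean)]


# Toy data: 2 control cells and 2 perturbed cells over 3 genes (made-up values).
control = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
perturbed = [[4.0, 2.0, 0.0], [2.0, 2.0, 2.0]]
delta = delta_expression(control, perturbed)
print(delta)
```

A positive entry marks a gene whose mean expression rises under the perturbation, a negative entry one whose expression falls.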
This repository accompanies the seminar coding report submitted for the ML4LP course.
compass/
├── models/
│ ├── vanilla_gps/ # Vanilla GPS implementation with MLP prediction head
│ └── extended_gps/ # Extended architecture with GuidedMultiheadAttention and cascade decoder
└── processing/ # Data preprocessing
Dependencies are managed with pixi. To install the environment, run:
pixi install
All dependencies are specified in the pixi.toml file at the root of the repository.
Raw data is sourced from the Norman et al. (2019) CRISPRa perturbation screen. Interaction data is retrieved directly from OmniPathDB at graph construction time. Gene Ontology annotations are sourced from the GEARS publication. Processed data files are not tracked in this repository due to size constraints.
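To make the graph construction step more concrete, here is a minimal sketch of turning an interaction table into a typed edge index, the general shape that heterogeneous graph libraries expect. It does not query OmniPathDB. The function `build_typed_edge_index`, the `(source, target, relation)` tuple layout, the relation labels, and the gene names are all hypothetical and may differ from what the processing code actually does.

```python
from collections import defaultdict


def build_typed_edge_index(interactions):
    """Map gene symbols to integer node ids and group edges by relation type.

    `interactions` is an iterable of (source, target, relation) tuples;
    this layout is an assumption, not the OmniPath schema.
    """
    node_ids = {}
    edges = defaultdict(lambda: ([], []))
    for source, target, relation in interactions:
        # Assign the next free integer id the first time a gene appears.
        for gene in (source, target):
            node_ids.setdefault(gene, len(node_ids))
        src_list, dst_list = edges[relation]
        src_list.append(node_ids[source])
        dst_list.append(node_ids[target])
    return node_ids, dict(edges)


# Placeholder genes and relation labels, for illustration only.
interactions = [
    ("GENE_A", "GENE_B", "stimulation"),
    ("GENE_B", "GENE_C", "inhibition"),
    ("GENE_A", "GENE_C", "stimulation"),
]
node_ids, edge_index = build_typed_edge_index(interactions)
print(node_ids)
print(edge_index)
```

Keeping one source list and one target list per relation type lets each edge type be converted to its own tensor later without re-scanning the table.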
Training logs, evaluation metrics across all grid search configurations, and result visualizations will be added as soon as they are available. A server failure shortly before submission destroyed some training artifacts, which are currently being regenerated.
If you use any part of this work, please cite the GPS paper and GEARS, which this project builds directly on:
- Rampasek et al. (2022). Recipe for a General, Powerful, Scalable Graph Transformer. NeurIPS.
- Roohani et al. (2024). Predicting transcriptional outcomes of novel multigene perturbations with GEARS. Nature Biotechnology.