A single-cell ATAC-seq prediction model to investigate open chromatin sites
REnformer is a deep-learning framework that predicts cell-type–specific chromatin accessibility directly from DNA sequence.
Built using transfer learning on top of the Enformer architecture, REnformer significantly improves the prediction of open chromatin regions across 151 human cell types, outperforming Enformer in multiple benchmarking tests.
This repository contains the code and resources accompanying the publication:
"REnformer, a single-cell ATAC-seq predicting model to investigate open chromatin sites".
- Transformer-based architecture adapted from Enformer
- Transfer learning using scATAC-seq data
- Predicts 151 genomic tracks of chromatin accessibility
- Benchmarking framework (NRMSE-based) for model comparisons
- Supports mutagenesis and regulatory variant effect prediction
- Demonstrates correct prediction of known SNP effects (e.g., α-thalassemia promoter-creating variant)
REnformer keeps the main Enformer architecture:
- 7 convolutional blocks
- 11 transformer blocks
- Cropping + pointwise convolution output layer
Only the final output layers are fine-tuned on scATAC-seq data, enabling efficient training and improved chromatin feature prediction.
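
The cropping and pointwise-convolution output stage described above can be illustrated with a minimal NumPy sketch. This is not the repository's implementation: the function names, the toy tensor sizes, and the softplus activation are assumptions chosen for illustration. It only shows that a pointwise (1x1) convolution is a per-position linear map from trunk channels to the 151 output tracks, and that this map holds the parameters adjusted during fine-tuning while the trunk stays fixed.

```python
import numpy as np

def crop_center(x, target_len):
    """Keep the central `target_len` positions of a (length, channels) array."""
    trim = (x.shape[0] - target_len) // 2
    return x[trim:trim + target_len]

def softplus(z):
    """Numerically stable log(1 + exp(z)); assumed here to keep outputs non-negative."""
    return np.logaddexp(0.0, z)

def output_head(features, weight, bias, target_len):
    """Hypothetical output stage: crop, then a pointwise (1x1) convolution.

    features: (length, channels) activations from the frozen trunk
    weight:   (channels, n_tracks) trainable projection
    bias:     (n_tracks,) trainable offset
    """
    cropped = crop_center(features, target_len)
    # A 1x1 convolution applies the same linear map at every position,
    # which is a plain matrix product over the channel axis.
    return softplus(cropped @ weight + bias)

# Toy sizes only; the real model uses far larger dimensions.
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 8))   # stand-in for frozen trunk output
weight = rng.normal(size=(8, 151))    # one column per accessibility track
bias = np.zeros(151)
tracks = output_head(features, weight, bias, target_len=12)
print(tracks.shape)
```

In this sketch only `weight` and `bias` would receive gradient updates, which is why fine-tuning the output stage is much cheaper than retraining the whole network.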
REnformer was evaluated using three normalised root-mean-square error (NRMSE) metrics:
- NRMSE₁ – standard deviation–scaled
- NRMSE₂ – min-max scaled
- NRMSE₃ – interquartile range scaled
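
The three scalings listed above can be sketched as follows. This is a minimal example, assuming each metric divides the RMSE by a statistic of the observed signal (population standard deviation, range, or interquartile range); the exact conventions used in the paper may differ, and the function names are illustrative.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted signal."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def nrmse(y_true, y_pred, scale="std"):
    """RMSE divided by a spread statistic of the observed values.

    scale="std"    -> standard deviation (NRMSE1)
    scale="minmax" -> max minus min      (NRMSE2)
    scale="iqr"    -> 75th minus 25th percentile (NRMSE3)
    """
    y_true = np.asarray(y_true, dtype=float)
    if scale == "std":
        denom = np.std(y_true)  # population std (ddof=0), an assumption
    elif scale == "minmax":
        denom = np.max(y_true) - np.min(y_true)
    elif scale == "iqr":
        q75, q25 = np.percentile(y_true, [75, 25])
        denom = q75 - q25
    else:
        raise ValueError(f"unknown scale: {scale!r}")
    if denom == 0:
        raise ValueError("observed signal has zero spread; NRMSE is undefined")
    return rmse(y_true, y_pred) / float(denom)

observed = [0, 1, 2, 3, 4]
predicted = [0, 1, 2, 3, 5]
for name in ("std", "minmax", "iqr"):
    print(name, round(nrmse(observed, predicted, scale=name), 4))
```

Normalising by a spread statistic makes errors comparable across tracks with different signal ranges; the interquartile-range variant is the least sensitive to a few extreme peaks.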
Across all test subsets:
- REnformer shows significantly lower prediction error than Enformer
- Kernel density estimate (KDE) profiles show improved distributions of the mean and minimum error
- Edge-case analysis confirms reliability across genomic regions
REnformer correctly identifies regulatory changes introduced by the well-characterised T→C α-thalassemia SNP (chr16:159710), detecting the gain-of-function promoter peak specifically in erythroid cells.
This demonstrates its utility for:
- Mutagenesis screening
- SNP interpretation
- GWAS fine-mapping
- Non-coding variant prioritisation
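
A variant-effect score of the kind used in these applications can be sketched as the difference between predictions for the alternate and reference sequences. The example below is a minimal illustration, not the repository's API: `predict` stands in for any function that maps a one-hot sequence to per-track predictions, and `toy_predict` is a hypothetical placeholder so that the code runs without the trained model.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases stay all-zero."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(seq), 4), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded

def variant_effect(predict, ref_seq, offset, ref_base, alt_base):
    """Per-track change in prediction caused by a single-base substitution.

    predict: callable taking a (length, 4) one-hot array and returning
             a 1-D array with one value per track (hypothetical interface)
    offset:  0-based position of the variant within `ref_seq`
    """
    if ref_seq[offset].upper() != ref_base.upper():
        raise ValueError("reference base does not match the sequence")
    alt_seq = ref_seq[:offset] + alt_base + ref_seq[offset + 1:]
    return predict(one_hot(alt_seq)) - predict(one_hot(ref_seq))

def toy_predict(x):
    """Placeholder model with two 'tracks': counts of C and of G."""
    return x.sum(axis=0)[[1, 2]]

delta = variant_effect(toy_predict, "AATGA", offset=2, ref_base="T", alt_base="C")
print(delta)  # the T-to-C change adds one C and leaves G unchanged
```

With a real model, a large positive difference in one cell type's track and little change elsewhere would indicate a cell-type-specific gain of accessibility, as described for the erythroid-specific peak above.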
Training used scATAC-seq data for 151 cell types drawn from:
- Turner et al.
- Zhang et al.
- Granja et al.
- PBMC datasets
- In-house erythroid and HUVEC experiments
If you use REnformer, please cite:
Riva et al., 2025
"REnformer: a single-cell ATAC-seq predicting model to investigate open chromatin sites."
IEEE CIBCB 2025.
This project is distributed under the Academic Non-Commercial License (Hybrid CC BY-NC + MIT).
Commercial licensing inquiries should be directed to:
Simone Riva
MRC Weatherall Institute of Molecular Medicine, University of Oxford
Email: simone.riva@imm.ox.ac.uk
Developed in collaboration with Nucleome Therapeutics.
For further information or inquiries, please contact: simone.riva@imm.ox.ac.uk