Genome-Function-Initiative-Oxford/REnformer

REnformer

A single-cell ATAC-seq prediction model to investigate open chromatin sites

REnformer is a deep-learning framework that predicts cell-type–specific chromatin accessibility directly from DNA sequence.
Built using transfer learning on top of the Enformer architecture, REnformer significantly improves the prediction of open chromatin regions across 151 human cell types, outperforming Enformer in multiple benchmarking tests.

This repository contains the code and resources accompanying the publication:
"REnformer, a single-cell ATAC-seq predicting model to investigate open chromatin sites".


🚀 Key Features

  • Transformer-based architecture adapted from Enformer
  • Transfer learning using scATAC-seq data
  • Predicts 151 genomic tracks of chromatin accessibility
  • Benchmarking framework (NRMSE-based) for model comparisons
  • Supports mutagenesis and regulatory variant effect prediction
  • Demonstrates correct prediction of known SNP effects (e.g., α-thalassemia promoter-creating variant)

🧠 Model Overview

REnformer keeps the main Enformer architecture:

  • 7 convolutional blocks
  • 11 transformer blocks
  • Cropping + pointwise convolution output layer

Only the final output layers are fine-tuned on scATAC-seq data, enabling efficient training and improved chromatin feature prediction.
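The shape bookkeeping implied by this layout can be sketched as follows. This is a rough illustration, not the repository's code: the input length (196,608 bp), the factor-of-two pooling per convolutional block, and the 320-bin crop on each side are taken from the published Enformer model and are assumed here to carry over unchanged. The class name `TrunkShape` and the frozen/trainable split in `STAGES` are hypothetical and reflect one reading of "only the final output layers are fine-tuned".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrunkShape:
    """Shape arithmetic for an Enformer-style trunk with a new output head."""

    input_bp: int = 196_608         # assumption: published Enformer input length
    conv_blocks: int = 7            # from the README
    transformer_blocks: int = 11    # from the README (does not change the shape)
    crop_bins_each_side: int = 320  # assumption: published Enformer cropping
    n_tracks: int = 151             # from the README: one track per cell type

    @property
    def bin_size(self) -> int:
        # Assumption: each convolutional block halves the sequence length.
        return 2 ** self.conv_blocks

    @property
    def bins_before_crop(self) -> int:
        return self.input_bp // self.bin_size

    @property
    def output_shape(self) -> Tuple[int, int]:
        # Cropping trims both ends; the pointwise convolution then maps each
        # remaining bin to n_tracks values.
        bins = self.bins_before_crop - 2 * self.crop_bins_each_side
        return (bins, self.n_tracks)


# Hypothetical stage list: True marks the part assumed to be fine-tuned.
STAGES = [
    ("convolutional blocks", False),
    ("transformer blocks", False),
    ("cropping", False),
    ("pointwise convolution output", True),
]

shape = TrunkShape()
print(shape.bin_size, shape.bins_before_crop, shape.output_shape)
print([name for name, trainable in STAGES if trainable])
```

Under these assumptions each prediction covers 896 bins of 128 bp, with 151 values per bin.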


📊 Benchmark Summary

REnformer was evaluated using three Normalised RMSE metrics:

  • NRMSE₁ – standard deviation–scaled
  • NRMSE₂ – min-max scaled
  • NRMSE₃ – interquartile range scaled
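
As a minimal sketch of how the three metrics differ, the snippet below divides the same RMSE by three different scale factors. It assumes the scale is computed from the observed signal, that the standard deviation is the population form, and that quartiles use linear ("inclusive") interpolation; the README does not state these details, and the function names are illustrative rather than taken from the repository.

```python
import math
import statistics
from typing import Dict, Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equally long signals."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_all(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return RMSE scaled by std, min-max range, and IQR of `observed`."""
    error = rmse(observed, predicted)
    std = statistics.pstdev(observed)               # assumption: population std
    value_range = max(observed) - min(observed)
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return {
        "nrmse1_std": error / std,
        "nrmse2_minmax": error / value_range,
        "nrmse3_iqr": error / (q3 - q1),
    }


# Toy signals, not real accessibility data.
observed = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [0.0, 1.0, 2.0, 3.0, 6.0]
print(nrmse_all(observed, predicted))
```

A constant observed signal makes all three scale factors zero, so a real implementation would need to guard against division by zero.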

Across all test subsets:

  • REnformer shows significantly lower prediction error than Enformer
  • Kernel density estimate (KDE) profiles show improved distributions of mean and minimum error
  • Edge-case analysis confirms reliability across genomic regions

🧬 Variant Effect Prediction

REnformer correctly identifies regulatory changes introduced by the well-characterised T→C α-thalassemia SNP (chr16:159710), detecting the gain-of-function promoter peak specifically in erythroid cells.

This demonstrates its utility for:

  • Mutagenesis screening
  • SNP interpretation
  • GWAS fine-mapping
  • Non-coding variant prioritisation
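
The general pattern behind this kind of analysis can be sketched as a reference-versus-alternate comparison: predict on the reference sequence, predict again with the SNP substituted, and take the per-track difference. The `Model` type, `apply_snp`, `variant_effect`, and `toy_model` below are hypothetical names for illustration; `toy_model` is a stand-in and not REnformer, and the repository's actual interface may differ.

```python
from typing import Callable, List

Track = List[float]             # one value per cell-type track
Model = Callable[[str], Track]  # hypothetical: DNA string -> per-track signal


def apply_snp(sequence: str, offset: int, ref: str, alt: str) -> str:
    """Return `sequence` with `ref` replaced by `alt` at 0-based `offset`."""
    if sequence[offset].upper() != ref.upper():
        raise ValueError(
            f"expected {ref!r} at offset {offset}, found {sequence[offset]!r}"
        )
    return sequence[:offset] + alt + sequence[offset + 1:]


def variant_effect(model: Model, sequence: str, offset: int, ref: str, alt: str) -> Track:
    """Per-track difference between alternate and reference predictions."""
    ref_pred = model(sequence)
    alt_pred = model(apply_snp(sequence, offset, ref, alt))
    return [a - r for a, r in zip(alt_pred, ref_pred)]


def toy_model(sequence: str) -> Track:
    # Stand-in model: track 0 counts "C" bases, track 1 is constant.
    return [float(sequence.upper().count("C")), 1.0]


# Toy T->C substitution on a short made-up sequence.
delta = variant_effect(toy_model, "AATTA", offset=3, ref="T", alt="C")
print(delta)
```

A positive difference confined to one track would correspond to a cell-type-specific gain of accessibility, as described for the erythroid case above.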

📂 Data Sources

Training was performed using 151 scATAC-seq cell types from:

  • Turner et al.
  • Zhang et al.
  • Granja et al.
  • PBMC datasets
  • In-house erythroid and HUVEC experiments

📜 Citation

If you use REnformer, please cite:

Riva et al., 2025
"REnformer: a single-cell ATAC-seq predicting model to investigate open chromatin sites."
IEEE CIBCB 2025.


🔐 License

This project is distributed under the Academic Non-Commercial License (Hybrid CC BY-NC + MIT).

Commercial licensing inquiries should be directed to:

Simone Riva
MRC Weatherall Institute of Molecular Medicine, University of Oxford
Email: simone.riva@imm.ox.ac.uk

Developed in collaboration with Nucleome Therapeutics.


📧 Contact

For further information or inquiries, please contact: simone.riva@imm.ox.ac.uk
