Assigning chromatin status to predefined genomic regions from epigenomic profiling data
ChromCall is an R package for region-based chromatin enrichment analysis of epigenomic profiling data, including ChIP-seq, CUT&RUN, CUT&Tag, and ATAC-seq. It provides a transparent, statistically principled framework to quantify enrichment at predefined genomic regions (e.g. promoters or enhancers), enabling region-matched comparisons across samples and experiments without relying on data-dependent peak boundaries.
- Region-centric analysis
  Quantifies chromatin enrichment directly within predefined genomic windows, enabling consistent, region-matched comparisons across samples and experiments.
- Transparent statistical framework
  Employs a Poisson-based background model incorporating:
  - experiment-specific genome-wide background estimation
  - region-specific, control-derived modulation factors
- Control-aware enrichment testing
  Explicitly integrates matched control experiments to account for both technical background and local biological variability.
- Multiple complementary metrics
  For each region and experiment, ChromCall reports:
  - FDR-adjusted p-values
  - enrichment score (log₂(observed / expected))
  - Poisson z-score
  - binary chromatin status (present / absent)
- Multi-experiment and multi-sample support
  Supports joint analysis of multiple chromatin marks and pairwise comparisons between samples within a unified framework.
- Optional expression integration
  Region-level gene expression values (e.g. TSS-associated expression) can be incorporated to enable integrated chromatin–transcription analyses.

ChromCall models read counts as Poisson-distributed events, an appropriate approximation for sparse and independent fragment occurrences across fixed genomic windows. After accounting for genome-wide background signal and local, control-derived modulation, this framework enables transparent and analytically tractable region-level inference.
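For each experiment, a genome-wide background rate ($\lambda_g$) is estimated from read counts in tiles spanning the genome. Zero-count tiles are retained by default to avoid upward bias in sparse datasets and to ensure that $\lambda_g$ reflects the whole genome rather than only the covered tiles.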
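
The effect of retaining zero-count tiles can be illustrated with a minimal sketch. It assumes $\lambda_g$ is the mean read count per tile, which is an assumption made for illustration and not a confirmed detail of ChromCall's estimator; `tile_counts` is hypothetical toy data.

```r
# Toy per-tile read counts for one experiment (hypothetical data).
tile_counts <- c(0, 0, 3, 1, 0, 5, 2, 0, 0, 4)

# Assumption: the genome-wide rate is the mean count per tile.
lambda_g_all <- mean(tile_counts)                       # zero-count tiles retained
lambda_g_nonzero <- mean(tile_counts[tile_counts > 0])  # zero-count tiles dropped

# Dropping zero-count tiles can only raise the estimate.
lambda_g_all
lambda_g_nonzero
```

In this toy example the estimate doubles when empty tiles are discarded, which is the upward bias that retaining them avoids.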
To account for region-specific biological variability, ChromCall derives a modulation factor from the matched control experiment. This factor is combined with the genome-wide rate $\lambda_g$ to give the expected signal $\lambda_t$ for region i in experiment j.
This formulation integrates global sequencing depth with local control variation into a unified and interpretable background model, while preventing deflation in regions with low control signal.
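
One possible form of this background model is sketched below. Both the floor at 1 (one reading of "preventing deflation in regions with low control signal") and the multiplicative combination with $\lambda_g$ are assumptions; all variable names and values are hypothetical.

```r
# Hypothetical inputs: control counts per region and genome-wide rates.
control_counts <- c(0, 2, 8, 4)
lambda_g_control <- 2       # assumed genome-wide rate of the control
lambda_g_experiment <- 1.5  # assumed genome-wide rate of the experiment

# Assumption: modulation is the control signal relative to its own
# background, floored at 1 so low-control regions are never scaled down.
modulation <- pmax(1, control_counts / lambda_g_control)

# Assumption: expected signal is the genome-wide rate times the modulation.
lambda_t <- lambda_g_experiment * modulation
lambda_t
```

Under these assumptions the expected signal never falls below the experiment's genome-wide rate, and it rises only where the control shows elevated local signal.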
ChromCall evaluates region-level enrichment using a one-sided Poisson test of the observed read count against the expected signal $\lambda_t$.
Multiple testing correction is applied across all regions using the Benjamini–Hochberg false discovery rate (FDR) procedure.
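
A minimal base-R sketch of an upper-tail Poisson test followed by Benjamini–Hochberg adjustment is given below. The counts, expected values, and the 5% cutoff for the binary status are hypothetical, and this is not ChromCall's internal code.

```r
# Hypothetical observed counts and expected signals for four regions.
observed <- c(1, 4, 15, 3)
lambda_t <- c(1.5, 1.5, 6, 3)

# Upper-tail probability P(X >= observed) under Poisson(lambda_t).
# ppois(q, lower.tail = FALSE) returns P(X > q), hence observed - 1.
p_value <- ppois(observed - 1, lambda = lambda_t, lower.tail = FALSE)

# Benjamini-Hochberg adjustment across all regions.
p_adj <- p.adjust(p_value, method = "BH")

# Binary status at an assumed 5% FDR cutoff.
status <- p_adj < 0.05
data.frame(observed, lambda_t, p_value, p_adj, status)
```

The shift by one matters because the Poisson distribution is discrete: without it, the test would exclude the observed count itself from the tail.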
In addition to significance testing, ChromCall reports complementary effect-size metrics:
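- Enrichment score: log₂(observed / expected)
- Poisson z-score: deviation of the observed count from the expected signal, standardized under the Poisson model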
Together, these metrics provide complementary measures of enrichment strength, effect size, and statistical confidence.
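
The two effect-size metrics can be sketched as follows. The log₂ ratio follows the definition given in this document. The z-score formula is the textbook Poisson form and is an assumption about how ChromCall computes it; the input values are hypothetical.

```r
# Hypothetical observed counts and expected signals for four regions.
observed <- c(1, 4, 15, 3)
lambda_t <- c(1.5, 1.5, 6, 3)

# Enrichment score: log2 ratio of observed to expected counts.
# A zero observed count gives -Inf here; whether ChromCall applies
# a pseudocount is not stated, so none is used in this sketch.
score <- log2(observed / lambda_t)

# Assumption: z-score as (observed - expected) / sqrt(expected),
# since a Poisson variable has variance equal to its mean.
z_pois <- (observed - lambda_t) / sqrt(lambda_t)

data.frame(observed, lambda_t, score, z_pois)
```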
ChromCall is implemented in R and builds upon the Bioconductor ecosystem, ensuring interoperability with standard genomic data structures and downstream analysis workflows:
- `GRanges` for representing genomic intervals
- `SummarizedExperiment` for storing structured assay outputs and metadata
- `GenomicAlignments` for importing aligned sequencing reads from BAM files
- `GenomeInfoDb` and `Seqinfo` for genome annotation and consistency checks

Each processed sample is returned as a SummarizedExperiment object containing:
- raw region-level read counts
- genome-wide and locally adjusted background estimates ($\lambda_g$, $\lambda_t$)
- p-values and FDR-adjusted p-values
- enrichment scores and Poisson z-scores
Pairwise sample comparisons generate region-level Δ enrichment and Δ z-score metrics, enabling direct comparative analysis of chromatin states across biological conditions.
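
As a rough illustration of the Δ metrics, the sketch below differences per-region scores from two samples. The direction of the difference (sample B minus sample A) is an assumption, and the data frames are hypothetical stand-ins for ChromCall results.

```r
# Hypothetical per-region metrics for two samples over the same regions.
sample_a <- data.frame(
  region = c("r1", "r2", "r3"),
  score = c(0.2, 1.8, -0.5),
  z_pois = c(0.3, 4.1, -0.9)
)
sample_b <- data.frame(
  region = c("r1", "r2", "r3"),
  score = c(1.2, 1.5, -0.5),
  z_pois = c(2.3, 3.6, -0.9)
)

# Regions must match one-to-one before differencing.
stopifnot(identical(sample_a$region, sample_b$region))

# Assumption: deltas are taken as sample B minus sample A.
comparison <- data.frame(
  region = sample_a$region,
  DeltaEnrichment = sample_b$score - sample_a$score,
  DeltaZscore = sample_b$z_pois - sample_a$z_pois
)
comparison
```

Because both samples are scored over the same predefined regions, the differences are directly comparable without any peak-matching step.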
ChromCall is available as a development version on GitHub and can be installed using remotes:

```r
# install.packages("remotes")
remotes::install_github("GliomaGenomics/ChromCall")
```

```r
library(ChromCall)

# Build a sample from BAM files, target regions, and optional annotations
sample <- build_chromcall_sample(
  sample_name = "sampleA",
  experiments = list(
    H3K27me3 = "h3k27me3.bam",
    H3K4me3 = "h3k4me3.bam",
    Control = "control.bam"
  ),
  control_name = "Control",
  genome_file = "genome.txt",
  region_file = "promoters.bed",
  window_size = 2000,
  blacklist_file = "blacklist.bed",
  expression_file = "expression_tss.bed"
)

# Test each region for enrichment
result <- test_region_counts(sample)

# Compare two tested samples; resultA and resultB are outputs of
# test_region_counts() for two different samples
comparison <- compare_samples(resultA, resultB, threshold = 0.05)

# Export per-experiment and comparison results
write_experiment_results(result, "H3K4me3", "results.tsv")
write_comparison_results(comparison, "comparison.tsv")
```

| Metric | Description |
|---|---|
| `counts` | Raw read count per region |
| `lambda_g` | Genome-wide background rate |
| `lambda_t` | Locally adjusted expected signal |
| `p_value`, `p_adj` | Poisson test p-values and FDR-adjusted values |
| `score` | log₂(observed / expected) enrichment |
| `z_pois` | Poisson z-score |
| `DeltaEnrichment`, `DeltaZscore` | Pairwise comparison metrics |

For questions, issues, or feature requests, please open a 👉 GitHub issue