CREST-GV - Cell types Ranking and Enrichment Score for selecTive Genetic Variants

CREST-GV is a method for querying our data collection of ~500 cell types (we keep adding data to the collection) to determine the enrichment score of a set of genetic variants. CREST-GV relies on the peak properties of each cell type in the collection, obtained with the LanceOtron peak caller. CREST-GV can also query your own (in-house) data, provided it is formatted correctly (see the In-house data format section); in this case, you can use the peak caller of your choice. The only mandatory input is the path to a genetic variant file, formatted as shown in the Genetic variant file format section.

Getting started

CREST-GV runs inside the crestgv conda environment. Please follow the installation instructions below.

Installation instructions for conda environment

This section assumes that a distribution of Conda (e.g., Anaconda, Miniconda, ...) or Mamba is already installed.

Clone the repository

git clone git@github.com:Genome-Function-Initiative-Oxford/CREST-GV.git
cd CREST-GV

Create Anaconda environment

Activate the conda 'base' environment (if not active):

conda activate base

There are two ways to create the crestgv conda environment:

Using mamba (if Mamba is installed), following the on-screen instructions:

mamba env create --file=envs/crestgv.yml

Using conda, following the on-screen instructions:

conda env create --file=envs/crestgv.yml

Activate the environment

Now that the crestgv environment has been created, activate it:

conda activate crestgv

You can now run CREST-GV from within this environment. Enjoy!

Environment installation note

CREST-GV has been successfully tested on the following operating systems: Ubuntu, CentOS, macOS (Intel CPU), and Windows. Unfortunately, it cannot currently be installed on macOS with Apple silicon (M-series) CPUs. If you hit an error during installation, please open an issue so that we can provide a general solution for all users.

Reproducibility 🔁

If required for publication, package versions within the environment can be exported as follows:

conda activate crestgv
conda env export > crestgv_environment_versions.yml
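As a complement to the conda export above, the Python package versions can also be recorded from inside a script or notebook. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, not part of CREST-GV, and it only sees Python packages (non-Python conda packages are listed only by `conda env export`).

```python
from importlib import metadata


def installed_versions():
    """Return {package name: version} for Python packages visible to this interpreter."""
    versions = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip broken installs that have no recorded name
            versions[name] = str(dist.version)
    # Sort case-insensitively so the listing is stable between runs.
    return dict(sorted(versions.items(), key=lambda item: item[0].lower()))


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}")
```

Run it with the crestgv environment active so that the listing reflects the packages CREST-GV actually used.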

How to use CREST-GV

From a code editor (e.g., VS Code) or a web-based interactive computing environment (e.g., Jupyter Notebook):

import pandas as pd
import sys
sys.path.append('crestgv/')
from crestgv import crestgv

genetic = "<Genetic path>"  # path to the tab-separated genetic variant file
cgv     = crestgv(genetic=genetic, output="<Name output directory>", collection_name="<Collection name>")
df_cgv  = cgv.calculate_enrichment_score()

You can find some helpful parameter information using:

cgv = crestgv(genetic=genetic)
help(cgv)

Genetic variant file format

Any genetic variant file provided to CREST-GV has to be tab (\t) separated and must contain at least 3 columns:

"CHR_ID" : chromosome in the following format: chrV
"CHR_POS" : chromosome position
"SNPS" : ID of the variant (e.g., rs#####, or chrV-pos-ref-alt)

For example:

CHR_ID  CHR_POS SNPS
chr10   801748  rs60692108
chr10   823912  rs74876360
chr10   840700  chr10-840700-A-C
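Because a file separated by spaces instead of tabs looks identical on screen, it can help to check a variant file before passing it to CREST-GV. The sketch below is not part of CREST-GV: `check_variant_file` is a hypothetical helper, and the checks on each value (a "chr" prefix, an integer position, a non-empty ID) are assumptions based on the column descriptions above.

```python
import csv
import re

REQUIRED_COLUMNS = ("CHR_ID", "CHR_POS", "SNPS")
# Assumption: a valid CHR_ID is "chr" followed by a chromosome label (chr1, chrX, ...).
CHR_PATTERN = re.compile(r"^chr\w+$")


def check_variant_file(path):
    """Return a list of problems found in a tab-separated variant file."""
    problems = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames or []
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            # Typical cause: the file is space-separated rather than tab-separated.
            return [f"missing columns: {', '.join(missing)}"]
        # Data rows start on line 2, after the header.
        for line_no, row in enumerate(reader, start=2):
            chrom = row["CHR_ID"] or ""
            pos = row["CHR_POS"] or ""
            snp = row["SNPS"] or ""
            if not CHR_PATTERN.match(chrom):
                problems.append(f"line {line_no}: CHR_ID {chrom!r} is not in chrV form")
            if not pos.isdigit():
                problems.append(f"line {line_no}: CHR_POS {pos!r} is not an integer")
            if not snp.strip():
                problems.append(f"line {line_no}: SNPS is empty")
    return problems
```

An empty list means the file passed these checks; otherwise each entry names the offending line, which is usually enough to spot a wrong separator or a chromosome written without the chr prefix.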

In-house data format

All the data must be stored in a folder named after your collection, following the tree-like layout below (see the example_files/in-house-data folder), where in-house-data is the collection name.

example_files/in-house-data/
├── bigwigs
│   ├── cell_type_1.bw
│   ├── cell_type_2.bw
│   └── cell_type_3.bw
├── in-house-data_info.csv
└── peaks
    ├── cell_type_1.bed
    ├── cell_type_2.bed
    └── cell_type_3.bed
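The layout above can be checked programmatically before running CREST-GV on an in-house collection. This is a minimal sketch, not CREST-GV code: `check_collection` is a hypothetical helper, and the rule that every cell type has one .bw and one .bed file sharing the same name is an assumption inferred from the example tree. The contents of the _info.csv file are not checked.

```python
from pathlib import Path


def _stems(folder, suffix):
    """Return the file names (without extension) in folder that end with suffix."""
    if not folder.is_dir():
        return set()
    return {path.stem for path in folder.glob(f"*{suffix}")}


def check_collection(root):
    """Return a list of problems found in an in-house collection folder."""
    root = Path(root)
    name = root.name  # the folder name doubles as the collection name
    problems = []
    for sub in ("bigwigs", "peaks"):
        if not (root / sub).is_dir():
            problems.append(f"missing folder: {sub}/")
    info = root / f"{name}_info.csv"
    if not info.is_file():
        problems.append(f"missing file: {info.name}")
    # Assumption: each cell type needs both a bigWig and a peak file with the same stem.
    bigwigs = _stems(root / "bigwigs", ".bw")
    peaks = _stems(root / "peaks", ".bed")
    for stem in sorted(bigwigs ^ peaks):
        problems.append(f"unpaired cell type: {stem}")
    return problems


if __name__ == "__main__":
    for problem in check_collection("example_files/in-house-data"):
        print(problem)
```

Running it against example_files/in-house-data from the repository root prints nothing if the folder matches the tree shown above.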

Pipeline updates 🚧

If CREST-GV is updated, you can update your local copy by entering the main folder and pulling the changes:

# Enter the main folder
cd CREST-GV

# Pull updates
git pull

Alternatively, remove the cloned repository and then re-clone the repository as described above.
Warning: use rm carefully!

rm -rf CREST-GV

⚠️ Warning for University of Oxford CCB users ⚠️

When using this repository, use the default terminal and do not load any modules on the server (if logged in).

Contact us

If you have any suggestions, spot any errors, or have any questions regarding the pipelines, please do not hesitate to contact us at any time.

📧 simone.riva@imm.ox.ac.uk
