SpA_microbiome_paper_code

This repository contains the analysis pipeline for gut microbiome data from the Spondyloarthritis (SpA) cohort. The workflow covers data preprocessing, quality control, taxonomic and functional profiling, statistical modeling, and visualization. The sections below list the code and data needed to reproduce each step of the analysis.

Software requirements

R version 4.1.2 (2021-11-01) -- "Bird Hippie"

/docs/sessionInfo.txt

Important! Set the working directory to your local copy of the repository before loading the functions:

working_dir <- "~/github_shared_code_and_publications/SpA_microbiome_paper_code"
path_func <- "~/github_shared_code_and_publications/SpA_microbiome_paper_code/functions"
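The two paths above can be used to load the helper functions in one step. The following is a minimal sketch, not code from the repository: `load_functions` is a hypothetical helper, and it assumes the functions directory holds plain `.R` files that can be sourced in any order.

```r
# Hypothetical helper (not part of the repository): source every .R file
# found in a directory and return the paths that were loaded.
load_functions <- function(path_func) {
  path_func <- path.expand(path_func)
  if (!dir.exists(path_func)) {
    stop("Functions directory not found: ", path_func)
  }
  func_files <- list.files(path_func, pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in func_files) {
    source(f)  # source() evaluates in the global environment by default
  }
  invisible(func_files)
}

working_dir <- "~/github_shared_code_and_publications/SpA_microbiome_paper_code"
path_func <- file.path(working_dir, "functions")

# Only switch directory and load when the clone actually lives at this location.
if (dir.exists(path.expand(working_dir))) {
  setwd(working_dir)
  load_functions(path_func)
}
```

If the repository is cloned elsewhere, only `working_dir` needs to change, since `path_func` is derived from it.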

Data

The count tables and minimal metadata (including Disease, Diagnosis, and Disease activity) are available at:

/data/1_infiles

Scripts

All scripts used in the analysis are available in the /scripts directory, grouped by analysis step:

2_alpha_diversity

/scripts/2_alpha_diversity/Colon/Diversity_test.R
/scripts/2_alpha_diversity/GMM/Diversity_test.R
/scripts/2_alpha_diversity/Ileum/Diversity_test.R
/scripts/2_alpha_diversity/mOTUs/Diversity_test.R

3_beta_diversity

/scripts/3_beta_diversity/Beta_diversity_colon_biopsies_genus/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_colon_biopsies_sv/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_ileum_biopsies_sv/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_metabolomics/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_gmm_Treatment_response/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_motus/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_motus_Treatment_response/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_gmm/Bray/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_gmm/Canberra/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_kegg/Bray/Beta_diversity.R
/scripts/3_beta_diversity/Beta_diversity_shotgun_kegg/Canberra/Beta_diversity.R

4_biomarkers

Biomarkers

/scripts/4_biomarkers/Biomarkers_response_gogut_prediction/boot632_biomarkers.R
/scripts/4_biomarkers/Biomarkers_response_gogut_stability/glmnet.R
/scripts/4_biomarkers/Cofound_glmnet_biomarkers/glmnet.R

Differential abundance

Differential abundance in the BEGiant dataset

/scripts/4_biomarkers/Biopsies/Biopsies_DA_pipeline.sh
/scripts/4_biomarkers/GMM/Biopsies_DA_pipeline.sh
/scripts/4_biomarkers/mOTUs/Biopsies_DA_pipeline.sh

Differential abundance in the gogut dataset

/scripts/4_biomarkers/Metabolome/Biopsies_DA_pipeline.sh
/scripts/4_biomarkers/GMM_gogut/1_DAT.sh
/scripts/4_biomarkers/mOTUs_gogut/1_DAT.sh

5_Bayesian_network

Bayesian network inference

/scripts/5_network/SpA_Disease/Bayesian_network_pipeline.sh

Bootstrapping and arc strength

Arc strength estimation and Bayesian network bootstrapping were run on a Sun Grid Engine (SGE) cluster, using the following qsub submission script:

/scripts/5_network/SpA_Disease/Cluster_scripts/run_bn_bootstrap_learning.sh
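For readers unfamiliar with bootstrapped arc strength, the idea can be illustrated on a small scale. This is only a sketch under assumptions: it uses the bnlearn package and its bundled `learning.test` dataset, and the hill-climbing algorithm, the 50 replicates, and the 0.85 threshold are illustrative choices. The repository's cluster script may use different software and settings.

```r
# Assumes the bnlearn package is installed; illustrative only, not the
# repository's implementation.
library(bnlearn)

set.seed(1)
data(learning.test)  # small discrete example dataset shipped with bnlearn

# Bootstrap structure learning: relearn a hill-climbing network on resampled
# data and record how often each arc appears (strength) and how often it
# points in the given direction (direction).
arc_strength <- boot.strength(learning.test, R = 50, algorithm = "hc")

# Arcs with high bootstrap support (illustrative cut-offs).
strong_arcs <- arc_strength[arc_strength$strength > 0.85 &
                            arc_strength$direction >= 0.5, ]
print(strong_arcs)

# Consensus network built from the arcs that pass the strength threshold.
consensus <- averaged.network(arc_strength, threshold = 0.85)
print(arcs(consensus))
```

On a cluster, the replicates are independent of each other, so they can be split across jobs and the per-arc frequencies combined afterwards.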

6_mice_experiments

Beta diversity

/scripts/6_mice_experiments/Beta_diversity/Mice_data_Beta_div.R

Biomarkers

/scripts/6_mice_experiments/Biomarkers/WT_diff/WT_diff.R
/scripts/6_mice_experiments/Biomarkers/Corr_W12_metadata/corr_Meta.R

ML predictions

The machine learning pipeline used for these analyses is publicly available and fully reproducible at: https://github.com/jorgevazcast/Liver_Disease_Microbiome_ML

Raw data

FASTQ files for this project are available through the European Genome-phenome Archive (EGA) under the following accession numbers:

Study: EGAS50000001435
Dataset: EGAD50000002070
Data Access Committee (DAC): EGAC00001003263
