The idea
Currently, the p-value thresholds for SNP selection and the locus breaker are configured per study, directly in the summary stat input table (`p_thresh1` and `p_thresh2` columns). As a result, the same thresholds are applied to every phenotype in a given study.
This is not ideal for molecular traits (such as eQTL or pQTL results), where one may prefer to set a different threshold for each gene / protein.
A possible implementation
- We allow an additional optional column in the summary stat input table, e.g. `p_thresh_table`. This must point to a TSV table with columns: `study_id`, `pheno_id` (aka `gene_id`), `p_thresh1`, `p_thresh2`
- We update the locus-breaker R script to accept an extra argument, `--p_thresh_table`.
- When we load the summary stats, we add 2 columns representing the thresholds. These are populated with the fixed values from `--p_thres1` and `--p_thres2` when `--p_thresh_table` is not set; otherwise we merge by `(study_id, pheno_id)` to populate them from the p threshold table
- SNPs are filtered based on the two threshold columns, which can then be dropped since they are no longer needed
We have to point out in the docs that when the p-value threshold table is provided, this will take precedence over the fixed values.
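To make the proposed flow concrete, here is a minimal sketch of the logic in plain Python. It is not the locus-breaker R script, only an illustration of the steps above. The function names, the example rows, the `snp` and `p` field names, the strict `p < threshold` comparison, and the fallback to the fixed values for `(study_id, pheno_id)` pairs missing from the table are all assumptions that are not settled in this issue.

```python
import csv
import io

# Hypothetical contents of a per-phenotype threshold table (TSV),
# using the columns proposed above.
P_THRESH_TSV = (
    "study_id\tpheno_id\tp_thresh1\tp_thresh2\n"
    "study_A\tGENE1\t1e-6\t1e-4\n"
    "study_A\tGENE2\t5e-8\t1e-5\n"
)


def load_thresholds(handle):
    """Read the threshold table into {(study_id, pheno_id): (p1, p2)}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        (row["study_id"], row["pheno_id"]): (
            float(row["p_thresh1"]),
            float(row["p_thresh2"]),
        )
        for row in reader
    }


def add_thresholds(records, fixed_p1, fixed_p2, table=None):
    """Attach p_thresh1 / p_thresh2 to every summary-stat record.

    Without a table, the fixed values are used for all records.
    With a table, its values take precedence. Assumption: pairs that
    are missing from the table fall back to the fixed values.
    """
    out = []
    for rec in records:
        if table is None:
            p1, p2 = fixed_p1, fixed_p2
        else:
            key = (rec["study_id"], rec["pheno_id"])
            p1, p2 = table.get(key, (fixed_p1, fixed_p2))
        out.append({**rec, "p_thresh1": p1, "p_thresh2": p2})
    return out


def filter_snps(records, column="p_thresh1"):
    """Keep records whose p-value is below their own threshold,
    then drop the two helper columns."""
    kept = []
    for rec in records:
        if rec["p"] < rec[column]:
            kept.append(
                {k: v for k, v in rec.items()
                 if k not in ("p_thresh1", "p_thresh2")}
            )
    return kept


# Made-up summary-stat rows for illustration only.
records = [
    {"study_id": "study_A", "pheno_id": "GENE1", "snp": "rs1", "p": 5e-7},
    {"study_id": "study_A", "pheno_id": "GENE2", "snp": "rs2", "p": 5e-7},
    {"study_id": "study_A", "pheno_id": "GENE3", "snp": "rs3", "p": 1e-9},
]

table = load_thresholds(io.StringIO(P_THRESH_TSV))
with_table = filter_snps(add_thresholds(records, 5e-8, 1e-5, table))
fixed_only = filter_snps(add_thresholds(records, 5e-8, 1e-5))
```

In this sketch the same p-value (5e-7) passes for GENE1 but not for GENE2, because each gene carries its own threshold. What should happen for phenotypes absent from the table (fall back to the fixed values, as assumed here, or raise an error) is an open design question worth deciding before implementing.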