We ran polygenic risk score (PRS) calculations on the 1000 Genomes Project's Omni microarray genotype data, covering 2318 individuals, and report the ancestry and PRS results here. The calculations and the related data preprocessing were performed by the Michigan Imputation Server.
- 1000 Genomes Genotype Data
- To download, run
curl -O ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/hd_genotype_chip/ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz
- 1000 Genomes Phase 3 v5 Reference Panel for running imputation and PRS calculations
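Large FTP downloads sometimes arrive truncated, so it can help to test the archive before going further. The following is a minimal sketch, assuming `gzip` is installed; `check_gzip` is a hypothetical helper name, not part of any tool used in this repo.

```shell
# Hypothetical helper: succeed only if the file exists and its
# gzip stream decompresses without errors (bgzipped VCFs are gzip-compatible).
check_gzip() {
  [ -f "$1" ] && gzip -t "$1" 2>/dev/null
}
```

For example, `check_gzip ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz || echo "download looks incomplete"` prints a warning when the file is missing or corrupt.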
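After downloading the Omni genotype data hosted by the 1000 Genomes Project, split the whole-genome VCF file into one file per autosome (chromosomes 1 to 22). The -r option of bcftools view requires an indexed input, so index the file first:
bcftools index -f -t ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz
for chr in {1..22}; do
bcftools view -r "${chr}" -Oz -o "chr${chr}.vcf.gz" \
ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.vcf.gz
done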
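Before uploading, it may be worth confirming that the split produced all 22 files. The sketch below assumes the `chr<N>.vcf.gz` naming used above; `missing_chr_files` is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper: print every expected per-chromosome file
# that is missing or empty in the given directory (default: current directory).
missing_chr_files() {
  dir="${1:-.}"
  for chr in $(seq 1 22); do
    f="${dir}/chr${chr}.vcf.gz"
    # -s is true only when the file exists and has non-zero size
    [ -s "$f" ] || echo "$f"
  done
}
```

Running `missing_chr_files .` in the output directory prints nothing when every file is present and non-empty. It does not validate the VCF contents.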
Upload the prepared per-chromosome vcf.gz files to the Michigan Imputation Server and configure the job as follows:
- Select GRCh37/hg19 as the genome build, since the genotype data is prepared against that build.
- Use 1000G Phase 3 v5 as the reference panel for the job.
- Feel free to adjust the rsq filter and the trait category for polygenic scoring.
- For the benchmark results shown in this repo, the trait category is Cancer.