PP field logic in phase_rare VCF output #124

@creepyorange

Hello!
Thank you for developing and maintaining this incredibly useful tool.
I am using shapeit5 for phasing whole-genome sequencing data from a large cohort, with a particular focus on the phasing of rare variants. I am trying to understand the PP (phasing confidence) field in the output VCF.
As far as I understand, the PP field is often missing (.), and I would like to know the precise conditions under which a non-missing PP value is assigned.
My initial assumption was that a non-missing PP would require a heterozygous genotype (e.g., 0|1 or 1|0). However, some homozygous reference genotypes (0|0) have a PP value of 1, while others have a missing (.) PP field.
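
To make this pattern easier to quantify across a whole file, one option is to tally, per genotype class, how often PP is present. The following is only a minimal sketch using the Python standard library: the function name `tally_pp_by_genotype` and the toy record are hypothetical, and it assumes uncompressed VCF text lines with GT and PP listed in the FORMAT column. It only counts what is in the file and says nothing about how phase_rare decides to write PP.

```python
from collections import Counter


def tally_pp_by_genotype(vcf_lines):
    """Count (genotype class, PP present?) pairs over all samples.

    Hypothetical helper: expects an iterable of uncompressed VCF text lines.
    """
    counts = Counter()
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns on this record
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_idx = keys.index("GT")
        pp_idx = keys.index("PP") if "PP" in keys else None
        for sample in fields[9:]:
            values = sample.split(":")
            alleles = values[gt_idx].replace("/", "|").split("|")
            if "." in alleles:
                gt_class = "missing"
            elif len(set(alleles)) == 1:
                gt_class = "hom_ref" if alleles[0] == "0" else "hom_alt"
            else:
                gt_class = "het"
            # Trailing FORMAT values may be omitted; treat those as missing.
            if pp_idx is not None and pp_idx < len(values):
                pp = values[pp_idx]
            else:
                pp = "."
            counts[(gt_class, pp != ".")] += 1
    return counts


if __name__ == "__main__":
    # Toy record invented for illustration, not taken from real output.
    record = "chr1\t100\t.\tA\tG\t.\t.\tAC=1\tGT:PP\t0|0:1\t0|0:.\t0|1:0.98"
    for key, n in sorted(tally_pp_by_genotype([record]).items()):
        print(key, n)
```

For a real phase_rare output, the lines would need to come from a decompressed or text-converted copy of the file, since this sketch does not read BCF or bgzipped input.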

Could you please clarify the logic behind the assignment of the PP tag?

shapeit version: 5.1.1

Command used: phase_rare_static --input {input.to_phase} --scaffold {input.scaffold} --map {params.gmap}
--input-region {wildcards.interval} --scaffold-region {wildcards.interval} --output {output.rare_chunk} --thread {threads}
