CIGAR strings with multiple operations not parsed properly #10

@knowah

I have a BAM file with reads that contain an insertion at a specific position. I noticed that the basecall counts in the --output file produced by rrbsSnp (and subsequently, the genotypes in the VCF file created by BS-Snper.pl) were inaccurate around this insertion.

The CIGAR string for these reads has three operations (e.g., 143M1I7M), but only the last operation appears to be processed, regardless of how many operations the read's CIGAR string contains. I confirmed this by checking the values of record->cigar and record->len at the end of sam_funcs.c:parseBuffer(). In the example above, they are 7M and 7, respectively, so only 7 bases are included in the MapRecord for this read (instead of 150), and the basecalls are recorded at the wrong reference positions.
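
For reference, here is a minimal sketch of the behaviour I would expect: every `<length><op>` pair in the CIGAR string is visited and the lengths are accumulated, rather than keeping only the last pair. This is not the code in sam_funcs.c. `CigarSummary` and `summarize_cigar` are hypothetical names made up for illustration, and the operation semantics follow the SAM specification (M, =, X consume read and reference; I, S consume read only; D, N consume reference only).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical summary type for illustration; not a struct from BS-Snper. */
typedef struct {
    int query_len; /* read bases consumed (M, I, S, =, X) */
    int ref_len;   /* reference bases consumed (M, D, N, =, X) */
    int n_ops;     /* number of CIGAR operations seen */
} CigarSummary;

/* Walk every <length><op> pair in a CIGAR string and accumulate totals.
 * Returns 0 on success, -1 if the string is empty or malformed. */
int summarize_cigar(const char *cigar, CigarSummary *out)
{
    assert(cigar != NULL && out != NULL);
    out->query_len = out->ref_len = out->n_ops = 0;

    const char *p = cigar;
    while (*p != '\0') {
        /* Each operation must start with a decimal length. */
        if (!isdigit((unsigned char)*p))
            return -1;
        int len = 0;
        while (isdigit((unsigned char)*p)) {
            len = len * 10 + (*p - '0');
            p++;
        }
        /* The character after the digits is the operation code. */
        switch (*p) {
        case 'M': case '=': case 'X':
            out->query_len += len;
            out->ref_len += len;
            break;
        case 'I': case 'S':
            out->query_len += len;
            break;
        case 'D': case 'N':
            out->ref_len += len;
            break;
        case 'H': case 'P':
            break; /* consume neither read nor reference */
        default:
            return -1; /* unknown op, or digits with no op after them */
        }
        out->n_ops++;
        p++;
    }
    return out->n_ops > 0 ? 0 : -1;
}
```

For 143M1I7M this walk sees three operations and counts 151 read bases over 150 reference positions, whereas handling only the trailing 7M gives 7 for both. The inserted base consumes a read position but no reference position, which is why every basecall after the insertion shifts if the earlier operations are dropped.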
