Inconsistent output for big --amplicon values #15

@phlaster

I was experimenting with the --amplicon parameter while constructing degenerate primers for my aligned genes, and I found some inconsistencies in the output. For example, for a single alignment file I got the following outputs:

No --amplicon parameter:

$ degenprime --degenerate --input_file:NodA_galegae.afa --output_file:NodA.csv
Output details saved to NodA.csv
Program complete.

$ cat NodA.csv 
Pair #,Forward,Reverse,Amplicon,Temp. Diff
1,ACCTATGGCCCAACCGGGAA,ACGGCCAGTCGCTCATTGTG,465,0.291779
2,ACGAGGAGCTGGCATCATTC,CCCAACCGGCACTACAACAA,472,0.622589
3,GAAACTGAGTTGGCTTCCTC,TAAAACGTCTTCGGTTCGAG,476,0.372711
4,GAGCTGGCATCATTCTTTCG,TCCGCAAGGGTAGACTCTAC,403,0.813354
5,TGTCTTCCGGGGTGCATTGG,GACGGCCAGTCGCTCATTGT,552,0.498199

Big --amplicon parameter: (failure type #1, quite reasonable message)

$ degenprime --degenerate --input_file:NodA_galegae.afa --output_file:NodA.csv --amplicon:560
At least one of the forward or reverse primer lists was empty.

Even bigger --amplicon parameter: (failure type #2, wow: the whole sequence and consensus dump is printed before the error)

$ degenprime --degenerate --input_file:NodA_galegae.afa --output_file:NodA.csv --amplicon:580
                          %%%%%%%%%%%%%%%%%%%%%%%%                          
                          % Sequence Information %                          
                          %%%%%%%%%%%%%%%%%%%%%%%%                          

                             Sequence Count: 17                             

                Sequence(0001):[NodA|unitig_1_co] Size:[588]                
                Sequence(0002):[NodA|unitig_1_co] Size:[588]                
                Sequence(0003):[NodA|unitig_1_co] Size:[588]                
                Sequence(0004):[NodA|unitig_1_co] Size:[588]                
                Sequence(0005):[NodA|unitig_1_co] Size:[588]                
                Sequence(0006):[NodA|unitig_1_co] Size:[588]                
                Sequence(0007):[NodA|unitig_1_co] Size:[588]                
                Sequence(0008):[NodA|unitig_1_co] Size:[588]                
                Sequence(0009):[NodA|unitig_1_co] Size:[588]                
                Sequence(0010):[NodA|unitig_1_co] Size:[588]                
                Sequence(0011):[NodA|unitig_1_co] Size:[588]                
                Sequence(0012):[NodA|unitig_2_co] Size:[588]                
                Sequence(0013):[NodA|unitig_1_co] Size:[588]                
                Sequence(0014):[NodA|unitig_1_co] Size:[588]                
                Sequence(0015):[NodA|unitig_1_co] Size:[588]                
                Sequence(0016):[NodA|unitig_1_co] Size:[588]                
                Sequence(0017):[NodA|unitig_1_co] Size:[588]                

                            %%%%%%%%%%%%%%%%%%%%%                           
                            % Conserved Regions %                           
                            %%%%%%%%%%%%%%%%%%%%%                           

                           %%%%%%%%%%%%%%%%%%%%%%                           
                           % Consensus Sequence %                           
                           %%%%%%%%%%%%%%%%%%%%%%                           

                       Conserved regions capitalized.                       
 (00000-00059): ATGTCTTCCGGGGTGCATTGGAAATTACATTGGGAAACTGAGTTGGCTTCCTCCGACCAC
 (00060-00119): GAGGAGCTGGCATCATTCTTTCGAAATACCTATGGCCCAACCGGGAAGTTTAACGCCAAA
 (00120-00179): CCCTTCGAGGATGGTCGTAGCTGGGCCGGCGCACGGCCTGAGCTTCGCGCCATTGCCTAC
 (00180-00239): GATTCCAAGGGAATAGCCGGTCATCTAGGGTTGTTACGGCGTTTCATCAGAGTGGGTGAG
 (00240-00299): ACAGAAGTACTTGTGGCTGAGTTGGGGTTATATGGTGTTCGACCGGATTTAGAAAAATTG
 (00300-00359): GGCATCGCTCACTCCATTCGAGCCATGGCTCCGGTCGTGGACGACCTTGGCGTGCCTTTC
 (00360-00419): GCATTCGGAACTGTGCGATACGCGATGCGAAATCACATCGAGAGATTCTGCAGGGATGGC
 (00420-00479): GCGGCAAATATCGTGTCCGGCATTCGAGTAGAGTCTACCCTTGCGGATGTCTATCGTGAC
 (00480-00539): TGCCCGGCCACTCGAACCGAAGACGTTTTAGTTGTTGTAGTGCCGGTTGGGCGCACAATG
 (00540-00587): AGCGACTGGCCGTCGGGGTCCCTGATACagcgacgcgggccggaacta

There were insufficient primers found for this data.                        
Insufficient primers found for this data.

Now for another alignment file I got:
No --amplicon parameter:

$ degenprime --degenerate --input_file:ActR_galegae.afa --output_file:ActR.csv
Output details saved to ActR.csv
Program complete.

$ cat ActR.csv 
Pair #,Forward,Reverse,Amplicon,Temp. Diff
1,TGACCCTTCGCTGCTGATCG,GTCACGGCGGTDGCGATATT,269,0.708435
2,CTGACCCTTCGCTGCTGATC,GGCTTGGCGAGATAGTCGAG,309,0.446228
3,ACACACCCGGAACCGACCAA,ACGTTGCGCTCGCACATCTC,493,0.168365
4,CGCTGCTGATCGTCGATGAC,CGCMGACATCGGGTTTTCCG,382,0.349274
5,GCTGCTGATCGTCGATGACG,CGCTCGCACATCTCGTAGAC,431,0.605988

Big --amplicon parameter: (failure type #3, quite reasonable message again, but different from #1)

$ degenprime --degenerate --input_file:ActR_galegae.afa --output_file:ActR.csv --amplicon:540
No primer pairs were found for these specifications.
Output details saved to ActR.csv
Program complete.

Even bigger --amplicon parameter: (failure type #2, this again)

$ degenprime --degenerate --input_file:ActR_galegae.afa --output_file:ActR.csv --amplicon:550
                          %%%%%%%%%%%%%%%%%%%%%%%%                          
                          % Sequence Information %                          
                          %%%%%%%%%%%%%%%%%%%%%%%%                          

                             Sequence Count: 18                             

                Sequence(0001):[ActR|unitig_0_87] Size:[582]                
                Sequence(0002):[ActR|unitig_0_87] Size:[582]                
                Sequence(0003):[ActR|unitig_0_87] Size:[582]                
                Sequence(0004):[ActR|unitig_0_87] Size:[582]                
                Sequence(0005):[ActR|unitig_0_87] Size:[582]                
                Sequence(0006):[ActR|unitig_0_87] Size:[582]                
                Sequence(0007):[ActR|unitig_0_87] Size:[582]                
                Sequence(0008):[ActR|unitig_0_87] Size:[582]                
                Sequence(0009):[ActR|unitig_0_87] Size:[582]                
                Sequence(0010):[ActR|unitig_0_87] Size:[582]                
                Sequence(0011):[ActR|unitig_0_87] Size:[582]                
                Sequence(0012):[ActR|unitig_0_87] Size:[582]                
                Sequence(0013):[ActR|unitig_0_37] Size:[582]                
                Sequence(0014):[ActR|unitig_0_87] Size:[582]                
                Sequence(0015):[ActR|unitig_0_87] Size:[582]                
                Sequence(0016):[ActR|unitig_0_87] Size:[582]                
                Sequence(0017):[ActR|unitig_0_87] Size:[582]                
                Sequence(0018):[ActR|unitig_0_87] Size:[582]                

                            %%%%%%%%%%%%%%%%%%%%%                           
                            % Conserved Regions %                           
                            %%%%%%%%%%%%%%%%%%%%%                           

                           %%%%%%%%%%%%%%%%%%%%%%                           
                           % Consensus Sequence %                           
                           %%%%%%%%%%%%%%%%%%%%%%                           

                       Conserved regions capitalized.                       
 (00000-00059): ---------ATGGAAACACACCCGGAACCGACCAAGGTTCATGCCGACCCCGAACTCGGG
 (00060-00119): CCTGACCCTTCGCTGCTGATCGTCGATGACGACGGVCCGTTCCTGCGHCGGCTGGCnCGV
 (00120-00179): GCSATGGAGACCCGCGGCTTCCTBGTCGAHACGGCGGAGTCCGTCGCGGAAGGTATCGCH
 (00180-00239): AAGACVAAGGCGCGGCCGCCGAAATATGCVGTGGTCGACCTGCGBCTCGGCGACGGCAAC
 (00240-00299): GGDCTGGAYGTVATCGAAGCDATCCGCCAGAGCCGCGAGGAYACCAAGGTGATCGTGCTG
 (00300-00359): ACBGGCTACGGCAATATCGCHACCGCCGTGACGGCVGTGAAGCTCGGGGCGCTCGACTAT
 (00360-00419): CTCGCCAAGCCBGCBGACGCCGACGACATHTTYGGCGCDCTGACVCAGCGGCCGGGCGAG
 (00420-00479): CGGGCDGACGTGCCGGAAAACCCGATGTCKGCGGATCGCGTGCGCTGGGAACATATCCAG
 (00480-00539): CGBGTCTACGAGATGTGCGAGCGCAACGTBTCSGAGACGGCVCGCCGGCTCAACATGCAT
 (00540-00581): CGCCGCACGCTGCAGCGCATCCtcgccaagcgcgcvccgaaa

There were insufficient primers found for this data.                        
Insufficient primers found for this data.

So when primer construction fails, there are at least three different kinds of output, which is somewhat confusing. I have attached the files discussed so that you can look at them.
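
The three failure modes above can be told apart by their messages alone. As a minimal sketch (not degenprime's own code), a hypothetical helper `classify_failure` could label captured output as follows. It assumes the message strings are exactly the ones quoted in this report; the function name and the returned labels are made up for illustration.

```python
def classify_failure(output: str) -> str:
    """Label captured degenprime output using the failure types from this report.

    Assumption: the messages match the ones quoted in the issue text.
    """
    text = output.lower()
    # Failure type #1: a single short message, nothing else printed.
    if "primer lists was empty" in text:
        return "failure type #1"
    # Failure type #2: full sequence/consensus dump, then this message
    # (printed twice in the report, with slightly different wording).
    if "insufficient primers found for this data" in text:
        return "failure type #2"
    # Failure type #3: must be checked before the success case, because
    # the program still reports "Output details saved to ..." afterwards.
    if "no primer pairs were found" in text:
        return "failure type #3"
    if "output details saved to" in text:
        return "success"
    return "unknown"


if __name__ == "__main__":
    sample = (
        "No primer pairs were found for these specifications.\n"
        "Output details saved to ActR.csv\n"
        "Program complete.\n"
    )
    print(classify_failure(sample))
```

Note that type #3 is the only failure that still ends with "Output details saved to ..." and "Program complete.", so a check for the success message alone would misreport it.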

alignments.zip
