Hey there,
While using the mergeMSTs branch, I ran into some trouble with mst and query.
mst
mantis mst doesn't seem to work.
It wants to load eqclass_rr.cls files (mantis/src/mst.cc, lines 33 to 34 in 7406e8f):
    eqclass_files =
        mantis::fs::GetFilesExt(prefix.c_str(), mantis::EQCLASS_FILE);
This will later lead to a segmentation fault because the files do not exist.
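A guard along these lines might turn the crash into a readable error. This is only a sketch under assumptions: `find_files_with_suffix` and `require_eqclass_files` are hypothetical stand-ins for `mantis::fs::GetFilesExt` and its caller, and the suffix is passed in as a plain string rather than taken from `mantis::EQCLASS_FILE`.

```cpp
#include <algorithm>
#include <filesystem>
#include <stdexcept>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for mantis::fs::GetFilesExt: collect the regular files
// in `dir` whose names end with `suffix`. Returns an empty list if `dir` is
// not a directory.
std::vector<std::string> find_files_with_suffix(const std::string &dir,
                                                const std::string &suffix) {
  std::vector<std::string> files;
  if (!fs::is_directory(dir)) return files;
  for (const auto &entry : fs::directory_iterator(dir)) {
    const std::string name = entry.path().filename().string();
    const bool matches =
        name.size() >= suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
    if (entry.is_regular_file() && matches) {
      files.push_back(entry.path().string());
    }
  }
  std::sort(files.begin(), files.end());
  return files;
}

// Fail early with a clear message instead of crashing later on an empty list.
std::vector<std::string> require_eqclass_files(const std::string &dir,
                                               const std::string &suffix) {
  std::vector<std::string> files = find_files_with_suffix(dir, suffix);
  if (files.empty()) {
    throw std::runtime_error("no files ending in '" + suffix +
                             "' found in '" + dir + "'");
  }
  return files;
}
```

With a check like this, running mst on an index without color class files would report the missing files instead of segfaulting.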
mantis build will always delete eqclass_rr.cls files at the end (mantis/src/mst.cc, lines 729 to 737 in 7406e8f):
    if (opt.remove_colorClasses && !opt.keep_colorclasses) {
      for (auto &f : mantis::fs::GetFilesExt(opt.prefix.c_str(), mantis::EQCLASS_FILE)) {
        std::cerr << f.c_str() << "\n";
        if (std::remove(f.c_str()) != 0) {
          std::cerr << "Unable to delete file " << f << "\n";
          std::exit(1);
        }
      }
    }
mantis build doesn't have an option to toggle this behavior.
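The deletion condition quoted above already checks a keep flag, so a command-line toggle could plausibly just feed that flag. The sketch below isolates the decision in a hypothetical helper, `remove_unless_kept`, which is not part of mantis. Unlike the quoted code, it reports a failed deletion and continues instead of exiting.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: delete the given files only when removal was requested
// and the keep flag is not set. Returns the number of files actually deleted.
std::size_t remove_unless_kept(const std::vector<std::string> &files,
                               bool remove_requested, bool keep) {
  if (!remove_requested || keep) return 0;  // toggle: leave everything in place
  std::size_t removed = 0;
  for (const auto &f : files) {
    if (std::remove(f.c_str()) == 0) {
      ++removed;
    } else {
      std::fprintf(stderr, "Unable to delete file %s\n", f.c_str());
    }
  }
  return removed;
}
```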
Changing qopt.remove_colorClasses = true; to qopt.remove_colorClasses = false; here (mantis/src/mantis.cc, line 308 in 7406e8f) fixes the issue:
    qopt.prefix = bopt.out; qopt.numThreads = bopt.numthreads; qopt.remove_colorClasses = true;
query
The default non-bulk query only works if the eqclass_rr.cls files are present and -1 is used:
    mantis query -1 -k 20 -p index/ reads.fasta
To have eqclass_rr.cls files, the above fix is needed, and mst must have been run with -k.
Alternatively, bulk mode (-b) works without the eqclass_rr.cls files, so mst can also be run with -d:
    mantis query -b -k 20 -p index/ reads.fasta
The problem in non-bulk query seems to be that findSamples is called for every query sequence (mantis/src/mstQuery.cc, lines 492 to 498 in 7406e8f):
    while (ipfile >> read) {
      mstQuery.reset();
      mstQuery.parseKmers(numOfQueries, read, indexK);
      mstQuery.findSamples(cdbg, cache_lru, &rs, queryStats, 1);
      output_results(mstQuery, opfile, sampleNames, queryStats, 1);
      numOfQueries++;
    }
The function then accesses cdbg.get_current_cqf()->keybits() (mantis/src/mstQuery.cc, line 132 in 7406e8f):
    uint64_t ksize{cdbg.get_current_cqf()->keybits()}, numBlocks{cdbg.get_numBlocks()};
This works fine for the first query, but for the second one there is no CQF to access because it has been replaced with an invalid one (mantis/src/mstQuery.cc, line 181 in 7406e8f):
    cdbg.replaceCQFInMemory(invalid);
I tried loading the first block (0) at the beginning of findSamples and passing keybits as an extra parameter.
But then there is an out-of-bounds access at mantis/src/mstQuery.cc, line 254 in 7406e8f:
    allQueries[q][numSamples]++;
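The attempted workaround can be pictured roughly as follows. This is a toy model, not mantis code: `ToyCqf`, `ToyCdbg`, `load_block`, `drop_block`, `read_keybits_once`, `find_samples` and `run_queries` are all hypothetical names, and the keybits value of 40 is an arbitrary placeholder. It only illustrates reading keybits once while a block is loaded and passing it along. It does not reproduce or fix the out-of-bounds access.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the real mantis classes.
struct ToyCqf {
  uint64_t bits;
  uint64_t keybits() const { return bits; }
};

struct ToyCdbg {
  std::unique_ptr<ToyCqf> current;  // null models the "invalid" CQF

  const ToyCqf *get_current_cqf() const { return current.get(); }
  // Assumption: every block yields a CQF with the same keybits (40 here).
  void load_block(uint64_t /*block_id*/) {
    current = std::make_unique<ToyCqf>(ToyCqf{40});
  }
  // Models cdbg.replaceCQFInMemory(invalid).
  void drop_block() { current.reset(); }
};

// Read keybits once, while a block is known to be loaded.
uint64_t read_keybits_once(ToyCdbg &cdbg) {
  cdbg.load_block(0);
  const ToyCqf *cqf = cdbg.get_current_cqf();
  if (cqf == nullptr) throw std::runtime_error("block 0 could not be loaded");
  return cqf->keybits();
}

// Per-query work receives keybits as a parameter, so it never dereferences
// the CQF pointer that the previous query left invalid.
uint64_t find_samples(ToyCdbg &cdbg, uint64_t keybits) {
  cdbg.load_block(0);
  // ... per-block lookups would go here ...
  cdbg.drop_block();
  return keybits;
}

std::vector<uint64_t> run_queries(ToyCdbg &cdbg, int num_queries) {
  const uint64_t keybits = read_keybits_once(cdbg);
  std::vector<uint64_t> seen;
  for (int q = 0; q < num_queries; ++q) {
    seen.push_back(find_samples(cdbg, keybits));
  }
  return seen;
}
```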