[BUG] --repeats argument can't be empty? #9

@gerverska

Hey there! I'm working with v26.2.12 from the bioconda build, which seems to be ahead of the docs. Currently, there is an argument for --repeats listed under --help.

Optional arguments:
  -p , --proteins          protein alignments in GFF3 format [accepts multiple files: space separated] (default: [])
  -t , --transcripts       transcripts alignments in GFF3 format [accepts multiple files: space separated] (default: [])
  -r , --repeats           repeat alignments in BED or GFF3 format
  -w , --weights           user supplied source weights [accepts multiple: space separated source:weight] (default: [])
  -n , --num-processes     number of processes to use for parallel execution (default: number of CPU cores)
  -m , --minscore          minimum score to retain gene model (default: auto)
  --repeat-overlap         percent gene model overlap with repeats to remove (default: 90)
  --min-exon               minimum exon length (default: 3)
  --max-exon               maximum exon length (default: -1)
  --min-intron             minimum intron length (default: 10)
  --max-intron             maximum intron length (default: -1)
  -l , --logfile           write logs to file
  --silent                 do not write anything to terminal/stderr (default: False)
  --debug                  write/keep intermediate files (default: False)

Unlike --proteins, --transcripts, and --weights, it does not show a (default: []). I think this means the code ends up expecting a --repeats file even though the option is listed as optional. Here's the output from a test run without --repeats:

[Mar 12 08:48 AM]: Python v3.12.12; GFFtk v26.2.12; numpy v1.26.4; natsort v8.4.0; interlap v0.2.6
[Mar 12 08:48 AM]: Namespace(subparser_name='consensus', fasta='test.fasta', genes=['test1.gff3', 'test2.gff3'], out='consensus.gff3', proteins=[], transcripts=[], repeats=None, weights=[], num_processes=None, minscore=None, repeat_overlap=90, min_exon=3, max_exon=-1, min_intron=10, max_intron=-1, logfile=None, silent=False, debug=False)
Traceback (most recent call last):
  File "/path/to/env/bin/gfftk", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/path/to/env/lib/python3.12/site-packages/gfftk/__main__.py", line 21, in main
    consensus(args)
  File "/path/to/env/lib/python3.12/site-packages/gfftk/consensus.py", line 29, in consensus
    check_inputs([args.fasta] + args.genes + args.proteins + args.transcripts + [args.repeats])
  File "/path/to/env/lib/python3.12/site-packages/gfftk/utils.py", line 40, in check_inputs
    if not is_file(filename):
           ^^^^^^^^^^^^^^^^^
  File "/path/to/env/lib/python3.12/site-packages/gfftk/utils.py", line 45, in is_file
    if os.path.isfile(f):
       ^^^^^^^^^^^^^^^^^
  File "<frozen genericpath>", line 30, in isfile
TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

It looks like repeats (and also num_processes, minscore, and logfile) is set to None when not supplied, while the list-type optional arguments get the expected []. Judging from the traceback, consensus.py wraps args.repeats in a list and passes it to check_inputs, so the None reaches os.path.isfile, which seems to be what raises the TypeError at the end of the output. I'll take a stab at fixing consensus.py, but I'm not sure when I'll get around to it...
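
For reference, here is a minimal sketch of one way the None could be kept out of the file check. This is not GFFtk's actual code: the parser is a stand-in that only mimics the options relevant here (the -f and -g short flags are my assumption, since the positional/required arguments are not shown in the help output above), and collect_input_files is a hypothetical helper name.

```python
import argparse
import os

# Stand-in parser: mimics only the options relevant to this report.
# It is NOT GFFtk's real argument definition; -f/-g are assumed flags.
parser = argparse.ArgumentParser(prog="consensus-sketch")
parser.add_argument("-f", "--fasta", required=True)
parser.add_argument("-g", "--genes", nargs="+", required=True)
parser.add_argument("-p", "--proteins", nargs="+", default=[])
parser.add_argument("-t", "--transcripts", nargs="+", default=[])
parser.add_argument("-r", "--repeats")  # no default given, so argparse uses None


def collect_input_files(args):
    """Hypothetical helper: list the input paths, skipping an unset --repeats."""
    files = [args.fasta] + args.genes + args.proteins + args.transcripts
    if args.repeats is not None:
        files.append(args.repeats)
    return files


# Same situation as the test run: no --repeats on the command line.
args = parser.parse_args(["-f", "test.fasta", "-g", "test1.gff3", "test2.gff3"])
files = collect_input_files(args)

# Every entry is a string now, so os.path.isfile() never receives None.
missing = [f for f in files if not os.path.isfile(f)]
```

The alternative would be to give --repeats a list default like --proteins and --transcripts, but that changes the type of args.repeats for any downstream code, so a None check at the call site looks like the smaller change.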
