
Add a mechanism to allow integer to specify orders of magnitude suffixes like k, m, g, t? #165

@SGSSGene

The original question is: Should sharg provide a datatype that can also be specified with additional magnitude suffixes like k, m, g, t?

Motivation: Currently I am trying to integrate automatic CWL generation into sharg. For this, sharg needs to map an app-dev-provided variable/type to a CWL type. (This is not only needed by CWL; the man and help page generators use it as well.)
As my first working block I am using raptor as a reference and checking how to map its parameters to CWL.
There are a few candidates that report improper types.

Troubling arguments
For ./raptor build -hh these are:

--window (raptor::window) // Note: Solved, see below.
          The window size. Default: k-mer size. Value must be a positive integer.
--shape (std::string)
          The shape to use for k-mers. Mutually exclusive with --kmer. Default: . Value must match the pattern
          '[01]+'.
--size (std::string)
          The size in bytes of the resulting index. Default: 1k. Must be an integer followed by [k,m,g,t] (case
          insensitive).
--output (std::filesystem::path)
          Provide an output filepath or an output directory if --compute-minimiser is used.

For ./raptor search -hh:

 --pattern (raptor::pattern_size) // Note: Solved, see below.
          The pattern size. Default: Median of sequence lengths in query file.
 --output (std::filesystem::path)
          Provide a path to the output.

Details:

  1. ✔️ --output For CWL I need to differentiate between input-file, output-file, input-directory, output-directory.
    To decide if something is an input or an output, we can just take a peek at the validators. Problem solved.
    For ./raptor build specifically, we also need to decide whether the output is a directory or a file. This will be solved by moving --compute-minimiser into a new command. Considered solved.
  2. ✔️ --window and --pattern: These arguments are strong types. This was done to override a custom default message. This has been solved by [FEATURE] Add default_message #109, which allows specifying a custom default message, so the strong type is no longer required. Considered solved.
  3. ✔️ --shape: I have no clue what to do with this one. But it is very special and a string should suffice. Considered solved for now.
  4. --size: This is what this discussion is about!

This option looks like something that many other apps may need. So should sharg provide a way of doing this out of the box?

  • If we answer no: There is nothing left to do.
  • If we say yes: How do we want to do this?

What are our goals?

  • Easy use for app-dev
  • Minimally invasive
  • Maintainability
  • Extensibility
  • Readability
  • Encapsulation

API suggestions
Here are different API suggestions for what this could look like for app-devs. They do not go into detail on what the implementation looks like:

// Idea 1: wrap the variable when adding the option
uint64_t size{30};
myparser.add_option(sharg::enable_length_suffix{size}, sharg::config{.long_id = "size"});
// Idea 2: keep a string and declare the underlying type in the config
std::string size{"30"};
myparser.add_option(size, sharg::config{.long_id = "size", .underlying_type = size_t{}}); // Maybe we could also encode that in the validator?
// Idea 3: declare the variable itself with a wrapper type
sharg::enable_length_suffix size{30};
myparser.add_option(size, sharg::config{.long_id = "size"});

Exploring

  • Could we use this mechanism also for input/output-files/directories?
    Answer: yes, this works with Idea 1 and Idea 3. We just add a sharg::output_file or sharg::output_directory type.

  • Could we use this mechanism instead of validators?
    Answer: yes, parsing a string and converting it to a type always includes an implicit validation.
    Maybe a validator is the wrong abstraction. Alternatively, we could annotate the option with a parser. This would allow many more possibilities, but removes the possibility of chaining validators (does anyone use this functionality?):

// Idea 4
size_t size{30};
myparser.add_option(size, sharg::config{.long_id = "size", .parser = sharg::large_number_parser});
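
For Idea 4, the parser itself could be a plain function from the command line string to the target type. The following is only a sketch under stated assumptions: `parse_large_number` is a hypothetical name, the `.parser` config member does not exist in sharg today, and the suffixes are again assumed to be powers of 1024. It returns an empty optional on any malformed input, which illustrates the implicit validation mentioned above.

```cpp
#include <charconv>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not sharg API): parses "<digits>[k|m|g|t]", case insensitive.
// Assumption: k, m, g, t stand for 2^10, 2^20, 2^30, 2^40.
// Returns std::nullopt if the text is malformed or the result does not fit into 64 bits.
inline std::optional<std::uint64_t> parse_large_number(std::string_view const text)
{
    char const * const first = text.data();
    char const * const last = first + text.size();

    std::uint64_t value{};
    auto const [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) // no leading digits, or the digits alone overflow
        return std::nullopt;

    unsigned shift{};
    if (ptr != last)
    {
        if (last - ptr != 1) // at most one suffix character is allowed
            return std::nullopt;

        switch (*ptr)
        {
            case 'k': case 'K': shift = 10; break;
            case 'm': case 'M': shift = 20; break;
            case 'g': case 'G': shift = 30; break;
            case 't': case 'T': shift = 40; break;
            default: return std::nullopt;
        }
    }

    if (value > (std::numeric_limits<std::uint64_t>::max() >> shift)) // suffix would overflow
        return std::nullopt;

    return value << shift;
}
```

A function-shaped parser like this keeps the option variable a plain size_t, at the cost that the type alone no longer tells a CWL or help page generator that suffixes are accepted.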
