rnahub orchestrates iterative nhmmer/Infernal/R-Scape searches for discovering new RNA homologs.
The CLI wires common structural-RNA tools, organizes job folders, and captures diagnostics so you can re-run specific pipeline stages with minimal bookkeeping.
For the webserver, go to http://rnahub.org
(databases for now are not included)
Magnus, M., Gao, W., Dutta, N., Vicens, Q., Rivas, E.
RNAhub - An automated pipeline to search and align RNA homologs with secondary structure assessment
Nucleic Acids Research 2025
https://academic.oup.com/nar/article/53/W1/W496/8121641
- Launch nhmmer searches (v0–v3) with automatic job-folder management.
- Run optional steps such as R-Scape, Infernal, RepeatMasker, or flanked-search generation without retooling commands.
- Resume work by skipping completed stages, or run in
--drymode to inspect commands before execution. - Designed for workstation runs as well as scheduled jobs on SLURM clusters.
Install Python requirements that are not bundled with the standard library. At
the moment only icecream is needed:
python -m pip install icecream # installs into the Python you are usingCreate config_local.py in the project root to override paths from
config.py. Required entries point to your nhmmer, Infernal, R-Scape, and
auxiliary binaries:
nhmmer = "/home/rnahub/opt/hmmer/hmmer-3.3.2/bin/nhmmer"
RSCAPE_PATH = "/home/rnahub/opt/rscape/rscape_v2.5.2/bin/R-scape"
EASEL_PATH = "/home/rnahub/opt/hmmer/hmmer-3.3.2/bin/"
INFERNAL_PATH = "/usr/bin/" # used for cmalign
REPEAT_MASKER_PATH = "/home/rnahub/opt/RepeatMasker"
# Optional Rfam resources
RFAM_FILES = "/home/rnahub/rnahub/rfam"
RFAM_DB_PATH = "/home/rnahub/db/rfam/Rfam.cm"Any variable omitted from config_local.py falls back to the defaults provided
in config.py.
Example session for the FAS RC environment:
module load python/3.10.12-fasrc01
python -m pip install --user icecream- Prepare an input
.fa/.fastasequence or.stoalignment. - (Optional) Pre-filter the search space using BLAST against NR to focus on a taxonomic group. Add flanking regions if needed.
- Run the pipeline, e.g.:
python rnahub.py \
--input example/seq.fa \
--db db/fungi.fasta \
--job-name seq_run \
--job-folder jobs/seq_run \
--cpus 8Set --slurm to emit SLURM submission scripts, --dry to preview commands,
and pass --dev-skip-* switches to resume from intermediate stages (for
instance --dev-skip-nhmmer123 to skip later nhmmer rounds).
When re-running inside a finished job folder:
python rnahub.py --job-folder . --dev-skip-nhmmer123Add flanks via either:
--flanked path/to/flanks.fa(FASTA only, not.fasta).--flanks-in-headerwhere headers follow><seq_name> <start>-<end>.
$ python rnahub.py -h
usage: rnahub.py [-h] [--db DB [DB ...]] [--job-name JOB_NAME] [--verbose] [-v] [--slurm]
[--evalue EVALUE] [--evalue-for-flanking-search EVALUE_FOR_FLANKING_SEARCH]
[--lmin LMIN] [--evalue-final EVALUE_FINAL]
[--iteractions ITERACTIONS] [--cpus CPUS] [--dry] [--repeatmasker] [--utot]
[--job-folder JOB_FOLDER] [--dev-skip-search] [--dev-skip-nhmmer0]
[--dev-skip-nhmmer123] [--dev-skip-rscape] [--dev-skip-infernal]
[--rscape-path RSCAPE_PATH] [--rscape] [--input INPUT [INPUT ...]]
[--flanked FLANKED] [--flanks-in-header] [--flanks-start FLANKS_START]
[--flanks-end FLANKS_END]
optional arguments:
-h, --help show this help message and exit
--db DB [DB ...] databases (.cm/.sto) searched by nhmmer
--job-name JOB_NAME defaults to the input file stem
--verbose increase logging
-v, --version print the version based on the latest Git tag
--slurm submit to SLURM instead of running locally
--evalue EVALUE e-value threshold for all runs except the final one
--evalue-for-flanking-search EVALUE_FOR_FLANKING_SEARCH
e-value threshold for flanking (v0) search; defaults to --evalue
--lmin LMIN `esl-alimanip` minimum length (default 50)
--evalue-final EVALUE_FINAL
e-value threshold for the last round
--iteractions ITERACTIONS
number of nhmmer iterations
--cpus CPUS nhmmer CPU count
--dry show commands without executing them
--repeatmasker run RepeatMasker preprocessing
--utot treat sequences as DNA (U→T) for RepeatMasker
--job-folder JOB_FOLDER
destination folder (default `jobs/<name>/<input>`)
--dev-skip-search skip v0–v3 nhmmer searches
--dev-skip-nhmmer0 skip only the v0 nhmmer search
--dev-skip-nhmmer123 skip v1–v3 nhmmer searches
--dev-skip-rscape skip the R-Scape stage
--dev-skip-infernal skip Infernal stage
--rscape-path RSCAPE_PATH
override R-Scape path
--rscape run only the R-Scape step
--input INPUT [INPUT ...]
FASTA/FA or Stockholm input sequences/alignments
--flanked FLANKED supply a FASTA with flanking regions
--flanks-in-header parse `><seq start-end` spans from headers
--flanks-start FLANKS_START
left flank coordinate
--flanks-end FLANKS_END
right flank coordinate
- Run BLAST on NR before nhmmer to scope the search space (e.g., fungi).
- Add flanks to recover context for hits.
- Keep a cache of pre-compiled genomes for faster iterative runs.
- Inspect
.helixcovfiles produced by newer R-Scape versions.
Gao, W., Jones, T. A. & Rivas, E. Discovery of 17 conserved structural RNAs in fungi.
Nucleic Acids Res 49, gkab355 (2021). https://academic.oup.com/nar/article/49/11/6128/6292099