Hello, I ran into the "error while loading shared libraries: libiconv.so.2: cannot open shared object file: No such file or directory" error when running chainCleaner manually. I was able to resolve it by adding the `libiconv` package to my existing conda env. I found it useful to have this yml file on hand.
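For reference, a minimal sketch of what such an `environment.yml` addition could look like; the channel list and everything other than the `libiconv` entry are illustrative assumptions, not the actual file from this PR:

```yaml
# Illustrative sketch, not the actual environment.yml from this repository
name: make_lastz_chains
channels:
  - conda-forge
  - bioconda
dependencies:
  - libiconv   # provides libiconv.so.2, which chainCleaner links against
  # ...plus whatever alignment/chain tools the env already contains
```

With the file updated, `conda env update -f environment.yml` applies the fix to an existing environment.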
Merge remote-tracking branch 'upstream/main'
- Rewrote pipeline from Python orchestration to native Nextflow DSL2 modules and subworkflows
- Replaced twobitreader with py2bit to support 64-bit .2bit files (fixes issue hillerlab#56)
- Added single Docker/Apptainer container with full UCSC Kent distribution
- Scientific parameters split into params.json; nextflow.config is infrastructure-only
- Added FROM_FILL_CHAINS and FROM_CLEAN_CHAINS checkpoint entry workflows
- Added params.json template and run_nf_slurm_example.sh for multi-pair SLURM runs
- Compute tiers: process_fast (16GB/0.5h), process_medium (50GB/2h)
- FA_TO_TWO_BIT raised to 50GB; CHAIN_CLEANER 80GB; REPEAT_FILLER 1h
- SLURM partition routing: <=4h to htc, >4h to public
- Container image controlled via NXF_CONTAINER_IMAGE env var
- LASTZ/AXT_CHAIN/REPEAT_FILLER submit as SLURM job arrays
- Added SLURM_SKIP_EPILOG, USR2@180 signal, and beforeScript sleep for LASTZ
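A sketch of how the tiers and routing above could look in `nextflow.config`; the memory/time values are taken from the commit text, while the selector layout and routing closure are assumptions about the implementation, not the actual file:

```groovy
// Sketch only; labels/process names and values from the commit text,
// exact selector structure assumed
process {
    executor = 'slurm'
    // SLURM partition routing: <= 4h to htc, > 4h to public
    queue    = { task.time <= 4.h ? 'htc' : 'public' }

    withLabel: 'process_fast'   { memory = 16.GB; time = 30.min }
    withLabel: 'process_medium' { memory = 50.GB; time = 2.h }

    withName: 'FA_TO_TWO_BIT'   { memory = 50.GB }
    withName: 'CHAIN_CLEANER'   { memory = 80.GB }
    withName: 'REPEAT_FILLER'   { time = 1.h }

    // container image controlled via environment variable
    container = System.getenv('NXF_CONTAINER_IMAGE')
}
```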
- Replace !in with .contains() (invalid Groovy in NF 25.x)
- Move subworkflow alias from call site to include statement
- Fix DuplicateProcessInvocation: PARTITION -> PARTITION_TARGET / PARTITION_QUERY
- Fix all module labels (process_single/high -> process_fast/medium)
- Add run_lastz.py and run_lastz_intermediate_layer.py to bin/ for automatic Nextflow staging
- Set plain errorStrategy = retry globally
- README: three standalone sections, single-source info, Nextflow >= 25.04.6
- Add Changelog v3.0.0 and parameter audit tables to CHANGES_nfcore_refactor.md
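The `!in` fix in the first bullet can be illustrated with a hypothetical snippet; `chrom` and `excludedChroms` are invented names for the example:

```groovy
// Before: parse error in the Groovy bundled with Nextflow 25.x
// if (chrom !in excludedChroms) { process(chrom) }

// After: equivalent check, valid across Groovy versions
if (!excludedChroms.contains(chrom)) {
    process(chrom)
}
```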
…, writer side) The reader fix (twobitreader → py2bit) was already in place, but faToTwoBit was still writing version-0 (32-bit) .2bit files, causing an index overflow abort on genomes with sequences larger than 4 GB. Adding -long writes version-1 (64-bit) files that py2bit can then read correctly. Updated CHANGES_nfcore_refactor.md to document both halves of the issue hillerlab#56 fix. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…enomes (issue hillerlab#56, reader side) lastz cannot read v1 (64-bit) .2bit files produced by faToTwoBit -long. run_lastz.py now uses py2bit to extract each partition to a temp FASTA in the task work directory before passing it to lastz. Temp files default to CWD (Nextflow work dir) instead of /tmp, avoiding cross-node filesystem issues on SLURM. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ccess workflow.onComplete now checks if the final chain file actually exists before logging its path. Prints a warning if the pipeline completed successfully but produced no output (e.g. all LASTZ jobs yielded no alignments). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
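A sketch of the guard described in this commit; `params.outdir` and the final chain filename are placeholders, not the pipeline's actual paths:

```groovy
// Sketch only; the output path is a placeholder
workflow.onComplete {
    def finalChain = file("${params.outdir}/final.chain.gz")
    if (workflow.success && finalChain.exists()) {
        log.info "Final chain: ${finalChain}"
    }
    else if (workflow.success) {
        log.warn "Pipeline completed successfully but produced no output " +
                 "(e.g. all LASTZ jobs yielded no alignments)"
    }
}
```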
…ssue hillerlab#56) py2bit does not actually support v1/64-bit .2bit files produced by faToTwoBit -long, causing RuntimeError on large genomes (>4 GB). Use twoBitToFa (UCSC CLI, already a pipeline dependency) to extract partitions to temp FASTA instead — it supports both v0 and v1 files. Removes py2bit and twobitreader Python dependencies entirely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
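A sketch of the extraction step described in this commit; the helper names are invented for illustration, but the flags are standard `twoBitToFa` options:

```python
import subprocess
import tempfile

def twobittofa_cmd(two_bit, seq, start, end, out_fa):
    """Build a twoBitToFa command extracting one sequence slice to FASTA.

    twoBitToFa reads both v0 (32-bit) and v1 (64-bit) .2bit files, so no
    Python .2bit library is required.
    """
    return ["twoBitToFa", two_bit, out_fa,
            f"-seq={seq}", f"-start={start}", f"-end={end}"]

def extract_partition(two_bit, seq, start, end):
    # Write the temp FASTA into the current directory (the Nextflow task
    # work dir) rather than /tmp, avoiding cross-node filesystem issues
    # on SLURM.
    out = tempfile.NamedTemporaryFile(suffix=".fa", dir=".", delete=False)
    out.close()
    subprocess.run(twobittofa_cmd(two_bit, seq, start, end, out.name),
                   check=True)
    return out.name
```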
- project_setup_procedures.py: replace TwoBitFile with twoBitInfo subprocess calls for chrom name listing, chrom sizes, and .2bit format detection
- Dockerfile: remove py2bit pip install; no Python .2bit library needed
- CHANGES_nfcore_refactor.md: document the full scope of the fix
- TODO.md: update done item to reflect actual approach

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
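As an aside, the version check that motivates this commit can also be done in pure Python by reading the first eight bytes of the file; this is a sketch based on the documented .2bit header layout (4-byte signature 0x1A412743, then a 4-byte version field), not code from this PR:

```python
import struct

TWOBIT_SIG = 0x1A412743

def twobit_version(path):
    """Return the .2bit version: 0 (32-bit offsets) or 1 (64-bit, faToTwoBit -long).

    The byte order of the signature reveals the file's endianness; the
    version field follows immediately after it.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if len(head) < 8:
        raise ValueError(f"{path} is too short to be a .2bit file")
    for endian in ("<", ">"):
        sig, version = struct.unpack(endian + "II", head)
        if sig == TWOBIT_SIG:
            return version
    raise ValueError(f"{path} is not a .2bit file")
```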
**Contributor (Author):**

Good news: we have two test runs that went through successfully using this fork, and we are verifying the results against the main branch. But beyond comparing the final result files, what else would you like us to check?
**Member:**

Hey @NilaBlueshirt, I'd say the easiest would be comparing the final chains, but if we want to be more specific I would compare all the outputs from the core alignment (lastz), then the repeat filling, and the final output.
nf-core DSL2 refactor: Nextflow pipeline with containers and checkpoints, for Slurm HPC
Summary
Refactors `make_lastz_chains` from a Python-orchestrated pipeline into a standard nf-core-style DSL2 Nextflow pipeline. The original `make_chains.py` entry point is fully preserved.

What's new
Details can be found in `Changelog.md`, `CHANGES_nfcore_refactor.md`, and `TODO.md`.

- `nextflow.config` for all infrastructure settings; `params.json` for scientific parameters
- `-long` flag for `faToTwoBit`; replaces `twobitreader` to support 64-bit `.2bit` files for large genomes (fixes issue #56, "Alignment of a large genome in 2bit")
- `environment.yml`: a conda/mamba environment with all tools for the `conda` profile (we can add pip to it, but uv doesn't work well on HPC)

Key files
- `main.nf`
- `nextflow.config`
- `params.json`
- `Dockerfile`
- `environment.yml`
- `modules/local/*/main.nf`
- `subworkflows/local/*/main.nf`
- `workflows/make_lastz_chains.nf`
- `bin/`

Backward compatibility
`make_chains.py` and all original Python modules are unchanged. The pipeline can still be executed as in v2.0.8.

Usage
Check the `README.md` file for three different use cases.