| title | codingene/nextflow-base (v1.0) |
|---|---|
| author | Developed by Codingene. |
| date | Last modified: 19 Jun 2020 |

This pipeline currently implements the three most basic steps of any sequence-based analysis, starting from FASTQ files:
- Quality Check (using fastqc)
- Filtering (using fastp)
- Sequence Read Quantification (using kallisto)
This can be used as a base for adding other processes.
- Adapter removal and filtering of raw reads using fastp
Pipeline dependencies:
- Nextflow, the workflow framework on which this pipeline is based.
- Docker or Conda for the tool environments. (Docker is recommended for this workflow.)
This is required only once per system. Check whether your system already has it by typing nextflow in a terminal from any location. If not, follow these steps:
curl -s https://get.nextflow.io | bash
mv nextflow /usr/bin/
Follow this guide: How to install and use docker on ubuntu
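The check for existing installations can also be scripted. The following is a minimal sketch, assuming a POSIX-compatible shell; the helper name `have` is hypothetical and not part of the pipeline.

```shell
#!/bin/sh
# Return success if the given command is available on PATH.
# `have` is a hypothetical helper name used only for this illustration.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each dependency named in this README.
for tool in nextflow docker conda; do
    if have "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

Only one of docker or conda needs to be reported as found, since the pipeline uses either for its tool environments.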
For Conda, we will use Miniconda.
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
bash miniconda.sh -b -p $HOME/miniconda3
export PATH="$HOME/miniconda3/bin:$PATH"
rm miniconda.sh
git clone https://github.com/codingene/nextflow-base.git
The test checks whether the basic components of the workflow can run on a system where everything is set up properly.
Assuming you are in the workflow directory, run the following:
nextflow run main.nf -profile test,docker
Note: The test run may take some time the first time, because it automatically downloads all the tool environments (Docker images or Conda environments) in the background.
If this succeeds, you are ready to run the pipeline on your own datasets.
Check the help menu:
nextflow run path-to/nextflow-base/main.nf --help
The typical command for running the pipeline is as follows:
nextflow run path-to/nextflow-base [arguments] -profile docker
A directory containing all the paired-end reads (FASTQ files).
File names must follow the naming convention *_{1,2}.fastq.gz or *_{1,2}.fq.gz
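Before starting a run, it can help to confirm that every sample in the reads directory has both mates under this naming convention. The following is a rough sketch, assuming a POSIX-compatible shell; `check_pairs` is a hypothetical helper and not part of the pipeline.

```shell
#!/bin/sh
# Print one line per read file whose mate is missing; print nothing if all pairs are complete.
# `check_pairs` is a hypothetical helper name. The body runs in a subshell so its variables stay local.
check_pairs() (
    dir="$1"
    for f in "$dir"/*_[12].fastq.gz "$dir"/*_[12].fq.gz; do
        [ -e "$f" ] || continue                 # skip glob patterns that matched nothing
        base="$(basename "$f")"
        case "$base" in
            *.fastq.gz) ext="fastq.gz" ;;
            *)          ext="fq.gz" ;;
        esac
        sample="${base%_[12]."$ext"}"           # strip the _1/_2 suffix and the extension
        mate="${base%."$ext"}"
        mate="${mate##*_}"                      # 1 or 2
        if [ "$mate" = "1" ]; then other=2; else other=1; fi
        [ -e "$dir/${sample}_${other}.${ext}" ] || echo "$sample: missing _${other} mate"
    done
)
```

For example, `check_pairs path-to/reads` prints nothing when every `*_1` file has a matching `*_2` file with the same extension.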
Path to a cDNA fasta file.
Output folder name. If not given, a directory named results is created in the working location. This is where you can find all the results after the pipeline run.
For details of individual tool parameters, check the respective tool documentation. All are optional and have default values (listed below).
- --fastp.length_required (default: 75)
- --fastp.length_limit (default: 151)
- --fastp.qualified_quality_phred (default: 30)
These arguments are optional, but it is recommended to provide higher values according to your system configuration and data needs.
--max_cpus : [Recommended] Number of threads/CPU to assign (default = 1)
--max_memory : [Recommended] Maximum Memory in GB (default = '2 GB')
--max_time : [Optional] Maximum time for a single step (default = '1h')
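As an illustration of how these resource options fit into the typical command, here is a small sketch assuming a POSIX-compatible shell. The helper `resource_args` is hypothetical, and the values 8, 16 and 4h are arbitrary examples that follow the format of the defaults listed above.

```shell
#!/bin/sh
# Print the resource options for a given number of CPUs, memory in GB, and time limit.
# `resource_args` is a hypothetical helper name; it only formats text and does not run the pipeline.
resource_args() {
    printf "%s %s %s\n" "--max_cpus $1" "--max_memory '$2 GB'" "--max_time '$3'"
}

# Example: 8 CPUs, 16 GB of memory, at most 4 hours for a single step.
echo "nextflow run path-to/nextflow-base [arguments] -profile docker $(resource_args 8 16 4h)"
```

The printed line can be copied into a terminal after replacing [arguments] with your own input and output options.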
|- Sample-Name/ID
    |- fastp_filtred_reads
    |- fastqc_report
    |- kallisto_quant
More information about Changelog (version updates) can be found in NEWS.md.
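After a run, the output layout described above can be checked per sample. The following is a minimal sketch, assuming a POSIX-compatible shell and assuming that the output folder contains one subdirectory per sample and nothing else; `check_outputs` is a hypothetical helper and not part of the pipeline.

```shell
#!/bin/sh
# For each sample directory under the output folder, report expected subdirectories that are missing.
# `check_outputs` is a hypothetical helper name. The body runs in a subshell so its variables stay local.
check_outputs() (
    results="$1"
    for sample in "$results"/*/; do
        [ -d "$sample" ] || continue            # skip if the glob matched nothing
        for sub in fastp_filtred_reads fastqc_report kallisto_quant; do
            [ -d "$sample$sub" ] || echo "$(basename "$sample"): missing $sub"
        done
    done
)
```

For example, `check_outputs results` prints nothing when every sample directory contains all three subdirectories.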