The storage function can lead to very long DAG creation times when it points to a zip file hosted online.
The following example demonstrates this.
Snakefile:
rule retrieve_eurostat_data:
    input:
        storage(
            "https://ec.europa.eu/eurostat/documents/38154/4956218/Balances-April2023.zip",
        ),
When running snakemake -n, DAG creation takes more than two minutes, whereas downloading the file directly in a browser takes about 20 seconds.
I don't know whether this is related to the fact that snakemake runs the download multiple times even though it is in dry-run mode.
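For anyone who wants to reproduce the dry-run timing, here is a rough sketch of how it could be measured from Python. The helper name time_command is made up for illustration and is not part of snakemake; the sketch assumes snakemake is on the PATH and that it is called from the directory containing the Snakefile.

```python
import subprocess
import time


def time_command(cmd):
    """Run a command and return (elapsed_seconds, return_code).

    Hypothetical helper for illustration only; output is discarded
    so that only the wall-clock time is measured.
    """
    start = time.perf_counter()
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode


# Example usage (assumes snakemake is installed and a Snakefile is present):
# elapsed, code = time_command(["snakemake", "-n"])
# print(f"dry run took {elapsed:.1f} s (exit code {code})")
```

Running it a few times and comparing against the plain download time should show whether the slowdown is consistent or depends on the network.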
Let me know if there is a way I can help, or if you need more information or context.