Replies: 3 comments
Another problem: reading large summary statistics files. data.table::fread() creates a huge temporary file (as big as the original one) while reading, which is deleted automatically only if the read succeeds. Implementations in place to mitigate this:
Questions:
The new version of the munging script now uses An open question remains about experimenting with the job array, but it is no longer a priority.
We agree to replace all
We experienced issues related to the massive number of files generated when the pipeline is run on large-scale datasets.
This is likely due to a combination of excessive scattering across processes and data.table temp files generated during writing.
How can we overcome this?
There are different possible strategies we can explore:
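One option for the temp-file part of the problem, offered only as a rough sketch and not as something this thread settled on: give each read its own scratch directory and remove it unconditionally, so a failed read cannot leave a large temporary file behind. This assumes a data.table version whose fread() accepts the tmpdir argument. The helper name read_sumstats and the scratch_root parameter are hypothetical, made up for illustration.

```r
library(data.table)

# Hypothetical helper (not part of the pipeline discussed here): read one
# summary-statistics file while keeping any temporary files fread() creates
# inside a private scratch directory that is always removed afterwards.
read_sumstats <- function(path, scratch_root = tempdir()) {
  # Private directory used for this read only; scratch_root must exist.
  scratch <- tempfile(pattern = "fread_", tmpdir = scratch_root)
  dir.create(scratch, recursive = TRUE)

  # on.exit() runs on success and on error, so the scratch directory is
  # deleted even when the read fails part-way through.
  on.exit(unlink(scratch, recursive = TRUE, force = TRUE), add = TRUE)

  fread(path, tmpdir = scratch)
}

# Small self-contained demonstration with made-up data.
demo_file <- tempfile(fileext = ".tsv")
fwrite(data.table(SNP = c("rs1", "rs2"), P = c(0.01, 0.5)), demo_file, sep = "\t")
stats <- read_sumstats(demo_file)
```

On a cluster, scratch_root could point at node-local storage instead of the shared file system, which would also keep these short-lived files out of the shared file count. Whether that fits depends on how the jobs are scheduled.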