Currently, when there are many projects in the current group, processing everything can take a while. Because all projects are loaded via the paginated API before any processing starts, there is a noticeable delay before processing begins and before any output is visible to the user.
This could be improved by introducing a manager-worker pattern: the projects of each page are pushed into a queue as soon as the page arrives, and one or several workers pull projects from that queue and process them. This would reduce the overall runtime and shorten the time to first output.
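The manager-worker pattern described above could be sketched roughly as follows. This is only an illustration under assumptions: `Project` stands in for the real project type, the hard-coded `pages` value stands in for the paginated API results, and the queue size and worker count are arbitrary.

```haskell
import Control.Concurrent.Async (concurrently_, replicateConcurrently_)
import Control.Concurrent.STM

-- Hypothetical stand-in for the real project type.
type Project = String

-- The manager pushes each page of projects into the queue as soon as it
-- arrives; a Nothing per worker signals that pagination is finished.
manager :: TBQueue (Maybe Project) -> [[Project]] -> Int -> IO ()
manager queue pages numWorkers = do
  mapM_ (mapM_ (atomically . writeTBQueue queue . Just)) pages
  mapM_ (const (atomically (writeTBQueue queue Nothing))) [1 .. numWorkers]

-- Each worker pulls projects until it sees the end-of-stream marker.
worker :: (Project -> IO ()) -> TBQueue (Maybe Project) -> IO ()
worker process queue = loop
  where
    loop = do
      item <- atomically (readTBQueue queue)
      case item of
        Nothing -> pure ()
        Just project -> process project >> loop

main :: IO ()
main = do
  let numWorkers = 4
      pages = [["a", "b"], ["c", "d"]] -- stands in for paginated results
  queue <- newTBQueueIO 64
  concurrently_
    (manager queue pages numWorkers)
    (replicateConcurrently_ numWorkers (worker putStrLn queue))
```

The bounded `TBQueue` also provides natural backpressure: if the workers fall behind, the manager blocks instead of loading pages faster than they can be processed.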
There are a few questions and challenges that come to mind:
- It's probably necessary to use https://hackage.haskell.org/package/concurrent-output to make sure that output from concurrent workers isn't interleaved or garbled
- This applies to almost all commands, so it makes sense to solve this "properly"
- An obvious question is whether there should be multiple workers and, if so, how many. A single worker would already improve performance, while too many workers would saturate the network
- It might be tempting to add multiple manager-worker setups in the same command. I don't know yet whether this is a good idea
- Maybe that's also something that should be backed by a new sub-package in the gitlab-api project: instead of fetchDataPaginated, there could be something like fetchDataEnqueueing
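The last point could amount to an API shape along these lines. This is purely a hypothetical sketch: fetchDataEnqueueing does not exist yet, and the parameter types shown here are assumptions about what such a function in the gitlab-api sub-package might need.

```haskell
-- Hypothetical sketch: instead of collecting all pages into a list the
-- way fetchDataPaginated does, fetchDataEnqueueing would push results
-- into a caller-supplied queue as each page arrives. All names and
-- types below are assumptions, not an existing API.
fetchDataEnqueueing ::
  FromJSON a =>
  ApiConfig ->             -- assumed auth/host configuration
  Request ->               -- the paginated request
  TBQueue a ->             -- consumer queue, filled page by page
  IO ()
fetchDataEnqueueing = undefined
```

A design like this would also keep the queue an implementation detail of each command: callers choose the queue size (and thus the backpressure behavior) while the sub-package owns the pagination loop.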