Impact of shuffling on runtimes

The nightly runs before and after https://github.com/fmidue/modelling-tasks/commit/7489112bdb3017f0196e4dcacce80403c7ee79af show wildly varying runtimes for some tests. Of course there will always be variation, but here the changes reach a factor of about 3.6.

On the four `nightly`-marked tests for `Reach.Deadlock` tasks:
- from 140436ms to 330499ms
- from 4165913ms to 5886167ms
- from 517000ms to 731504ms
- from 13292950ms to 3688968ms

There is also a small impact on the `nightly`-marked `Reach.Reach` test, but it seems negligible (in this particular run).

Of the above times, the first three go up, while the last one goes down drastically, by more than two and a half hours. And indeed the overall nightly run went from about 4 hours to about 2 hours. It doesn't seem to be a fluke, since the nightly runs before that day already show a pattern: the overall time went from under 2 hours to about 4 hours when https://github.com/fmidue/modelling-tasks/commit/2c88b542055656db6050c055746810094784ab60 was merged, and stayed there until https://github.com/fmidue/modelling-tasks/commit/7489112bdb3017f0196e4dcacce80403c7ee79af was merged. In fact, the short-circuiting at https://github.com/fmidue/modelling-tasks/blob/7489112bdb3017f0196e4dcacce80403c7ee79af/src/Modelling/PetriNet/Reach/Roll.hs#L201-L202 was introduced precisely because I wanted to avoid some unnecessary overhead in `Reach.Reach` tasks, and in `Reach.Deadlock` tasks where `fusableTransitionsConsumingAreExactly` and `fusableTransitionsProducingAreExactly` are each `Nothing` or `Just 0`.

But I hadn't expected it to have that much of an impact, really, because ultimately all that is saved seems to boil down to these two `shuffleM` calls: https://github.com/fmidue/modelling-tasks/blob/7489112bdb3017f0196e4dcacce80403c7ee79af/src/Modelling/PetriNet/Reach/Roll.hs#L154-L155

That raises the question, for further analysis, of whether the effect on runtime is reproducible.
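One way to start checking reproducibility outside the nightly run is to time `shuffleM` in isolation. The following is only a minimal sketch: `timeShuffle` is a hypothetical helper, the list sizes are arbitrary, and wall-clock timing of a single run is crude compared to a proper benchmark. It says nothing yet about how the calls behave inside the actual task generators.

```haskell
module Main where

import Control.Exception (evaluate)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import System.Random.Shuffle (shuffleM)

-- Hypothetical helper: shuffle a list of n Ints once and report
-- the elapsed wall-clock time.
timeShuffle :: Int -> IO ()
timeShuffle n = do
  start <- getCurrentTime
  shuffled <- shuffleM [1 .. n]
  -- Summing forces every element, so the shuffle is really performed
  -- before the second timestamp is taken.
  _ <- evaluate (sum shuffled)
  end <- getCurrentTime
  putStrLn (show n ++ " elements: " ++ show (diffUTCTime end start))

main :: IO ()
main = mapM_ timeShuffle [1000, 10000, 100000]
```

If the isolated cost turns out to be small for realistic list sizes, the runtime difference seen in the nightly runs would have to come from somewhere else, for example from how often the surrounding generator is retried.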

Moreover, it calls into question the liberal use of `shuffleM` throughout the code base, in particular where it is used "just" to get a few random elements from a list (as is the case above) rather than because the whole permuted list is needed. Optimizing that sampling case could improve runtime not only for the special cases above but more generally (also when `fusableTransitionsConsumingAreExactly` and `fusableTransitionsProducingAreExactly` are set to positive numbers, and also in other places in the code).

That might involve switching (in certain places) to another algorithm, e.g., using `sample_` from `list-shuffle` instead of `shuffleM` from `random-shuffle`, and also having to use `MonadSplit` in order to have a generator at hand to pass to `sample_`.
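As a rough sketch of what such a switch could look like: the helper `sampleM` below is hypothetical (it does not exist in the code base), and the sketch assumes that `sample_` from `List.Shuffle` takes the sample size, the list, and a generator, in that order, and that `getSplit` comes from the `MonadSplit` class of `MonadRandom`.

```haskell
module Main where

import Control.Monad.Random (MonadSplit, evalRand, getSplit, mkStdGen)
import List.Shuffle (sample_)
import System.Random (RandomGen)

-- Hypothetical helper: draw k random elements from a list without
-- materialising a full permutation. The generator handed to sample_
-- is split off from the ambient monad via getSplit.
sampleM :: (MonadSplit g m, RandomGen g) => Int -> [a] -> m [a]
sampleM k xs = do
  gen <- getSplit
  return (sample_ k xs gen)

main :: IO ()
main = do
  -- Fixed seed only to make this demonstration repeatable.
  let picked = evalRand (sampleM 2 "abcdef") (mkStdGen 42)
  print picked
```

A call such as `take k <$> shuffleM xs` could then be replaced by `sampleM k xs` wherever only the first `k` elements of the permutation are used, at the cost of a `MonadSplit` constraint on the surrounding code.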

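Referenced short-circuit (`Roll.hs`, lines 201-202):

	if requiredFusableTransitionsConsuming == 0 && requiredFusableTransitionsProducing == 0
	then return ([], BM.empty, BM.empty)

Referenced `shuffleM` calls (`Roll.hs`, lines 154-155):

	shuffledTransitions <- shuffleM allTransitions
	shuffledPlaces <- shuffleM allPlaces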
