
Parallel simulation refactoring #1738

Open

bmalezieux wants to merge 6 commits into sbi-dev:main from bmalezieux:parallel-simulation-refactoring

Conversation


bmalezieux commented Jan 22, 2026

This PR refactors the simulation utilities to improve parallel execution efficiency and modularity.

  • New parallelize_simulator decorator/function: introduces a tool that wraps any simulator to enable parallel execution using joblib.

  • New simulate_from_thetas utility: added a lightweight wrapper around parallelize_simulator that directly executes simulations for a given set of parameters (thetas).

  • Refactored simulate_for_sbi (backward compatibility): updated the internal implementation of simulate_for_sbi to leverage the new simulate_from_thetas backend.
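The intended usage can be sketched as follows (the function signatures here are assumptions based on the description above, and a serial map stands in for the joblib-backed execution, so the sketch is self-contained):

```python
import numpy as np

def parallelize_simulator(simulator, num_workers=1):
    """Wrap a per-sample simulator so it maps over a batch of thetas.

    The real implementation would dispatch batches via joblib.Parallel;
    here we map serially for illustration.
    """
    def batched(thetas):
        return np.stack([simulator(theta) for theta in thetas])
    return batched

def simulate_from_thetas(simulator, thetas, num_workers=1):
    """Run `simulator` for each row of `thetas` and stack the outputs."""
    return parallelize_simulator(simulator, num_workers=num_workers)(thetas)

thetas = np.arange(6, dtype=float).reshape(3, 2)
xs = simulate_from_thetas(lambda t: t.sum(keepdims=True), thetas)
print(xs.shape)  # (3, 1)
```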

Contributor

janfb left a comment


Awesome, looks very good already.

I was wondering whether we would also need to refactor process_simulator accordingly? E.g., we could likely drop the internal torch wrapping and accept numpy simulators (because we convert to numpy again anyway). I would have to look into it in detail, but I think we can simplify the function a bit.

bmalezieux requested a review from janfb January 26, 2026 16:21
bmalezieux marked this pull request as ready for review January 26, 2026 16:21
Contributor

tomMoral left a comment


A few comments.

A question: is it possible to use a return type that is str? It is unclear to me where this would go in the current sbi.

seed: Optional[int] = None,
show_progress_bar: bool = True,
) -> Tuple[Tensor, Tensor]:
) -> Tuple[Tensor, Tensor | List[str] | str]:
Contributor


Shouldn't it be:

Suggested change
) -> Tuple[Tensor, Tensor | List[str] | str]:
) -> Tuple[Theta, X]:

Contributor


Yes, using the already-defined Theta and X type aliases here would be more consistent and readable, since they already capture all the possible return types.

UserWarning,
stacklevel=2,
)
batches = [theta for theta in thetas]
Contributor


Why? In this context, we could use the batch parameter of joblib (useful if initialization of the simulator is slow)

Contributor


Good point. Using joblib's batch parameter would be more efficient here, especially when simulator initialization is expensive. The current loop adds unnecessary overhead.
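A minimal sketch of delegating the chunking to joblib (the `batch_size` and `n_jobs` parameters are joblib's; the simulator and thetas here are illustrative stand-ins): joblib groups consecutive tasks into batches dispatched to each worker, which amortises slow simulator initialisation without pre-chunking thetas by hand.

```python
from joblib import Parallel, delayed

def simulator(theta):
    # Toy per-sample simulator used only for illustration.
    return theta * 2

thetas = list(range(10))

# One delayed call per theta; joblib handles the batching internally.
results = Parallel(n_jobs=1, batch_size=5)(
    delayed(simulator)(theta) for theta in thetas
)
print(results)  # [0, 2, 4, ..., 18]
```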

)
batches = [theta for theta in thetas]
else:
batches = [theta for theta in thetas]
Contributor


Unclear why not keep batches = thetas? The current solution duplicates the thetas in memory, which can be a large overhead for a large number of simulations (probably for a later version; right now everything needs to fit in memory).

Contributor


Agreed. The list comprehension duplicates theta in memory unnecessarily. batches = thetas would avoid that overhead entirely, which becomes significant for large simulation counts.

num_batches = (
num_simulations + simulation_batch_size - 1
) // simulation_batch_size
batches = [
Contributor


I would use an iterator if possible, once again to avoid filling memory and duplicating. In particular, if this is a tensor, use split?

Contributor


Using torch.split(thetas, simulation_batch_size) would return views rather than copying data, keeping peak memory constant regardless of dataset size. This is especially important for large-scale workflows.
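A quick sketch of the view-based batching (variable names follow the surrounding diff but are assumptions here): torch.split returns narrowed views into the original tensor rather than copies, so peak memory stays constant no matter how many batches are formed.

```python
import torch

thetas = torch.arange(10.0).reshape(10, 1)
simulation_batch_size = 4

# Returns a tuple of views; the last batch holds the remainder.
batches = torch.split(thetas, simulation_batch_size)
print([tuple(b.shape) for b in batches])  # [(4, 1), (4, 1), (2, 1)]

# The first batch shares storage with `thetas` instead of duplicating it.
assert batches[0].data_ptr() == thetas.data_ptr()
```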


# Run in parallel
# Generate seeds
batch_seeds = np.random.randint(low=0, high=1_000_000, size=(len(batches),))
Contributor


We need to be able to seed this part, no?

Contributor


Agreed, the seed for generating batch_seeds should itself be derived from the user-provided seed argument to ensure full reproducibility. Currently if a user passes seed=42 the batch seed generation is still non-deterministic.
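One way this could look (the helper name and the choice of numpy's Generator API are assumptions for illustration): derive the per-batch seeds from the user-provided seed, so the same seed always yields the same batch seeds, while seed=None keeps today's non-deterministic behaviour.

```python
import numpy as np

def make_batch_seeds(seed, num_batches):
    # Seeded iff the user supplied a seed; None falls back to OS entropy.
    rng = np.random.default_rng(seed)
    return rng.integers(low=0, high=1_000_000, size=num_batches)

a = make_batch_seeds(42, num_batches=4)
b = make_batch_seeds(42, num_batches=4)
assert (a == b).all()  # same user seed -> identical batch seeds
```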

Comment on lines +79 to +81
context = parallel_config(n_jobs=num_workers)

with context:
Contributor


Suggested change
context = parallel_config(n_jobs=num_workers)
with context:
with parallel_config(n_jobs=num_workers):


# 1. No arguments
@parallelize_simulator
def decorated_sim(theta):
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use simple_simulator_torch which is the same?

Contributor


Good catch. Reusing simple_simulator_torch here would reduce duplication in the test suite and make it clearer that the decorator test is testing the wrapping behaviour rather than a specific simulator implementation.

4 participants