Turn off the ironing out loop by default by tsmbland · Pull Request #1170 · EnergySystemsModellingLab/MUSE2

tsmbland · 2026-03-04T14:02:34Z

Description

Turns off the ironing out loop by default (max_ironing_out_iterations = 1), as requested by @ahawkes, but maintains a regression test for a patched version of the simple model with the loop turned on (iterations = 10, the previous default). To do this I had to modify the patching API to allow patching the example model's model.toml (this will also be useful for #1105)

Turns out for the "simple" example this doesn't have any impact, but for others it does

Fixes #1165

Type of change

Bug fix (non-breaking change to fix an issue)
New feature (non-breaking change to add functionality)
Refactoring (non-breaking, non-functional change to improve maintainability)
Optimization (non-breaking change to speed up the code)
Breaking change (whatever its nature)
Documentation (improve or add documentation)

Key checklist

All tests pass: $ cargo test
The documentation builds and looks OK: $ cargo doc
Update release notes for the latest release if this PR adds a new feature or fixes a bug
present in the previous release

Further checks

Code is commented, particularly in hard-to-understand areas
Tests added that prove fix is effective or that feature works

codecov · 2026-03-04T14:05:27Z

Codecov Report

❌ Patch coverage is 93.75000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.35%. Comparing base (04b86f7) to head (55ef9ae).

Files with missing lines	Patch %	Lines
src/cli/example.rs	85.71%	0 Missing and 1 partial ⚠️
src/example/patches.rs	96.00%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
+ Coverage   87.30%   87.35%   +0.04%     
==========================================
  Files          57       57              
  Lines        7706     7719      +13     
  Branches     7706     7719      +13     
==========================================
+ Hits         6728     6743      +15     
+ Misses        673      671       -2     
  Partials      305      305

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull request overview

This PR changes the default behavior of the model “ironing out” loop by reducing max_ironing_out_iterations from 10 to 1, and updates expected regression outputs accordingly. It also extends the patched-example mechanism so a patched example can optionally include a model.toml patch (intended to preserve coverage of the loop-enabled configuration via a patched “simple” model).

Changes:

Set the default max_ironing_out_iterations to 1 (and update the input schema default accordingly).
Extend the patched-example registry to include an optional TOML patch, adding a simple_ironing_out patched example that sets iterations back to 10.
Refresh a large set of regression “golden” CSV outputs to match the new default behavior; add new golden outputs for simple_ironing_out.

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/data/two_regions/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/two_regions/assets.csv	Updated expected asset outputs under the new default iteration count.
tests/data/two_outputs/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/two_outputs/assets.csv	Updated expected asset outputs under the new default iteration count.
tests/data/simple_npv/assets.csv	Updated expected asset outputs for the NPV patched scenario under the new default.
tests/data/simple_ironing_out/commodity_prices.csv	New golden output for the patched “simple” model with ironing-out enabled (iterations=10).
tests/data/simple_ironing_out/commodity_flows.csv	New golden output for the patched “simple” model with ironing-out enabled (iterations=10).
tests/data/simple_ironing_out/assets.csv	New golden output for the patched “simple” model with ironing-out enabled (iterations=10).
tests/data/simple/debug_solver.csv	Updated debug solver expectations reflecting fewer iterations by default.
tests/data/simple/debug_dispatch_assets.csv	Updated debug dispatch expectations reflecting fewer iterations by default.
tests/data/simple/debug_commodity_balance_duals.csv	Updated debug dual expectations reflecting fewer iterations by default.
tests/data/simple/debug_appraisal_results.csv	Updated appraisal debug expectations reflecting fewer iterations by default.
tests/data/muse1_default/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/muse1_default/commodity_flows.csv	Updated expected commodity flow outputs under the new default iteration count.
tests/data/muse1_default/assets.csv	Updated expected asset outputs under the new default iteration count.
tests/data/circularity/commodity_prices.csv	Updated expected commodity price outputs under the new default iteration count.
tests/data/circularity/assets.csv	Updated expected asset outputs under the new default iteration count.
src/model/parameters.rs	Changes the default `max_ironing_out_iterations` from 10 to 1.
src/example/patches.rs	Extends patch registry to include optional TOML patch; adds `simple_ironing_out` patch entry.
src/cli/example.rs	Updates CLI patched-example extraction to apply file patches + optional TOML patch.
schemas/input/model.yaml	Updates schema default for `max_ironing_out_iterations` to 1.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot · 2026-03-04T14:18:43Z

src/cli/example.rs

 fn extract_example(name: &str, patch: bool, dest: &Path) -> Result<()> {
    if patch {
-        let patches = get_patches(name)?;
+        let (file_patches, toml_patch) = get_patches(name)?;


get_patches(name)? returns a reference to a (Vec<FilePatch>, Option<String>), but this code destructures it as if it were an owned tuple. This won’t compile (type mismatch &(…) vs (…)). Either bind the returned reference and access its fields, or change get_patches to return an owned tuple (e.g., cloned) so destructuring works.

Suggested change

let (file_patches, toml_patch) = get_patches(name)?;

let (file_patches, toml_patch) = get_patches(name)?.clone();

Well it does compile...

src/example/patches.rs

tsmbland · 2026-03-04T14:24:50Z

@alexdewar Now getting slightly different results on macos which is concerning. I seem to remember something like this has happened before...?

alexdewar · 2026-03-04T16:36:34Z

@alexdewar Now getting slightly different results on macos which is concerning. I seem to remember something like this has happened before...?

These differences are pretty large too:

thread 'two_regions' (11401) panicked at tests/regression.rs:78:5:
The following errors occurred:
  * commodity_flows.csv: line 751:
    + "2045,8,gas,all-year.night,-0.86000992"
    - "2045,8,gas,all-year.night,-1.39307392"
  * commodity_flows.csv: line 752:
    + "2045,8,electricity,all-year.night,0.514976"
    - "2045,8,electricity,all-year.night,0.834176"
  * commodity_flows.csv: line 753:
    + "2045,8,CO2f,all-year.night,47.20784992"
    - "2045,8,CO2f,all-year.night,76.46891392"
  * commodity_flows.csv: line 955:
    + "2045,25,gas,all-year.night,-0.5330640000000001"
    - "2045,25,gas,all-year.night,-0.0"
  * commodity_flows.csv: line 956:
    + "2045,25,electricity,all-year.night,0.3192000000000001"
    - "2045,25,electricity,all-year.night,0.0"
  * commodity_flows.csv: line 957:
    + "2045,25,CO2f,all-year.night,29.261064000000008"
    - "2045,25,CO2f,all-year.night,0.0"

Not sure what's going on here. If you want to compare the macOS results with e.g. the Windows ones, we upload them as test artifacts you can see on the summary page: https://github.com/EnergySystemsModellingLab/MUSE2/actions/runs/22673536177?pr=1170

tsmbland · 2026-03-05T11:24:28Z

The difference concerns electricity supply in the night timeslice in 2045 by two assets (8 and 25):

on mac: 0.32 electricity supply by asset 25, 0.51 by asset 8 (= 0.83 total)
on windows/ubuntu: 0 electricity supply by asset 25, 0.83 by asset 8 (= 0.83 total)

These are both gasCCGT assets in R2, and although the commission year is different, the parameters for these assets are identical. Therefore, as far as the objective is concerned, the two solutions are equally good. It's probably the case for a lot of optimisations that there's no single optimal solution, rather a space of equally good solutions, but 99.9% of the time highs seems to pick the same solution every time, regardless of OS (I think this always ends up being a "corner" solution, i.e. where one or more parameters in the space is at its limit. The two solutions above are different corner solutions where supply by asset 25 is at its upper and lower limit respectively).

I don't know the factors that cause it to choose one optimum over another, but it seems that tiny floating-point differences between platforms can have an impact (perhaps tilting the energy landscape slightly so that one solution becomes ever-so-slightly favourable over another, or affecting pivot decisions in the optimisation path so it lands on a different vertex).

The best solution I can think of is to modify the objective so that we can guarantee a single optimum solution. One idea would be L2 regularisation, adding a small epsilon-scaled term to the objective to minimise the sum of squares of the variables. Generally this favours smaller values over large "extreme" values, which for our purposes means spreading activity over multiple assets rather than allocating all activity to a single asset - probably not a bad thing to aim for in any case. (In this particular example, the mac solution would be the single unique optimum).

In practice I'm not sure how easy this would be. highs-sys has Highs_passHessian which I believe makes this possible, but this isn't exposed in the highs crate we're using. I believe it's also incompatible with integer variables, which we are using in some situations, but I think we could circumvent this with a two-step approach. This is all based on a conversation with AI so take with a pinch of salt.

Apart from that I don't really have any other ideas...

tsmbland · 2026-03-05T11:37:41Z

Annoying that this PR is getting blocked by something completely unrelated! For the time being, do you think it would be ok to turn off this particular regression test until we can get this fixed? What else can we do...?

tsmbland · 2026-03-05T11:39:39Z

Or revert this particular example model back to using the ironing out loop and update it later?

alexdewar · 2026-03-05T12:50:43Z

The difference concerns electricity supply in the night timeslice in 2045 by two assets (8 and 25):

on mac: 0.32 electricity supply by asset 25, 0.51 by asset 8 (= 0.83 total)

on windows/ubuntu: 0 electricity supply by asset 25, 0.83 by asset 8 (= 0.83 total)

These are both gasCCGT assets in R2, and although the commission year is different, the parameters for these assets are identical. Therefore, as far as the objective is concerned, the two solutions are equally good. It's probably the case for a lot of optimisations that there's no single optimal solution, rather a space of equally good solutions, but 99.9% of the time highs seems to pick the same solution every time, regardless of OS (I think this always ends up being a "corner" solution, i.e. where one or more parameters in the space is at its limit. The two solutions above are different corner solutions where supply by asset 25 is at its upper and lower limit respectively).

I don't know the factors that cause it to choose one optimum over another, but it seems that tiny floating-point differences between platforms can have an impact (perhaps tilting the energy landscape slightly so that one solution becomes ever-so-slightly favourable over another, or affecting pivot decisions in the optimisation path so it lands on a different vertex).

This sounds plausible.

The best solution I can think of is to modify the objective so that we can guarantee a single optimum solution. One idea would be L2 regularisation, adding a small epsilon-scaled term to the objective to minimise the sum of squares of the variables. Generally this favours smaller values over large "extreme" values, which for our purposes means spreading activity over multiple assets rather than allocating all activity to a single asset - probably not a bad thing to aim for in any case. (In this particular example, the mac solution would be the single unique optimum).

In practice I'm not sure how easy this would be. highs-sys has Highs_passHessian which I believe makes this possible, but this isn't exposed in the highs crate we're using. I believe it's also incompatible with integer variables, which we are using in some situations, but I think we could circumvent this with a two-step approach. This is all based on a conversation with AI so take with a pinch of salt.

Apart from that I don't really have any other ideas...

This is an interesting idea, but probably quite a lot of work and, as you say, out of scope for this PR anyway.

As an aside, you can actually call the functions in the lower-level highs-sys crate by getting the pointer from your Model struct and passing that in (it'll have to be in an unsafe block), but if we know we want this functionality it would be cleaner to open a PR for the highs crate adding it (I did that to get the objective value thing added).

Annoying that this PR is getting blocked by something completely unrelated! For the time being, do you think it would be ok to turn off this particular regression test until we can get this fixed? What else can we do...?

How about we just skip this test on non-x86 platforms for now?

This should work:

diff --git a/tests/regression.rs b/tests/regression.rs
index 41c933b9..4307bc3e 100644
--- a/tests/regression.rs
+++ b/tests/regression.rs
@@ -26,6 +26,9 @@ define_regression_test!(circularity);
 // Patched examples
 define_regression_test_with_patches!(simple_divisible);
 define_regression_test_with_patches!(simple_npv);
+
+// We get different results on ARM Macs, for reasons that aren't clear
+#[cfg(target_arch = "x86_64")]
 define_regression_test_with_patches!(simple_ironing_out);
 
 // ------  END: regression tests  ------

It would be better if we opened an issue and linked to it in the comment. It would be nice if the examples gave similar results on all the platforms we care about, but it probably shouldn't be a high priority to fix this. But let's keep track of it anyway!

tsmbland · 2026-03-05T13:19:08Z

I'm going to hold this until #1173. That will be another major change to all of the example models, and there's every chance that that will magically fix the issue here. (Or, it could just as likely introduce more issues just like this...)

In any case, this does reveal an underlying issue, so I've opened #1174

tsmbland added 5 commits March 4, 2026 13:45

Allow toml patching for regression tests

758d3bc

Add patch for the simple model with the ironing out loop turned on

8c33b9e

Change default to 1 iteration (no loop)

fabf5ce

Update results files

cff3255

Fix typo

767205e

tsmbland marked this pull request as ready for review March 4, 2026 14:13

Copilot AI review requested due to automatic review settings March 4, 2026 14:13

Copilot started reviewing on behalf of tsmbland March 4, 2026 14:14 View session

Copilot AI reviewed Mar 4, 2026

View reviewed changes

Add regression test for new model

55ef9ae

tsmbland requested review from Aurashk and alexdewar March 4, 2026 14:25

tsmbland mentioned this pull request Mar 5, 2026

Potential for different dispatch results on different platforms #1174

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Turn off the ironing out loop by default#1170

Turn off the ironing out loop by default#1170
tsmbland wants to merge 6 commits intomainfrom
ironing_out_off

tsmbland commented Mar 4, 2026 •

edited

Loading

Uh oh!

codecov bot commented Mar 4, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 4, 2026

Uh oh!

tsmbland Mar 4, 2026

Uh oh!

Uh oh!

tsmbland commented Mar 4, 2026

Uh oh!

alexdewar commented Mar 4, 2026

Uh oh!

tsmbland commented Mar 5, 2026 •

edited

Loading

Uh oh!

tsmbland commented Mar 5, 2026

Uh oh!

tsmbland commented Mar 5, 2026

Uh oh!

alexdewar commented Mar 5, 2026

Uh oh!

tsmbland commented Mar 5, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	let (file_patches, toml_patch) = get_patches(name)?;
	let (file_patches, toml_patch) = get_patches(name)?.clone();

Conversation

tsmbland commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Key checklist

Further checks

Uh oh!

codecov bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

tsmbland Mar 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tsmbland commented Mar 4, 2026

Uh oh!

alexdewar commented Mar 4, 2026

Uh oh!

tsmbland commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tsmbland commented Mar 5, 2026

Uh oh!

tsmbland commented Mar 5, 2026

Uh oh!

alexdewar commented Mar 5, 2026

Uh oh!

tsmbland commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tsmbland commented Mar 4, 2026 •

edited

Loading

codecov bot commented Mar 4, 2026 •

edited

Loading

tsmbland commented Mar 5, 2026 •

edited

Loading

tsmbland commented Mar 5, 2026 •

edited

Loading