[DRAFT] Scripts for building models#937

Closed
tsmbland wants to merge 1 commit into main from scripts_for_building_models

Conversation

@tsmbland
Collaborator

Description

Please include a summary of the change and which issue is fixed (if any). Please also
include relevant motivation and context. List any dependencies that are required for
this change.

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change to fix an issue)
  • New feature (non-breaking change to add functionality)
  • Refactoring (non-breaking, non-functional change to improve maintainability)
  • Optimization (non-breaking change to speed up the code)
  • Breaking change (whatever its nature)
  • Documentation (improve or add documentation)

Key checklist

  • All tests pass: $ cargo test
  • The documentation builds and looks OK: $ cargo doc

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@codecov

codecov bot commented Oct 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.93%. Comparing base (b80ce22) to head (4819637).
⚠️ Report is 428 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #937   +/-   ##
=======================================
  Coverage   84.93%   84.93%           
=======================================
  Files          50       50           
  Lines        5323     5323           
  Branches     5323     5323           
=======================================
  Hits         4521     4521           
  Misses        574      574           
  Partials      228      228           

☔ View full report in Codecov by Sentry.

@tsmbland
Collaborator Author

tsmbland commented Oct 16, 2025

@alexdewar @dalonsoa This is extremely rough (ignore the horrendous python code), but gets across the idea I have in mind for building the example models programmatically. The main idea is that you can set a base_model in model.toml, then instead of writing full model csv files you can create a series of diff files that describe how to modify the files in the base model. This is inspired by git diff files (same syntax for addition/deletion of lines), but a bit simpler because

  1. Line order is irrelevant (apart from the first line, obviously)
  2. We expect all lines to be unique

I think it would work well for what we need because most changes to models will involve additions/deletions/changes to rows in the csv, rather than columns, as generally all columns are fixed and mandatory (a few exceptions in the case of optional columns where you might want to add a column which would end up changing all rows - not sure what to do about this). Column order is flexible, so I've included the columns in the diff files so we can check this against the base model for safety.
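For illustration, a hypothetical diff file in this format might look like the following (file contents and column names are invented, not taken from the actual PR):

```diff
id,description,unit
-wind,Wind power,PJ
+wind,Onshore wind power,PJ
+solar,Solar PV,PJ
```

The first line repeats the base file's columns so they can be checked against the base model; the -/+ lines delete and add whole rows, and changing a row is expressed as a deletion plus an addition.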

A downside of this format is that it doesn't work with Excel, so maybe we should change the file format slightly to placate Excel users, if this is something we want to provide as a feature for users rather than just developers.

Model changes that involve the toml file (e.g. changing the milestone years) won't be captured here, but this should be easy (e.g. use settings from the base model unless overwritten in the new toml file).

I've built a set of diffs for the example models using a rubbish script which compares two models, which leads to some silly results (e.g. replacing a line to change 1.0 to 1), but gets the idea across. In real life you wouldn't do this - you'd start with your base model, then manually build the diffs on top of this, and obviously you'd only add diffs that are actually meaningful.

Especially for the "missing_commodity" model this makes a lot of sense as it's almost identical to the "simple" model with just a few extra lines, so it doesn't make sense to have a completely separate model with all the data copied, which could potentially get out of sync. "two_regions" is also very similar to "muse1_default". "two_outputs" is a bit more different (based on "simple"), potentially at the point where it's worth being its own model, but I've included the diffs anyway to see what they look like. This would also make a lot of sense for the models in #914 as they're extremely similar to the "simple" model and each other.

If we want to go down this route then we could probably write the code for applying diffs in rust, and support this as a native way of representing a model, i.e. you could run muse2 run directly on a folder of diffs, and it will build the full model under the hood without necessarily having to save the csv files.

What do you think? Good idea? Terrible idea?

@alexdewar
Collaborator

alexdewar commented Oct 16, 2025

As a general idea, I'm not against it, but I'm a bit worried we might end up making our lives harder here. I can see that as we add more and more examples it could be useful to be able to have one example be a "base model" for another, like you suggest. And I think your approach is basically fine, except that it's a hard problem and so there are drawbacks with pretty much any approach.

It's worth remembering that the example files are currently basically part of the code, as they're bundled into the executable file, and I don't think we should change that. I'm not convinced this is functionality that we want end users to have either, because it will make the input format potentially a bit complex and we might end up having to support it indefinitely. There is a use case for wanting to have some kind of templated input for a parameter sweep or something, but I think that's a bigger problem.

I think there are a few ways to go about it:

  1. Have a script to generate examples from templates, but run it manually and commit the result to the repo (like what we do with regression test data)
  2. Generate the models at compile time with a Python script invoked in build.rs
  3. Generate the models at compile time with pure Rust code in build.rs
  4. Generate the models at runtime

For 1: we'd still have to run a script manually and the examples could get out of sync with the script/template files. We could write a pre-commit hook or something to check, but that's more work. If you don't have a working Python installation, you can't update the examples.

For 2: would mean if you don't have a Python installation you can't build MUSE2 locally or install it with cargo install muse2. Or we just don't bundle some of the examples if Python isn't found, but that also seems problematic.

For 3: a better solution than 2, but you can't import anything from the muse2 crate in build.rs (as build.rs is run first), so if we wanted the code to handle the template files to also be in the final executable, we'd have to split it into a separate package that was also published to crates.io.

For 4: possible but will make the code more complex

I feel like we should probably leave this for now until maintaining the examples is becoming a bit more of a problem. If you have to update a few models at a time I don't think that's such a big deal and it is at least explicit. You might also want to deliberately make different changes to different models and that would be a problem for contributors who don't understand our bespoke template format.

I think this is interesting food for thought though! But maybe we should consider all the options a bit more before committing to anything. We did also talk about having more variations on examples that are exclusively used for regression tests and these problems don't apply in the same way to that, so maybe that's somewhere we could try out this approach.

@tsmbland
Collaborator Author

tsmbland commented Oct 16, 2025

Thanks @alexdewar

> As a general idea, I'm not against it, but I'm a bit worried we might end up making our lives harder here. I can see that as we add more and more examples it could be useful to be able to have one example be a "base model" for another, like you suggest. And I think your approach is basically fine, except that it's a hard problem and so there are drawbacks with pretty much any approach.

By "hard problem" do you mean difficult or "hard" in the algorithmic sense? (I don't think it's either to be honest but just wondering what you meant)

> It's worth remembering that the example files are currently basically part of the code, as they're bundled into the executable file, and I don't think we should change that. I'm not convinced this is functionality that we want end users to have either, because it will make the input format potentially a bit complex and we might end up having to support it indefinitely. There is a use case for wanting to have some kind of templated input for a parameter sweep or something, but I think that's a bigger problem.

I actually think this could be very useful for users, and we should encourage this as an approach. Time and time again I've seen MUSE1 users with 10+ models that are all small deviations of one base model, where they've manually copied the base model and made changes. Problem 1 is that it makes it difficult for others to see what perturbations you're testing with each model, or even difficult for you as the model developer to remember this unless you've clearly documented it. Problem 2 is that these models can get out of sync and end up differing in ways that you didn't intend. If you're toying with the base model, to make a change unrelated to the scenarios you're testing, then unless you remember to make the same change in all the other models, those models are no longer just testing the perturbation that you intended, which can lead to very flawed conclusions. I have seen this happen.

Main point: often the very point of a model is to be a specific deviation from a specific base model, and I think we should give users the language to express this.

RE "making the input format more complex" I don't really think so - it's the same format that we currently have just with a "+" or "-" at the beginning of the row. Any approach will involve some new concepts for the users to learn, but I can't really think of anything simpler than this. The massive benefit is that they don't have to write any code. I've currently saved the diff files as ".diff", but they could just as easily be ".csv" with some extra character in position 0,0 to signify that this is a diff file, or rename them to "assets_diff.csv" etc.

If it's a large parameter sweep this is probably less useful - you'd probably have to write a script anyway whether you're writing diff files or whole models. But I think this is less common, I've never actually seen anyone do this in MUSE1.

> I think there are a few ways to go about it:
>
>   1. Have a script to generate examples from templates, but run it manually and commit the result to the repo (like what we do with regression test data)
>   2. Generate the models at compile time with a Python script invoked in build.rs
>   3. Generate the models at compile time with pure Rust code in build.rs
>   4. Generate the models at runtime
>
> For 1: we'd still have to run a script manually and the examples could get out of sync with the script/template files. We could write a pre-commit hook or something to check, but that's more work. If you don't have a working Python installation, you can't update the examples.
>
> For 2: would mean if you don't have a Python installation you can't build MUSE2 locally or install it with cargo install muse2. Or we just don't bundle some of the examples if Python isn't found, but that also seems problematic.
>
> For 3: a better solution than 2, but you can't import anything from the muse2 crate in build.rs (as build.rs is run first), so if we wanted the code to handle the template files to also be in the final executable, we'd have to split it into a separate package that was also published to crates.io.
>
> For 4: possible but will make the code more complex

All good ideas! Given what you've said I think 4 is the best option. I don't think it would make the code much more complex. Maybe a bit fiddly, but once we have the function for merging a base file with a diff file we can use this throughout. I guess it would just take two strings for the source and diff files and return a new string for the merged file.
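A minimal sketch of what that merge function could look like, in Python for brevity (the real thing would presumably be Rust; the function name and the exact header check are my assumptions, and this version requires the diff's header to match the base header exactly rather than allowing flexible column order):

```python
def apply_diff(base: str, diff: str) -> str:
    """Merge a base CSV file (as a string) with a diff file (as a string)
    and return the merged file as a new string.

    Assumptions from the thread: the first line of each file is the
    header, all data lines are unique, and line order is irrelevant.
    """
    base_lines = base.strip().splitlines()
    diff_lines = diff.strip().splitlines()
    header, rows = base_lines[0], set(base_lines[1:])

    # Safety check against the base model, as described in the PR.
    if diff_lines[0] != header:
        raise ValueError("diff header does not match base model header")

    for line in diff_lines[1:]:
        if line.startswith("-"):
            rows.discard(line[1:])  # delete this row from the base
        elif line.startswith("+"):
            rows.add(line[1:])      # add this row
    return "\n".join([header, *sorted(rows)])
```

For example, applying a diff that removes one row and adds another to a two-row base file yields a merged file with the unchanged row plus the new one.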

> I feel like we should probably leave this for now until maintaining the examples is becoming a bit more of a problem.

Perhaps, but it is fast becoming a bigger problem.

> If you have to update a few models at a time I don't think that's such a big deal and it is at least explicit. You might also want to deliberately make different changes to different models and that would be a problem for contributors who don't understand our bespoke template format.

I mean sure, but I think this is a relatively minor barrier for new contributors in the grand scheme of things

> I think this is interesting food for thought though! But maybe we should consider all the options a bit more before committing to anything. We did also talk about having more variations on examples that are exclusively used for regression tests and these problems don't apply in the same way to that, so maybe that's somewhere we could try out this approach.

Great idea! Like what you've suggested in #935 - I don't think it makes sense to add an example model for this, but it would be good to add a test to make sure this is handled properly (probably don't even need to check the results, just make sure it runs to completion, or even just that it passes/fails validation). So yes if we can read in a base model like "simple" and apply changes to it programmatically, that would avoid having to commit an entire set of input files which would be a big win. Maybe I'll start with this then...

@alexdewar
Collaborator

alexdewar commented Oct 16, 2025

> Thanks @alexdewar
>
> > As a general idea, I'm not against it, but I'm a bit worried we might end up making our lives harder here. I can see that as we add more and more examples it could be useful to be able to have one example be a "base model" for another, like you suggest. And I think your approach is basically fine, except that it's a hard problem and so there are drawbacks with pretty much any approach.
>
> By "hard problem" do you mean difficult or "hard" in the algorithmic sense? (I don't think it's either to be honest but just wondering what you meant)

I meant the first, but actually I was mostly concerned about making the build system more complicated. If we go with option 4 though then it'll just be a few extra functions in the code, which is not a big deal at all.

> > It's worth remembering that the example files are currently basically part of the code, as they're bundled into the executable file, and I don't think we should change that. I'm not convinced this is functionality that we want end users to have either, because it will make the input format potentially a bit complex and we might end up having to support it indefinitely. There is a use case for wanting to have some kind of templated input for a parameter sweep or something, but I think that's a bigger problem.
>
> I actually think this could be very useful for users, and we should encourage this as an approach. Time and time again I've seen MUSE1 users with 10+ models that are all small deviations of one base model, where they've manually copied the base model and made changes. Problem 1 is that it makes it difficult for others to see what perturbations you're testing with each model, or even difficult for you as the model developer to remember this unless you've clearly documented it. Problem 2 is that these models can get out of sync and end up differing in ways that you didn't intend. If you're toying with the base model, to make a change unrelated to the scenarios you're testing, then unless you remember to make the same change in all the other models, those models are no longer just testing the perturbation that you intended, which can lead to very flawed conclusions. I have seen this happen.
>
> Main point: often the very point of a model is to be a specific deviation from a specific base model, and I think we should give users the language to express this.
>
> RE "making the input format more complex" I don't really think so - it's the same format that we currently have just with a "+" or "-" at the beginning of the row. Any approach will involve some new concepts for the users to learn, but I can't really think of anything simpler than this. The massive benefit is that they don't have to write any code. I've currently saved the diff files as ".diff", but they could just as easily be ".csv" with some extra character in position 0,0 to signify that this is a diff file, or rename them to "assets_diff.csv" etc.
>
> If it's a large parameter sweep this is probably less useful - you'd probably have to write a script anyway whether you're writing diff files or whole models. But I think this is less common, I've never actually seen anyone do this in MUSE1.

Ok, this is interesting. Maybe it's no bad thing to give users this flexibility, as long as they still have the option to write models with plain old TOML/CSV. It could be in an "advanced" section of the documentation or something.

One thing that's occurred to me is I'm not sure there is a way with the current format to patch the model.toml file itself (i.e. to add/remove/modify params). Is that something we want? One way to do this would be to have a separate template.toml file or whatever that defines base_model and then model.toml can be templated like the CSV files.

> ...
>
> Great idea! Like what you've suggested in #935 - I don't think it makes sense to add an example model for this, but it would be good to add a test to make sure this is handled properly (probably don't even need to check the results, just make sure it runs to completion, or even just that it passes/fails validation). So yes if we can read in a base model like "simple" and apply changes to it programmatically, that would avoid having to commit an entire set of input files which would be a big win. Maybe I'll start with this then...

Cool, let's try this out for tests first then and see how we get on. There are probably loads of extra tests we could/should add this way. It's a bit of a pain defining all the data structures you need in Rust code, just so you can check that one function works correctly. Would be cool if you could write a test along the lines of "build a model based on the simple example with this one extra line and it should fail validation with this message".

@tsmbland
Collaborator Author

> Ok, this is interesting. Maybe it's no bad thing to give users this flexibility, as long as they still have the option to write models with plain old TOML/CSV. It could be in an "advanced" section of the documentation or something.

Yeah yeah definitely not suggesting we take away that option!

> One thing that's occurred to me is I'm not sure there is a way with the current format to patch the model.toml file itself (i.e. to add/remove/modify params). Is that something we want? One way to do this would be to have a separate template.toml file or whatever that defines base_model and then model.toml can be templated like the CSV files.

Yeah we definitely do want this. I think, for example, if you wanted to run the "simple" model with scarcity pricing, ideally you'd just have a folder containing a model.toml file that looks like

```toml
base_model = "examples/simple"
pricing_strategy = "scarcity_adjusted"
```

and it will use all the settings from the base model unless re-defined. If that makes sense

> ...
>
> > Great idea! Like what you've suggested in #935 - I don't think it makes sense to add an example model for this, but it would be good to add a test to make sure this is handled properly (probably don't even need to check the results, just make sure it runs to completion, or even just that it passes/fails validation). So yes if we can read in a base model like "simple" and apply changes to it programmatically, that would avoid having to commit an entire set of input files which would be a big win. Maybe I'll start with this then...
>
> Cool, let's try this out for tests first then and see how we get on. There are probably loads of extra tests we could/should add this way. It's a bit of a pain defining all the data structures you need in Rust code, just so you can check that one function works correctly. Would be cool if you could write a test along the lines of "build a model based on the simple example with this one extra line and it should fail validation with this message".

👍

@alexdewar
Collaborator

> Yeah we definitely do want this. I think, for example, if you wanted to run the "simple" model with scarcity pricing, ideally you'd just have a folder containing a model.toml file that looks like
>
> ```toml
> base_model = "examples/simple"
> pricing_strategy = "scarcity_adjusted"
> ```
>
> and it will use all the settings from the base model unless re-defined. If that makes sense

Yep, makes sense. I just don't know how easy it'll be to get the TOML parsing code to handle this.

It also raises the interesting possibility of having a base model that itself has a base model...

@tsmbland tsmbland requested review from Aurashk and dalonsoa November 3, 2025 11:43
@dalonsoa
Collaborator

dalonsoa commented Nov 5, 2025

I really like the idea, and using a diff syntax like @tsmbland suggests makes it pretty clear, I think. I especially see the advantage of this for end users when they develop slightly different models that still need to be consistent. Diverging models by mistake is a really big problem.

The implementation might be more or less complicated, but I think that's entirely on us and it will be sorted one way or another. So I would focus on making sure that the approach is appropriate for end users. For example, I would suggest the following:

  • Diff files are plain CSV files, but with _diff in the name, matching the name of the file they are modifying, so they can be handled easily by Excel and other tools.
  • The sign (+ or -) should be a column in itself, for the same reason, and is dropped when merging the files.
  • Combined files should not be output unless the user explicitly requests them; otherwise there's a risk of diverging inputs if the user starts modifying those files instead.
  • For cascading base models, I think we just need a recursive function?
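Sketching the second and third bullets (file and column names are hypothetical), a plain-CSV diff with a leading sign column could be applied like this:

```python
import csv
import io


def apply_sign_column_diff(base_csv: str, diff_csv: str) -> str:
    """Apply a diff stored as an ordinary CSV whose first column is a
    sign (+ adds a row, - removes one); the sign column is dropped when
    merging, so both files stay Excel-friendly. A sketch of the
    suggestion above, not actual MUSE2 code."""
    base_rows = list(csv.reader(io.StringIO(base_csv)))
    header, rows = base_rows[0], {tuple(r) for r in base_rows[1:]}
    for record in list(csv.reader(io.StringIO(diff_csv)))[1:]:
        sign, *fields = record
        if sign == "+":
            rows.add(tuple(fields))
        elif sign == "-":
            rows.discard(tuple(fields))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(sorted(rows))
    return out.getvalue()
```

The merged result is never written to disk unless explicitly requested, in line with the third bullet.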

Some initial thoughts, but I definitely see the point of this, maybe more for end users than for the tests, to be honest, although the tests are certainly a beneficiary of this approach.

@tsmbland
Collaborator Author

Thanks for the comments. Closing this draft; #1026 is in progress, taking the suggestions on board.
