Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #937   +/-  ##
=======================================
  Coverage   84.93%   84.93%
=======================================
  Files          50       50
  Lines        5323     5323
  Branches     5323     5323
=======================================
  Hits         4521     4521
  Misses        574      574
  Partials      228      228
```

View full report in Codecov by Sentry.
@alexdewar @dalonsoa This is extremely rough (ignore the horrendous python code), but gets across the idea I have in mind for building the example models programmatically. The main idea is that you can set a
I think it would work well for what we need because most changes to models will involve additions/deletions/changes to rows in the csv, rather than columns, as generally all columns are fixed and mandatory (a few exceptions in the case of optional columns where you might want to add a column which would end up changing all rows - not sure what to do about this). Column order is flexible, so I've included the columns in the diff files so we can check this against the base model for safety.

Downside of this format is that it doesn't work with excel, so maybe we should change the file format slightly to placate excel users, if this is something we want to provide as a feature for users, rather than just developers. Model changes that involve the toml file (e.g. changing the milestone years) won't be captured here, but this should be easy (e.g. use settings from the base model unless overwritten in the new toml file).

I've built a set of diffs for the example models using a rubbish script which compares two models, which leads to some silly results (e.g. replacing a line to change 1.0 to 1), but gets the idea across. In real life you wouldn't do this - you'd start with your base model, then manually build the diffs on top of this, and obviously you'd only add diffs that are actually meaningful.

Especially for the "missing_commodity" model this makes a lot of sense as it's almost identical to the "simple" model with just a few extra lines, so it doesn't make sense to have a completely separate model with all the data copied, which could potentially get out of sync. "two_regions" is also very similar to "muse1_default". "two_outputs" is a bit more different (based on "simple"), potentially at the point where it's worth being its own model, but I've included the diffs anyway to see what they look like. This would also make a lot of sense for the models in #914 as they're extremely similar to the "simple" model and each other.
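To make the proposal concrete, a diff file of the kind described above might look something like the following (the file name, column names and rows here are purely illustrative, not the real example-model data): a header row that must match the base file's columns, then rows prefixed with "+" to add and "-" to remove.

```diff
commodity_id,description,kind
-DRYBIO,Dry biomass,SED
+WETBIO,Wet biomass,SED
+WETBIO2,Wet biomass variant,SED
```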
If we want to go down this route then we could probably write the code for applying diffs in Rust, and support this as a native way of representing a model, i.e. you could run

What do you think? Good idea? Terrible idea?
As a general idea, I'm not against it, but I'm a bit worried we might end up making our lives harder here. I can see that as we add more and more examples it could be useful to be able to have one example be a "base model" for another, like you suggest. And I think your approach is basically fine, except that it's a hard problem and so there are drawbacks with pretty much any approach.

It's worth remembering that the example files are currently basically part of the code, as they're bundled into the executable file, and I don't think we should change that. I'm not convinced this is functionality that we want end users to have either, because it will make the input format potentially a bit complex and we might end up having to support it indefinitely. There is a use case for wanting to have some kind of templated input for a parameter sweep or something, but I think that's a bigger problem.

I think there are a few ways to go about it:
For 1: we'd still have to run a script manually and the examples could get out of sync with the script/template files. We could write a pre-commit hook or something to check, but that's more work. If you don't have a working Python installation, you can't update the examples.

For 2: would mean if you don't have a Python installation you can't build MUSE2 locally or install it with

For 3: a better solution than 2, but you can't import anything from the

For 4: possible but will make the code more complex

I feel like we should probably leave this for now until maintaining the examples is becoming a bit more of a problem. If you have to update a few models at a time I don't think that's such a big deal and it is at least explicit. You might also want to deliberately make different changes to different models and that would be a problem for contributors who don't understand our bespoke template format.

I think this is interesting food for thought though! But maybe we should consider all the options a bit more before committing to anything. We did also talk about having more variations on examples that are exclusively used for regression tests and these problems don't apply in the same way to that, so maybe that's somewhere we could try out this approach.
Thanks @alexdewar
By "hard problem" do you mean difficult or "hard" in the algorithmic sense? (I don't think it's either to be honest but just wondering what you meant)
I actually think this could be very useful for users, and we should encourage this as an approach. Time and time again I've seen MUSE1 users with 10+ models that are all small deviations of one base model, where they've manually copied the base model and made changes.

Problem 1 is that it makes it difficult for others to see what perturbations you're testing with each model, or even difficult for you as the model developer to remember this unless you've clearly documented it.

Problem 2 is that these models can get out of sync and end up differing in ways that you didn't intend. If you're toying with the base model, to make a change unrelated to the scenarios you're testing, then unless you remember to make the same change in all the other models, those models are no longer just testing the perturbation that you intended, which can lead to very flawed conclusions. I have seen this happen.

Main point: often the very point of a model is to be a specific deviation from a specific base model, and I think we should give users the language to express this.

RE "making the input format more complex" I don't really think so - it's the same format that we currently have, just with a "+" or "-" at the beginning of the row. Any approach will involve some new concepts for the users to learn, but I can't really think of anything simpler than this. The massive benefit is that they don't have to write any code. I've currently saved the diff files as ".diff", but they could just as easily be ".csv" with some extra character in position 0,0 to signify that this is a diff file, or rename them to "assets_diff.csv" etc.

If it's a large parameter sweep this is probably less useful - you'd probably have to write a script anyway whether you're writing diff files or whole models. But I think this is less common; I've never actually seen anyone do this in MUSE1.
All good ideas! Given what you've said I think 4 is the best option. I don't think it would make the code much more complex. Maybe a bit fiddly, but once we have the function for merging a base file with a diff file we can use this throughout. I guess it would just take two strings for the source and diff files and return a new string for the merged file.
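As a rough sketch of the merging function described above (two strings in, one merged string out), something like the following could work. This is a hypothetical prototype in Python rather than the eventual Rust implementation; the function name, the "+"/"-" row semantics and the header-matching check are all assumptions based on the format discussed in this thread.

```python
def apply_diff(base: str, diff: str) -> str:
    """Apply a CSV diff file to a base CSV file, returning the merged CSV.

    The diff's header row must match the base file's header; rows prefixed
    with "+" are added and rows prefixed with "-" are removed.
    """
    base_lines = base.strip().splitlines()
    diff_lines = diff.strip().splitlines()
    header, *changes = diff_lines
    # Safety check: the diff must target the same columns as the base file
    if header != base_lines[0]:
        raise ValueError("diff header does not match base file columns")
    rows = base_lines[1:]
    for change in changes:
        op, row = change[0], change[1:]
        if op == "+":
            rows.append(row)
        elif op == "-":
            rows.remove(row)  # raises ValueError if the row is absent
        else:
            raise ValueError(f"unknown diff operation: {change!r}")
    return "\n".join([base_lines[0], *rows]) + "\n"
```

Row order and duplicate handling would need pinning down (this sketch removes the first matching row and appends additions at the end), but the core operation really is this small.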
Perhaps, but it is fast becoming a bigger problem.
I mean sure, but I think this is a relatively minor barrier for new contributors in the grand scheme of things
Great idea! Like what you've suggested in #935 - I don't think it makes sense to add an example model for this, but it would be good to add a test to make sure this is handled properly (probably don't even need to check the results, just make sure it runs to completion, or even just that it passes/fails validation). So yes if we can read in a base model like "simple" and apply changes to it programmatically, that would avoid having to commit an entire set of input files which would be a big win. Maybe I'll start with this then...
I meant the first, but actually I was mostly concerned about making the build system more complicated. If we go with option 4 though then it'll just be a few extra functions in the code, which is not a big deal at all.
Ok, this is interesting. Maybe it's no bad thing to give users this flexibility, as long as they still have the option to write models with plain old TOML/CSV. It could be in an "advanced" section of the documentation or something. One thing that's occurred to me is I'm not sure there is a way with the current format to patch the
Cool, let's try this out for tests first then and see how we get on. There are probably loads of extra tests we could/should add this way. It's a bit of a pain defining all the data structures you need in Rust code, just so you can check that one function works correctly. Would be cool if you could write a test along the lines of "build a model based on the
Yeah yeah definitely not suggesting we take away that option!
Yeah we definitely do want this. I think, for example, if you wanted to run the "simple" model with scarcity pricing, ideally you'd just have a folder containing a

```toml
base_model = "examples/simple"
pricing_strategy = "scarcity_adjusted"
```

and it will use all the settings from the base model unless re-defined. If that makes sense
👍
Yep, makes sense. I just don't know how easy it'll be to get the TOML parsing code to handle this. It also raises the interesting possibility of having a base model that itself has a base model...
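Chained base models would just mean resolving settings recursively, with a guard against cycles. A hypothetical sketch (the `base_model` key, the `load_settings` callback and the merge semantics are all assumptions carried over from the discussion above):

```python
def resolve(model: str, load_settings, seen=None) -> dict:
    """Resolve a model's settings, following base_model pointers recursively.

    load_settings(model) returns that model's own settings as a dict;
    derived settings override the (recursively resolved) base settings.
    Raises ValueError if the base_model chain contains a cycle.
    """
    seen = set() if seen is None else seen
    if model in seen:
        raise ValueError(f"cycle in base models involving: {model}")
    seen.add(model)
    settings = dict(load_settings(model))
    base = settings.pop("base_model", None)
    if base is not None:
        settings = {**resolve(base, load_settings, seen), **settings}
    return settings
```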
I really like the idea, and using a diff syntax like @tsmbland suggests I think makes it pretty clear. I especially see the advantage of using this by end users when they develop slightly different models but still need for them to be consistent. Diverging models by mistake is a really big problem. The implementation might be more or less complicated but I think that's entirely on us and it will be sorted, one way or another. So I would focus on making sure that the approach is appropriate for end users. For example, I would suggest the following:
Some initial thoughts, but I definitely see the point of this, maybe more for end users than for the tests, to be honest, although the tests are certainly a beneficiary of this approach.
Thanks for the comments. Closing this draft; #1026 is in progress, taking the suggestions on board.
Description
Please include a summary of the change and which issue is fixed (if any). Please also
include relevant motivation and context. List any dependencies that are required for
this change.
Fixes # (issue)
Type of change
Key checklist
$ cargo test
$ cargo doc

Further checks