Scripts to generate tutorial models #279
Conversation
dalonsoa
left a comment
I love it! This sort of wizard would make the tutorials more robust to changes and life much easier for people (especially newcomers). Power users can always edit things manually, but the entry level should be as low as possible.
I'm more than happy with this approach. Let's see what @alexdewar says.
On the input data, I totally agree that the current structure of CSV files is horrible, but I don't feel a database is an option. Inputs are often put together in Excel and, in any case, should be human-readable.
I'd suggest you code the first tutorial only, for now, to see what things would look like in practice, and then we can decide on the best way forward.
I love it too! I think a wizard along the lines you're suggesting is eminently sensible. As we talked about in the meeting with Adam, I think it would be nice to have a tool for setting input parameters in general, even when not copying from an existing template -- as someone who understands MUSE fairly poorly atm I'm not sure if it makes sense for this to be a separate task? Another thing is: do we want the option of having a graphical interface for this? Or just a terminal one? A graphical interface may be more user-friendly (particularly for non-technical MUSE users), but it isn't necessarily something to do now. I guess our options are:
I'm leaning towards 3.
I've had a go at this for the first tutorial. See the folder, in which I've marked all the steps and created a file. For now though, do you think it's worth going ahead with this for the other tutorials?
dalonsoa
left a comment
I've given this a try and it works really well. I think the main benefit is that it ensures consistency with the base model (the default one) so if one changes, the rest would be updated automatically.
A couple of questions I have are:
- If another tutorial builds on the model generated by, let's say, scenario 1, would they need a generate file that builds everything from scratch, or can they actually take the files for scenario 1 as the starting point?
- When running the generate tutorial script, I get something slightly different to what was already in the repo. See attached. It's merely a formatting thing - it does not affect functionality - but it will freak out the version control system because that file will look modified. It happens whenever there's a list. I guess that when the toml file is saved, it follows some convention which is different to the convention it originally had. Or it might just be my computer...

| """ | ||
| model_name = "1-introduction" | ||
| parent_path = os.path.dirname(os.path.abspath(sys.argv[0])) |
Are you trying to get the directory of this file? If so, maybe I would do:

```python
from pathlib import Path
...
parent_path = Path(__file__).parent
```

Unless there's a reason why pathlib doesn't work here.
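As a quick standalone sanity check (not part of the wizard; the path is purely illustrative), the pathlib form and the os.path form give the same parent directory for a given path string:

```python
import os.path
from pathlib import Path

# hypothetical script path, just for illustration
script = "/home/user/MUSE_OS/docs/generate_model.py"

# pathlib version suggested above
parent_pathlib = Path(script).parent

# os.path version used in the original code
parent_os = os.path.dirname(script)

print(parent_pathlib.as_posix())  # → /home/user/MUSE_OS/docs
print(parent_os)                  # → /home/user/MUSE_OS/docs
```

One difference worth noting: `__file__` always refers to the module itself, whereas `sys.argv[0]` is whatever script was invoked, which may differ if the module is imported rather than run directly.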
I think this should be possible, but we would just have to make sure the scripts are run in the correct order in a loop (and not in parallel). Should be straightforward, although I haven't looked into this yet.
Have you tried running pre-commit? I think this should get rid of the diffs, but I guess it's still quite annoying. In the long run, though, we'll probably add these files to gitignore, so it shouldn't be a problem.
I've not run pre-commit as I had nothing to commit. But you're right - if these files are generated on the fly, there's no reason to keep them in the repo, so we could just gitignore them.
I've gone through and written the necessary scripts for all the tutorials. Within the tutorial-code folder you'll find two new files: generate_models.py and run_models.py.
I wonder if there's a better way of doing these things that doesn't require all these files to be committed? There is still a lot of work to be done on the contents of the notebooks, as there are still many inconsistencies in the text. This PR is more about having a framework in place to generate the model input files programmatically, and I think I'll tackle the contents of the notebooks separately.
dalonsoa
left a comment
This looks really good. A massive piece of work!
I've just added a few comments and flagged the need for tests for the functions in the wizard.
alexdewar
left a comment
This looks great! Sorry it took me so long to get round to reviewing it...
I've made some comments about lists and generators, but it's more of a note for future. Only change it if you can be bothered.
src/muse/wizard.py
```python
files_to_update = [
    model_path / file
    for file in [
        f"technodata/{sector}/CommIn.csv",
        f"technodata/{sector}/CommOut.csv",
        "input/BaseYearImport.csv",
        "input/BaseYearExport.csv",
        "input/Projections.csv",
    ]
] + list((model_path / "technodata/preset").glob("*"))
```
You could also join these two together with itertools.chain and avoid making all those lists, but it's not particularly important.
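For illustration, a hedged sketch of what the itertools.chain version might look like; `sector` and `model_path` are stand-in values here, not the wizard's real inputs:

```python
from itertools import chain
from pathlib import Path

sector = "residential"      # hypothetical sector name
model_path = Path("model")  # hypothetical model directory

# chain concatenates the two iterables lazily, so no intermediate
# list is built for either part
files_to_update = chain(
    (
        model_path / file
        for file in (
            f"technodata/{sector}/CommIn.csv",
            f"technodata/{sector}/CommOut.csv",
            "input/BaseYearImport.csv",
            "input/BaseYearExport.csv",
            "input/Projections.csv",
        )
    ),
    (model_path / "technodata/preset").glob("*"),
)

files = list(files_to_update)
print(files[:2])
```

If the result is only ever iterated once, the surrounding list() call can be dropped too and the chain consumed directly in a for loop.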
src/muse/wizard.py
```python
files_to_update = [
    model_path / file
    for file in [
        f"technodata/{sector}/CommIn.csv",
        f"technodata/{sector}/CommOut.csv",
        f"technodata/{sector}/ExistingCapacity.csv",
        f"technodata/{sector}/Technodata.csv",
    ]
]
```
If you use round brackets you can leave this as a generator expression and then Python doesn't have to create a list object:
```python
files_to_update = (
    model_path / file
    for file in (
        f"technodata/{sector}/CommIn.csv",
        f"technodata/{sector}/CommOut.csv",
        f"technodata/{sector}/ExistingCapacity.csv",
        f"technodata/{sector}/Technodata.csv",
    )
)
```
Again, just a nit, so not important
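A minimal standalone illustration of the difference (unrelated to the wizard itself): a list comprehension materialises every element up front, while a generator expression produces them on demand and can only be consumed once:

```python
# list comprehension: all elements exist immediately
squares_list = [n * n for n in range(5)]
print(squares_list)        # → [0, 1, 4, 9, 16]

# generator expression: elements are produced lazily
squares_gen = (n * n for n in range(5))
print(list(squares_gen))   # → [0, 1, 4, 9, 16]

# a generator is exhausted after one pass, unlike a list
print(list(squares_gen))   # → []
```

The trade-off is that a generator can't be indexed, len()-ed, or re-iterated, so it only suits code that makes a single pass over the files.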
src/muse/wizard.py
| "input/Projections.csv", | ||
| ] | ||
| ] | ||
| for file_path in sector_files + preset_files + global_files: |
Thanks @alexdewar - useful advice!
Description
The documentation contains a number of tutorials for customising models (adding new technologies, regions, agents etc.). Each tutorial documents the processes that the user has to carry out to achieve the desired modification, and shows results for a model that has previously been built. I had a go at following along with the tutorials, starting from the default model and making the required changes step-by-step, and found that my results never matched up with the figures in the notebooks. I think the problem is two-fold:
To fix this, we need to:
This isn't so easy, as the current way of generating models is to copy the default model and then manually edit a load of CSV files, which I really don't want to do (this would also be a problem if the default model changes again). I think the best solution would be, rather than hardcoding the models, to generate them programmatically using a generate_model.py file which would be run every time the documentation is built. (Also with regression tests to check whether the outputs of the notebooks have changed.)
Initial plan (April 23rd)
Towards this goal, I've started by documenting the processes that would be required to generate the models with pseudocode (still a work in progress). I don't intend to turn this into real code just yet; this is just to demonstrate the processes that have to be carried out to customise models. I've created 'functions' for:
All of these could be a good starting point for creating some kind of 'wizard' to carry out these functions for the user. The idea is that adding a new feature (agent, region etc.) could consist of copying an existing feature (which could be automated), followed by a series of manual modifications to the CSV files (the steps beginning with >>>). We could also set up the wizard so it can create a 'blank' feature (with everything set to zero), which the user would then manually modify.
That said, I also think that the current CSV structure of the models could be significantly improved (and possibly even replaced by a relational database), in which case all of this would change, so I don't want to go too far down this route just yet without further conversation.
Update (May 17th)
I've gone through and written the necessary scripts for all the tutorials. Within the tutorial-code folder you'll find two new files: generate_models.py and run_models.py, both of which do what they say on the tin. The first one runs in a loop to generate the model input files; the second one runs models in parallel to generate the results files. I'm currently committing all model input files and results files to the repo (as was the case before), rather than running the scripts to generate the files during the documentation build. This is for two reasons:
I wonder if there's a better way of doing these things that doesn't require all these files to be committed?
There is still a lot of work to be done on the contents of the notebook, as there are still many inconsistencies in the text. This PR is more about having a framework in place to generate the model input files programmatically, and I think I'll tackle the contents of the notebooks separately
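A hedged sketch of the split between the two scripts described above; the function bodies and model names are placeholders, not the actual MUSE code:

```python
from concurrent.futures import ThreadPoolExecutor

def generate_model(name: str) -> str:
    # placeholder: copy the default model and apply the tutorial's edits
    return f"generated {name}"

def run_model(name: str) -> str:
    # placeholder: invoke MUSE on the generated input files
    return f"ran {name}"

models = ["1-introduction", "2-add-technology"]  # hypothetical names

# generation runs sequentially, because later tutorials may build
# on the models produced by earlier ones
for model in models:
    generate_model(model)

# the finished models are independent of each other, so the
# simulations can run concurrently
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_model, models))

print(results)  # → ['ran 1-introduction', 'ran 2-add-technology']
```

This ordering constraint is also why the earlier question about one tutorial reusing another's generated files matters: sequential generation makes that reuse safe.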
Original documentation
New documentation
Fixes #99
Fixes #291
Type of change
Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s.
Key checklist
$ python -m pytest
$ python -m sphinx -b html docs docs/build
Further checks