Conversation
* xarray patch to prevent automatic broadcasting * Fix most remaining broadcasting bugs * Fix some tests * Fix more tests * Simplify dlc constraint * More timeslice broadcasting * Fix incorrect uses of distribute_timeslice * Fix bug in _inner_split * Remove unnecessary drop_timeslice operations * Fix correlation model * Fix a couple of tests * Restore drop_timeslice * Restore more drop_timeslice * Fix demand_matching tests * Fix correlation model * Consistent timeslice dimension in objectives * Revert change to capacity_in_use * Fix objective tests * Fix more tests * Fix another test * Fix final test (hopefully)
dalonsoa
left a comment
There was a problem hiding this comment.
I haven't checked every single line of code, but the explanation of why and how that you give is totally sensible and the code looks way cleaner and less confusing in relation to timeslices. A massive work!
About the caveat, you have already spotted it and suggested a way forward. I think it makes sense to implement that on top of a clean, clear code, that trying to fix the existing messy one.
My only comment is that consistently using a global TIMESLICE object breaks - well, it was already broken, anyway - the roughly functional approach that MUSE was following, with the outputs of a function only depending on the inputs, and therefore explicit in the dependencies. It could be argued that we could use a global variable for the technodata rather than passing it around, as well, which is also constant throughout the simulation, and possibly other datasets. But that sort of global objects not only are generally discouraged, but also can cause a lot of confusion because it becomes unclear where information is coming from in a function. The exception being if such global objects are a database, which is something that has been/is being discussed, as well.
In summary, I'm happy with the changes, but passing a timeslice slice object around - but only the timeslice array, not other arrays for the only purpose of using their timeslices - might not be such a bad idea to keep the code functional as much as possible. But this can be better done over a clean codebase, as you are having now, possibly when/if implementing the option of a per-sector timeslice, as discussed above.
|
Thanks! Yeah I totally get your point about using a functional approach. I was less concerned about doing this because MUSE already deviates from this in many places, but that doesn't mean we shouldn't be doing things right when we can. In fact, the global |
This is a pretty seismic change to the way that timeslices are dealt with, motivated by the broadcasting problems fixed by #518 and #534, and a desire for similar errors to never happen again. I also generally feel that the use of timeslices is wildly over-complicated, error prone and unreadable, so I wanted to leave things in a better place.
Main changes
Get rid of the
QuantityTypeclass, which, when used withconvert_timeslice, controlled whether to broadcast or distribute a quantity over the timeslices. I think it's more readable to have two separate functions, which I've calledbroadcast_timesliceanddistribute_timeslice. (And, as mentioned inconvert_timeslicefunction applying the wrong operation #516, the documentation aroundQuantityTypewas wrong anyway)Use the global timeslicing scheme everywhere. The old
convert_timeslicefunction would take a mandatorytimeslicesargument, which was basically another array with the desired timeslicing scheme that you want your array to match. The reason to do this, rather than just relying on one global timeslicing scheme, was so that things could be flexible enough for different sectors to have different timeslicing schemes (you can see how this would have worked in the deleted sections of the documentation). I'm very aware that this is a potentially useful feature, and I'll discuss this more this below, but for now I've decided to drop this and just use the global timeslicing scheme (represented by theTIMESLICESobject) throughout. This means I no longer have to pass any additional arguments to the newbroadcast_timesliceanddistribute_timeslicefunctions, which makes things a lot tidier (some objects were being passed around solely so their timeslicing coordinates could be extracted, so I've been able to simplify a few things where this is no longer required), and gives less room for inconsistencies. Where all sector have the same timeslicing scheme (every model I've ever worked with), this has no outcome on the results and the user won't notice that anything has changed.This also drops the timeslice aggregation feature (see deleted part of the documentation), but the way I see it this was just a way to shorten your input files (e.g. specifying
all-dayfor the utilization factor rather than a value for every timeslice). Maybe this will annoy some people (although I haven't seen anyone using this feature), but this doesn't actually remove any functionality, and if we wanted to allow this sort of thing I'd be much more comfortable implementing it at the input layer level.Because I'm using one timeslicing scheme throughout, I've been able to remove a lot of (horrible and unreadable) code designed to switch arrays between different timeslicing schemes.
Use an xarray patch to prevent automatic broadcasting over the timeslice dimension. This means that you have to explicitly call
broadcast_timesliceordistribute_timeslicewhen you want to perform an operation combining a timesliced object with a non-timesliced object. I've managed to get this working for the tests as well. Of course, it's still possible for developers to pick the wrong function (e.g. usebroadcast_timeslicewheredistribute_timeslceshould be used), but at least we are forced to make that decision rather than allowing xarray to automatically broadcast. NOTE: for some reason the patch only applies when I'm running in debug mode - but it's still very usefulOther small changes
Changes to the
timeslicemodule:Move
DEFAULT_TIMESLICE_DESCRIPTIONover to the tests, as this is what it's used for, and I wouldn't want this to be accidentally used if a user has just forgotten to specify timeslices in their settings fileRemoved the
represent_hoursfunction as this essentially just returnsTIMESLICEChanges to the
readersmodule:Remove the
read_csv_timeslicesfunction. This allowed you to specify timeslices in a csv file, but nobody's doing thisRemove
read_ts_multiindexandread_timeslicesfromreaders.toml- no longer requiredRemove
check_time_slices. This was doing an important job in setting up the timeslices module (i.e. setting TIMESLICE), but it's better to have this in theread_settingsfunctionOther changes:
Simplify the use of
timeslice_op. There was an undocumented parameter that the user could specify in the settings file to control whether this function would do a max or a sum operation (possibly a legacy thing?). I've just hardcoded this to do the max operation which was the default and never overwritten (created new function ininvestments.py)Rewrite
capacity_to_service_demandin a more readable wayOk, in terms of having different timeslicing schemes for different sectors - this is probably quite useful, e.g. if you wanted to your oil sector to balance demands at the month level rather than at the hour level, you could have used a less granular timeslicing scheme for the oil sector (although I haven't tested if this actually works). My plans for this are more like what's been proposed to MUSE2. Rather than specifying a timeslicing scheme for the sector, you'd specify a
timeslice_levelparameter (which could be "annual", "month", "day", "hour"). This would then be propagated to theSectorobject and anySubsectorandAgentobjects that it owns, where it could then be passed as an argument to every call ofbroadcast_timesliceanddistribute_timesliceto make sure that the resulting array has the desired level of granularity. #550 gives an idea of how this would work, although I still need to code up the maths in the two functions, and deal with the interface between sectors (this may end up restoring some of the deleted code in this PR related to transforms).For now I'm only proposing to merge this PR into the v1.3 branch, which is still a long way off a final release, so I'd have time to finish #550 (or at least try) before a final release is made. Or happy to wait until #550 is finished before merging this.
Closes #516