Skip to content

Simplify the use of timeslices#519

Merged
tsmbland merged 83 commits intov1.3from
convert_timeslice2
Nov 13, 2024
Merged

Simplify the use of timeslices#519
tsmbland merged 83 commits intov1.3from
convert_timeslice2

Conversation

@tsmbland
Copy link
Collaborator

@tsmbland tsmbland commented Oct 10, 2024

This is a pretty seismic change to the way that timeslices are dealt with, motivated by the broadcasting problems fixed by #518 and #534, and a desire for similar errors to never happen again. I also generally feel that the use of timeslices is wildly over-complicated, error prone and unreadable, so I wanted to leave things in a better place.

Main changes

  • Get rid of the QuantityType class, which, when used with convert_timeslice, controlled whether to broadcast or distribute a quantity over the timeslices. I think it's more readable to have two separate functions, which I've called broadcast_timeslice and distribute_timeslice. (And, as mentioned in convert_timeslice function applying the wrong operation #516, the documentation around QuantityType was wrong anyway)

  • Use the global timeslicing scheme everywhere. The old convert_timeslice function would take a mandatory timeslices argument, which was basically another array with the desired timeslicing scheme that you want your array to match. The reason to do this, rather than just relying on one global timeslicing scheme, was so that things could be flexible enough for different sectors to have different timeslicing schemes (you can see how this would have worked in the deleted sections of the documentation). I'm very aware that this is a potentially useful feature, and I'll discuss this more this below, but for now I've decided to drop this and just use the global timeslicing scheme (represented by the TIMESLICES object) throughout. This means I no longer have to pass any additional arguments to the new broadcast_timeslice and distribute_timeslice functions, which makes things a lot tidier (some objects were being passed around solely so their timeslicing coordinates could be extracted, so I've been able to simplify a few things where this is no longer required), and gives less room for inconsistencies. Where all sector have the same timeslicing scheme (every model I've ever worked with), this has no outcome on the results and the user won't notice that anything has changed.

  • This also drops the timeslice aggregation feature (see deleted part of the documentation), but the way I see it this was just a way to shorten your input files (e.g. specifying all-day for the utilization factor rather than a value for every timeslice). Maybe this will annoy some people (although I haven't seen anyone using this feature), but this doesn't actually remove any functionality, and if we wanted to allow this sort of thing I'd be much more comfortable implementing it at the input layer level.

  • Because I'm using one timeslicing scheme throughout, I've been able to remove a lot of (horrible and unreadable) code designed to switch arrays between different timeslicing schemes.

  • Use an xarray patch to prevent automatic broadcasting over the timeslice dimension. This means that you have to explicitly call broadcast_timeslice or distribute_timeslice when you want to perform an operation combining a timesliced object with a non-timesliced object. I've managed to get this working for the tests as well. Of course, it's still possible for developers to pick the wrong function (e.g. use broadcast_timeslice where distribute_timeslce should be used), but at least we are forced to make that decision rather than allowing xarray to automatically broadcast. NOTE: for some reason the patch only applies when I'm running in debug mode - but it's still very useful

Other small changes

Changes to the timeslice module:

  • Move DEFAULT_TIMESLICE_DESCRIPTION over to the tests, as this is what it's used for, and I wouldn't want this to be accidentally used if a user has just forgotten to specify timeslices in their settings file

  • Removed the represent_hours function as this essentially just returns TIMESLICE

Changes to the readers module:

  • Remove the read_csv_timeslices function. This allowed you to specify timeslices in a csv file, but nobody's doing this

  • Remove read_ts_multiindex and read_timeslices from readers.toml - no longer required

  • Remove check_time_slices. This was doing an important job in setting up the timeslices module (i.e. setting TIMESLICE), but it's better to have this in the read_settings function

Other changes:

  • Simplify the use of timeslice_op. There was an undocumented parameter that the user could specify in the settings file to control whether this function would do a max or a sum operation (possibly a legacy thing?). I've just hardcoded this to do the max operation which was the default and never overwritten (created new function in investments.py)

  • Rewrite capacity_to_service_demand in a more readable way


Ok, in terms of having different timeslicing schemes for different sectors - this is probably quite useful, e.g. if you wanted to your oil sector to balance demands at the month level rather than at the hour level, you could have used a less granular timeslicing scheme for the oil sector (although I haven't tested if this actually works). My plans for this are more like what's been proposed to MUSE2. Rather than specifying a timeslicing scheme for the sector, you'd specify a timeslice_level parameter (which could be "annual", "month", "day", "hour"). This would then be propagated to the Sector object and any Subsector and Agent objects that it owns, where it could then be passed as an argument to every call of broadcast_timeslice and distribute_timeslice to make sure that the resulting array has the desired level of granularity. #550 gives an idea of how this would work, although I still need to code up the maths in the two functions, and deal with the interface between sectors (this may end up restoring some of the deleted code in this PR related to transforms).

For now I'm only proposing to merge this PR into the v1.3 branch, which is still a long way off a final release, so I'd have time to finish #550 (or at least try) before a final release is made. Or happy to wait until #550 is finished before merging this.


Closes #516

@tsmbland tsmbland changed the base branch from develop to refactor October 10, 2024 13:03
tsmbland and others added 11 commits October 25, 2024 16:16
* xarray patch to prevent automatic broadcasting

* Fix most remaining broadcasting bugs

* Fix some tests

* Fix more tests

* Simplify dlc constraint

* More timeslice broadcasting

* Fix incorrect uses of distribute_timeslice

* Fix bug in _inner_split

* Remove unnecessary drop_timeslice operations

* Fix correlation model

* Fix a couple of tests

* Restore drop_timeslice

* Restore more drop_timeslice

* Fix demand_matching tests

* Fix correlation model

* Consistent timeslice dimension in objectives

* Revert change to capacity_in_use

* Fix objective tests

* Fix more tests

* Fix another test

* Fix final test (hopefully)
Base automatically changed from legacy to v1.3 November 5, 2024 17:02
@tsmbland tsmbland marked this pull request as ready for review November 7, 2024 11:49
@tsmbland tsmbland requested a review from dalonsoa November 7, 2024 11:52
Copy link
Collaborator

@dalonsoa dalonsoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked every single line of code, but the explanation of why and how that you give is totally sensible and the code looks way cleaner and less confusing in relation to timeslices. A massive work!

About the caveat, you have already spotted it and suggested a way forward. I think it makes sense to implement that on top of a clean, clear code, that trying to fix the existing messy one.

My only comment is that consistently using a global TIMESLICE object breaks - well, it was already broken, anyway - the roughly functional approach that MUSE was following, with the outputs of a function only depending on the inputs, and therefore explicit in the dependencies. It could be argued that we could use a global variable for the technodata rather than passing it around, as well, which is also constant throughout the simulation, and possibly other datasets. But that sort of global objects not only are generally discouraged, but also can cause a lot of confusion because it becomes unclear where information is coming from in a function. The exception being if such global objects are a database, which is something that has been/is being discussed, as well.

In summary, I'm happy with the changes, but passing a timeslice slice object around - but only the timeslice array, not other arrays for the only purpose of using their timeslices - might not be such a bad idea to keep the code functional as much as possible. But this can be better done over a clean codebase, as you are having now, possibly when/if implementing the option of a per-sector timeslice, as discussed above.

@tsmbland
Copy link
Collaborator Author

Thanks! Yeah I totally get your point about using a functional approach. I was less concerned about doing this because MUSE already deviates from this in many places, but that doesn't mean we shouldn't be doing things right when we can.

In fact, the global TIMESLICE object was always being used to begin with. The objects being passed around to reference their timeslice contained information about which timeslicing arrangement was required in the output, but information about timeslice length (which is required to split the quantity over the timeslices) has always come from TIMESLICE and not explicitly passed to the functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

convert_timeslice function applying the wrong operation

2 participants

Comments