Open
Labels
excalibur: Needs discussion by the excalibur team
Description
In working through the implications of implementing means in chunks, it is notable that once missing data is in play, the reduce_chunk method needs to return two numbers, the sum and the count, because the mean over chunks has to be weighted by the actual number of values contributing in each chunk.
There are a number of ways we could implement this:
- Always return (X, N), where X is the result of the expected operation, and N the number of values contributing
- Only return (X, N) when required (e.g. for means), otherwise return (X, None) or (X,)
- Return X, except when it needs to be (X, N)
- Something else.
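To make the weighting issue concrete, here is a minimal sketch of the first option (always returning (X, N)). The helper names `reduce_chunk_sum` and `combine_mean`, and the use of NaN to mark missing values, are assumptions for illustration only and are not taken from the existing code.

```python
import math

def reduce_chunk_sum(chunk):
    """Hypothetical per-chunk reduction: return (sum, count), skipping NaNs."""
    valid = [v for v in chunk if not math.isnan(v)]
    return sum(valid), len(valid)

def combine_mean(partials):
    """Combine (sum, count) pairs into one mean weighted by the counts."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else math.nan

# Two chunks of equal size, but the second has two missing values.
chunks = [[1.0, 2.0, 3.0], [4.0, math.nan, math.nan]]
partials = [reduce_chunk_sum(c) for c in chunks]

# Weighted by the number of values actually present in each chunk.
weighted_mean = combine_mean(partials)

# Unweighted mean of the per-chunk means, shown only for comparison.
naive_mean = sum(s / n for s, n in partials) / len(partials)
```

With this data the weighted result is 2.5, while the unweighted mean of chunk means is 3.0, which is why the count has to travel with the sum.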
The "something else" option could be slightly more interesting: would it be a smart idea to allow chaining a series of methods and getting back a series of results, as a lightweight sort of caching?
Obvious use cases would be:
- mean = sum, count
- range = min, max
- sqmean = sum(squares), sum, count
This could be facilitated by handing over not just a single method but a list of one or more methods, and expecting back a list of one or more results.
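One way the list-of-methods idea might look is sketched below. The `REDUCTIONS` registry, the method names, and this `reduce_chunk` signature are all hypothetical, and NaN is again assumed to mark missing values; the point is only that the chunk is read and filtered once and then reused for every requested method.

```python
import math

# Hypothetical registry mapping method names to per-chunk reductions.
REDUCTIONS = {
    "sum": sum,
    "count": len,
    "min": min,
    "max": max,
    "sumsq": lambda vals: sum(v * v for v in vals),
}

def reduce_chunk(chunk, methods):
    """Apply each named method to the chunk; return results in the same order.

    Missing values are dropped once, up front. Note that "min" and "max"
    raise ValueError if no valid values remain in the chunk.
    """
    valid = [v for v in chunk if not math.isnan(v)]
    return [REDUCTIONS[m](valid) for m in methods]

chunk = [1.0, math.nan, 3.0]

# mean = sum, count
total, count = reduce_chunk(chunk, ["sum", "count"])

# range = min, max
lo, hi = reduce_chunk(chunk, ["min", "max"])

# sqmean = sum(squares), sum, count
sumsq, total2, count2 = reduce_chunk(chunk, ["sumsq", "sum", "count"])
```

A single-method call is then just the one-element case, e.g. `reduce_chunk(chunk, ["sum"])`, so callers always unpack a list regardless of how many results they asked for.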