time intervals coerce isostrings#1326
Conversation
There was a problem hiding this comment.
I think it’s fine to do this for the built-in time intervals, but we shouldn’t do this implicit coercion if someone provides an interval implementation, including one of the D3 time interval implementations. The implicit coercion should be part of the interval implementation, and we can do that for the named time intervals we provide. (The magic of invoking the interval implementation and seeing if it returns a Date is too much, and some intervals may want to have different semantics for input coercion.)
Edit: I mean, I see that we already have “magic” detection for time thresholds in the bin transform. But I think that predates having named time intervals, and it’d probably be better if we didn’t do that, somehow.
|
I had the same hesitation but went the other way:
And, it's well defined in the README: “Time intervals are intervals that are also functions that return a Date instance when called with no arguments.” |
| ]); | ||
|
|
||
| function isoInterval(interval) { | ||
| return Object.assign((t) => interval(t), {...interval, floor: (t) => interval.floor(coerceDate(t))}); |
There was a problem hiding this comment.
This feels awkward to me. Like, isTimeInterval is the way it is because that was a quick and dirty way to test for a D3 time interval. But now we have to recreate that representation—even though we don’t need it—to satisfy the isTimeInterval check. And this is only adding coercion for one of the possible methods (floor) on the time interval, not the main function, not interval.offset and interval.range, etc. I’m not sure what fix I recommend yet.
There was a problem hiding this comment.
What about the other methods, offset, ceil, range?
There was a problem hiding this comment.
We can give them the same treatment, but in practice don't use them on user-provided data. Their input is in all cases already coerced.
(Agree it's awkward!)
6a2c5f3 to
355d869
Compare
There was a problem hiding this comment.
I’d still like to try to find a different way to do this. I think in many cases, like with binning, we’ve already done the coercion by the time the interval is applied, so this coercion is redundant. Would it be possible to instead do the coercion outside the interval, before the interval is applied, when the scale interval is a time interval?
Edit: For what it’s worth, the performance of the bin1m isn’t affected by this branch, at least. So performance doesn’t appear to be an issue. Let me think some more.
| ]); | ||
|
|
||
| function isoInterval(interval) { | ||
| return Object.assign((t) => interval(t), {...interval, floor: (t) => interval.floor(coerceDate(t))}); |
There was a problem hiding this comment.
What about the other methods, offset, ceil, range?
closes #1182 (Note: I considered doing this in d3-time, but it would mean adding d3-time-format as a dependency of d3-time.) Rebase on main Co-authored-by: Mike Bostock <mbostock@gmail.com>
2d64d00 to
0a57694
Compare
| } | ||
|
|
||
| export function isTimeInterval(t) { | ||
| return isInterval(t) && typeof t === "function" && t() instanceof Date; |
There was a problem hiding this comment.
This disagrees with how we’ve defined intervals in TypeScript, where they are objects rather than decorated functions:
Lines 19 to 22 in 820144e
And this means that you can use a {floor, offset, range?} object in Plot, which works, except when we try to do this time interval test.
Also, I don’t think we actually need this restriction, because D3 doesn’t require that time intervals are functions, either. It instead checks for the range method (and then blindly invokes the floor and offset methods):
https://github.com/d3/d3-time/blob/9e8dc940f38f78d7588aad68a54a25b1f0c2d97b/src/ticks.js#L38
https://github.com/d3/d3-scale/blob/2b7db622b1a224d9ea19ff15c4cc8cbb3b25f4a4/src/time.js#L58
I added that duck testing in #572, but I think it would be better to replace that duck testing with something consistent with our new TypeScript declarations and with D3. Probably we should invoke the floor method and see if it returns a Date? But that’s a little messy too, since that method requires an argument, and it’s unclear whether we should pass it a Date or a number… All D3 time intervals support coercion (as does Plot’s internal number interval), but should we expect users to implement coercion in their own intervals? I mean, isn’t the point of this PR is that they don’t? 🤔
|
I’ve surveyed the code base to see where we might need coercion for intervals. The (implicit) interval transform uses intervals in maybeIntervalK and maybeIntervalMidK. Both of these would benefit from an implicitly-coercing interval. (Though it would also be possible to coerce all the values prior to invoking the interval.) plot/src/transforms/interval.js Line 31 in 820144e plot/src/transforms/interval.js Line 50 in 820144e A scale transform can be specified as an interval as described in #1182. This happens in applyScaleTransform when the interval.floor is promoted to a transform. This would benefit from an implicitly-coercing interval, too. Lines 365 to 367 in 820144e The axisMark can also use the scale interval to compute ticks. This would not benefit from an implicitly-coercing interval: the scale should already have coerced values at this point, and we wouldn’t want the interval to use different coercion than the scale. Lines 520 to 521 in 820144e Ordinal scales use intervals to compute the domain. This would not benefit from an implicitly-coercing interval: we would need to do the coercion earlier on the scale’s associated channel values, so that the extent is computed correctly and so that the scale’s imputed domain matches the channel values. Lines 100 to 101 in 820144e The contour mark can use an interval to compute thresholds. The majority of the time the contour mark will be used with numeric data, but it is theoretically possible to compute contours of arbitrary values, including temporal data. So, this might benefit from implicit coercion. However, it again would not work to perform that coercion within the interval; we would need to coerce all the contour values, not just coerce when computing ticks. Line 191 in 820144e The bin mark can likewise use an interval to compute thresholds. It already performs coercion and would not benefit from additional implicit coercion inside the interval. Line 232 in 820144e To summarize the effect of implicitly-coercing intervals:
My recommendation therefore is, instead of implicitly-coercing time intervals, that we coerce values as appropriate prior to using the interval similar to what the bin transform already does. In the case of ordinal scales, we’ll also need to coerce to numbers, not just dates, so that we compute the extent correctly. |
|
I have revised this PR's description with the TODO. |
| if (V1 !== undefined && data === D1) return V1; // memoize | ||
| return (V1 = map(valueof((D1 = data), value), (v) => interval.floor(v))); | ||
| let V = valueof((D1 = data), value); | ||
| if (isTimeInterval(interval)) V = map(V, coerceDate, Float64Array); |
There was a problem hiding this comment.
| if (isTimeInterval(interval)) V = map(V, coerceDate, Float64Array); | |
| if (isTimeInterval(interval)) V = coerceDates(V); |
I think we want the slower coerceDates here. This code using Float64Array is faster, but we can only use it in the bin transform because we know it’s not exposed outside the bin transform. I don’t think that’s true here; we want to pass the interval Date instances, not numbers.
Also, given that we can’t use a typed array here, it will probably be faster to fold the coerceDate call into the interval.floor(v) call before to avoid one additional array copy in map.
| const i = maybeInterval(interval, type); | ||
| if (isTimeInterval(i)) return (channel.value = channel.value.map((d) => i.floor(coerceDate(d)))), undefined; | ||
| transform = i?.floor; | ||
| } |
There was a problem hiding this comment.
I think it would be cleaner to refactor this into a helper like so:
function maybeIntervalTransform(interval, type) {
interval = maybeInterval(interval, type);
return isTimeInterval(interval) ? (d) => interval.floor(coerceDate(d)) : interval?.floor;
}And then replace the default like so:
transform = percent ? (x) => x * 100 : maybeIntervalTransform(interval, type)In other words, I think it’s preferable to consolidate the code paths so that the interval is still promoted to a transform. And also we’d want to use map here instead of channel.value.map because channel.value could be a typed array.
There was a problem hiding this comment.
Also do we want to coerce to numbers for number intervals? Should we further assume that intervals only apply to numbers and dates and not other types? 🤔
The discussion below ends up with the recommendation that if we want to do this, we should coerce isostrings to dates in the channels, rather than wrap the interval object.
todo/done:
closes #1182
(Note: I considered doing this in d3-time, but it would mean adding d3-time-format as a dependency of d3-time.)