@Manishearth
Contributor

Fixes #382

Open to feedback on the API. An alternate way to structure this is to always normalize. That would certainly be cleaner: thoughts?

@Manishearth Manishearth requested a review from nekevss July 17, 2025 23:28
@Manishearth Manishearth force-pushed the normalized-idents branch 2 times, most recently from 38e1c66 to 6ab8a12 Compare July 18, 2025 00:18

let timezone = partial.timezone.unwrap_or_default();

let timezone = timezone.normalize_with_provider(provider)?;
Contributor Author

Perhaps this should instead be in temporal_capi's PartialZonedDateTime ctor.

I think things would be cleaner if TimeZone could only ever be constructed with normalized timezones; however this becomes a much more invasive change. I can still attempt it.

Contributor Author

I don't think there's any point where the time zone isn't expected to be normalized by the spec. And lack of normalization impacts things like equality.
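To illustrate the equality point, here is a minimal sketch. The `TimeZone` struct, the `CANONICAL` table, and the `normalize` helper are all hypothetical stand-ins, not the temporal_rs definitions, and normalization is reduced to ASCII case-folding against a tiny list.

```rust
// Hypothetical stand-in for a time zone type; not the temporal_rs definition.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TimeZone(pub String);

// Toy canonical list; a real provider would consult IANA time zone data.
const CANONICAL: &[&str] = &["America/New_York", "Asia/Kolkata", "UTC"];

/// Returns the canonical spelling of `ident`, or `None` if it is unknown.
/// Assumption: normalization here is only an ASCII case-insensitive match.
pub fn normalize(ident: &str) -> Option<&'static str> {
    CANONICAL
        .iter()
        .copied()
        .find(|c| c.eq_ignore_ascii_case(ident))
}

impl TimeZone {
    /// Stores the identifier exactly as written by the caller.
    pub fn unnormalized(ident: &str) -> Self {
        TimeZone(ident.to_owned())
    }

    /// Stores the canonical spelling, rejecting unknown identifiers.
    pub fn normalized(ident: &str) -> Option<Self> {
        normalize(ident).map(|c| TimeZone(c.to_owned()))
    }
}
```

With this sketch, `TimeZone::unnormalized("america/new_york")` and `TimeZone::unnormalized("America/New_York")` compare unequal under the derived `PartialEq`, while the two `TimeZone::normalized(..)` values compare equal.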

Member

@nekevss nekevss left a comment

I think this is an okay middle ground for now, but I think the change to TimeZoneProvider should replace the currently existing method.

src/provider.rs Outdated
pub trait TimeZoneProvider {
    fn check_identifier(&self, identifier: &str) -> bool;

    fn normalize_identifier(&self, ident: &'_ str) -> TemporalResult<Cow<'_, str>>;
Member

If we are adding this normalization, I think I'd prefer to remove check_identifier and always normalize. If I recall correctly, the specification essentially expects the normalized IANA identifier after lookup, so we should do that.
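A rough sketch of what "always normalize" could look like: one method that both validates and normalizes, with the boolean check derived from it. The `UnknownIdentifier` error, `ToyProvider`, the `KNOWN` list, and the free `check_identifier` function are assumptions for illustration; the real trait returns `TemporalResult` and has other methods not shown here.

```rust
use std::borrow::Cow;

// Hypothetical error type standing in for the crate's own error.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownIdentifier;

// Simplified trait: a single method validates and normalizes in one step.
pub trait TimeZoneProvider {
    fn normalize_identifier<'a>(
        &self,
        ident: &'a str,
    ) -> Result<Cow<'a, str>, UnknownIdentifier>;
}

// Toy provider backed by a fixed list instead of real IANA data.
pub struct ToyProvider;

const KNOWN: &[&str] = &["America/New_York", "Europe/Berlin", "UTC"];

impl TimeZoneProvider for ToyProvider {
    fn normalize_identifier<'a>(
        &self,
        ident: &'a str,
    ) -> Result<Cow<'a, str>, UnknownIdentifier> {
        for known in KNOWN {
            if *known == ident {
                // Already in canonical form: borrow the input, no allocation.
                return Ok(Cow::Borrowed(ident));
            }
            if known.eq_ignore_ascii_case(ident) {
                // Differs only in ASCII case: return an owned canonical copy.
                return Ok(Cow::Owned((*known).to_owned()));
            }
        }
        Err(UnknownIdentifier)
    }
}

/// The old boolean check falls out of normalization, so it does not need
/// to be a separate trait method.
pub fn check_identifier<P: TimeZoneProvider>(provider: &P, ident: &str) -> bool {
    provider.normalize_identifier(ident).is_ok()
}
```

Because the check is derived from the normalizing method, the two cannot disagree about which identifiers are valid.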

Contributor Author

I'll try. The main problem is that this gets pretty pervasive pretty quickly.

Member

Is it that pervasive? I was thinking that normalize_identifier should only really be replacing GetAvailableNamedTimeZoneIdentifier.

Contributor Author

Is there a difference between that and every callsite of TimeZone::try_from... not counting offset callsites?

I'll have to experiment with this. Part of the problem is that TimeZones are constructed on the fly in a lot of different places.

@jedel1043
Member

jedel1043 commented Jul 18, 2025

I think I've mentioned this in the TimeZoneProvider issue, but it also applies here. Would it be that bad if we always used compiled data to normalize? Seems like it could simplify this a lot.

@Manishearth
Contributor Author

Data skew is a problem I think. It's not that big a deal to pass providers down to this code, I don't mind just doing that.

@nekevss
Member

nekevss commented Jul 18, 2025

> Would it be that bad if we always used compiled data to normalize?

I thought about doing this initially, but part of the issue is that this is not just normalizing the identifier. It's also checking that the identifier is in the underlying data. They need to be coupled so that they can't desync.

Once we know the identifier exists, then the other methods / data fetching is infallible, because it must exist.

@jedel1043
Member

> I thought about doing this initially, but part of the issue is that this is not just normalizing the identifier. It's also checking that the identifier is in the underlying data. They need to be coupled so that they can't desync.
>
> Once we know the identifier exists, then the other methods / data fetching is infallible, because it must exist.

Got it. Though, this IMO is also nicely solved by the Rc<TimeZoneData> idea, because you would tie both the normalized identifier and its corresponding data into one immutable struct, making it truly infallible.
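A minimal sketch of the `Rc<TimeZoneData>` idea, to make the coupling concrete. The field layout of `TimeZoneData`, the `Provider` type, and the single fixed offset (a placeholder for real transition data) are assumptions, not anything from the crate.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical data record: the normalized identifier and its data live
// together, so one cannot exist without the other.
#[derive(Debug)]
pub struct TimeZoneData {
    pub identifier: String,
    // Placeholder for real transition data; a single fixed offset here.
    pub utc_offset_seconds: i32,
}

// A time zone is a shared, immutable handle to validated data.
#[derive(Debug, Clone)]
pub struct TimeZone(Rc<TimeZoneData>);

pub struct Provider {
    // Keyed by the ASCII-lowercased identifier for case-insensitive lookup.
    zones: HashMap<String, Rc<TimeZoneData>>,
}

impl Provider {
    pub fn new(entries: &[(&str, i32)]) -> Self {
        let zones = entries
            .iter()
            .map(|(name, offset)| {
                let data = Rc::new(TimeZoneData {
                    identifier: (*name).to_owned(),
                    utc_offset_seconds: *offset,
                });
                (name.to_ascii_lowercase(), data)
            })
            .collect();
        Provider { zones }
    }

    /// The only fallible step: lookup validates and normalizes at once.
    pub fn resolve(&self, ident: &str) -> Option<TimeZone> {
        self.zones
            .get(&ident.to_ascii_lowercase())
            .map(|data| TimeZone(Rc::clone(data)))
    }
}

impl TimeZone {
    /// Infallible: the data was attached when the zone was resolved.
    pub fn identifier(&self) -> &str {
        &self.0.identifier
    }

    /// Infallible for the same reason.
    pub fn utc_offset_seconds(&self) -> i32 {
        self.0.utc_offset_seconds
    }
}
```

Cloning a `TimeZone` in this sketch only bumps a reference count, but building the provider still allocates one `Rc` and one `String` per zone.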

@Manishearth
Contributor Author

Ideally we can avoid all those allocations and just use an IANA ID or something. But yes.
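One way to read the "IANA ID" suggestion is a small `Copy` handle that indexes into the provider's table. This is only a sketch under that assumption; `ZoneId`, `Provider`, and the index-based layout are hypothetical and not taken from the crate.

```rust
// Hypothetical handle: an index into the provider's identifier table.
// It is Copy, so passing it around needs no allocation or ref-counting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ZoneId(usize);

pub struct Provider {
    // Canonical identifiers; a real provider would hold IANA data here.
    names: Vec<&'static str>,
}

impl Provider {
    pub fn new(names: Vec<&'static str>) -> Self {
        Provider { names }
    }

    /// Validates and normalizes in one step, returning a handle on success.
    pub fn resolve(&self, ident: &str) -> Option<ZoneId> {
        self.names
            .iter()
            .position(|name| name.eq_ignore_ascii_case(ident))
            .map(ZoneId)
    }

    /// Looks up the canonical spelling for a handle.
    /// Caveat: the handle must come from this same provider; indexing
    /// panics if it was issued by a provider with a different table.
    pub fn identifier(&self, id: ZoneId) -> &'static str {
        self.names[id.0]
    }
}
```

Equality on `ZoneId` is then an integer comparison, and two differently cased spellings of the same zone resolve to the same handle. The trade-off is that a handle is only meaningful alongside the provider that issued it.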

@Manishearth
Contributor Author

I think I've updated this to unconditionally normalize everywhere.

@Manishearth Manishearth force-pushed the normalized-idents branch 3 times, most recently from e2f615c to da90d6c Compare July 18, 2025 20:28
@Manishearth Manishearth merged commit f4b8484 into boa-dev:main Jul 20, 2025
8 checks passed
@Manishearth Manishearth deleted the normalized-idents branch July 20, 2025 04:27
Development

Successfully merging this pull request may close these issues.

TimeZone parsing doesn't normalize identifiers

3 participants