Skip to content
This repository was archived by the owner on Aug 19, 2019. It is now read-only.

Conversation

@igorpeshansky
Copy link
Contributor

This addresses cases when fractional seconds aren't present, or when fractional seconds don't have exactly 9 digits.

@bmoyles0117
Copy link
Contributor

What do you think about failing over to a secondary parser instead? If we can parse with nanoseconds, great, otherwise failover to without nanoseconds, or vice versa. It'd be nice to stick with the strptime utility instead of custom logic.

@igorpeshansky
Copy link
Contributor Author

There's no secondary parser — strptime does not support fractional seconds.
We could exclude seconds from the strptime format and then parse seconds and nanoseconds together as a double using strtod, which would avoid the whole length computation, but then we'd have to worry about funky formats like hex and floating point values like "+inf", so the tradeoff is not obvious. I could try it and see, I suppose...

@igorpeshansky igorpeshansky force-pushed the igorp-fix-timestamp-parsing branch 4 times, most recently from 4fcfa4c to ac5966e Compare March 4, 2018 05:48
@igorpeshansky
Copy link
Contributor Author

I've tried it, and it would be fairly hard to detect cases like "2018-03-03T01:23:00045.6789" (i.e., 5 digits for the seconds) with the new approach (see the failing test). I'll leave the commit around in case you want to look at it, but I would lean towards removing it and sticking with the original implementation for now.

Copy link
Contributor

@supriyagarg supriyagarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original implementation LGTM. Minor comment.

src/time.cc Outdated
if (length > 9) {
// More digits than can be stored as nanoseconds.
// TODO: Should this round (std::lround)?
ns /= Exp10(length - 9);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

round the value, rather than take floor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require changing the test as well. Which semantics do we want?

@igorpeshansky igorpeshansky force-pushed the igorp-fix-timestamp-parsing branch from 3a267c6 to ac3797d Compare March 6, 2018 04:34
Copy link
Contributor Author

@igorpeshansky igorpeshansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out it wasn't that hard to catch the cases of more than 2 digits in seconds. So the new implementation is viable as well. Would really like your opinion on which one is better/more readable. PTAL.

src/time.cc Outdated
if (length > 9) {
// More digits than can be stored as nanoseconds.
// TODO: Should this round (std::lround)?
ns /= Exp10(length - 9);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require changing the test as well. Which semantics do we want?

src/time.cc Outdated
return std::chrono::system_clock::time_point();
}
tm.tm_sec = sec_i;
long ns = std::lround((seconds - sec_i) * 10000000000) / 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new implementation looks good - much simpler.
Please add a test for the rounding logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually truncation logic. There just isn't an std::ltrunc, so I use std::lround and then truncate the last int digit. Added a comment.

Copy link
Contributor

@bmoyles0117 bmoyles0117 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one comment / concern that may not be valid.

src/time.cc Outdated
if (zone <= end + 1 || *zone != 'Z' || *(zone+1) != '\0') {
// TODO
double seconds = std::strtod(end, &zone);
if (sec_i < 0 || sec_i != static_cast<long>(seconds)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor note, I couldn't find a correlating unit test to this conditional statement. Every time there was a strange value in the place of the seconds, it was longer than 2 characters, that would have been caught by the previous condition. Might be overlooking the unit test that covers this case, just wanted to make a note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. A negative value would be caught by the earlier isdigit check, so the first test is useless.
The second test was intended to catch a double in the scientific notation. However, adding e+0 to an integer produces a double that is equal to that integer. I've added a test, but if fails with the new implementation. Unfortunately, there are way too many valid formats that strtod accepts, so it seems like I'm blacklisting them one-by-one, rather than whitelisting the one format I know I'll want.
The new test passes with the original implementation, so let's just go with that one and clean it up later. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Managed to get it to work. Back to the question of which one's more readable.

@igorpeshansky igorpeshansky force-pushed the igorp-fix-timestamp-parsing branch from 288d579 to fa90d5a Compare March 7, 2018 23:31
Copy link
Contributor Author

@igorpeshansky igorpeshansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews. I'm still struggling with the new implementation. I'll take one more stab at it later today, but we can always fall back to the original one.

src/time.cc Outdated
if (zone <= end + 1 || *zone != 'Z' || *(zone+1) != '\0') {
// TODO
double seconds = std::strtod(end, &zone);
if (sec_i < 0 || sec_i != static_cast<long>(seconds)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. A negative value would be caught by the earlier isdigit check, so the first test is useless.
The second test was intended to catch a double in the scientific notation. However, adding e+0 to an integer produces a double that is equal to that integer. I've added a test, but if fails with the new implementation. Unfortunately, there are way too many valid formats that strtod accepts, so it seems like I'm blacklisting them one-by-one, rather than whitelisting the one format I know I'll want.
The new test passes with the original implementation, so let's just go with that one and clean it up later. WDYT?

src/time.cc Outdated
return std::chrono::system_clock::time_point();
}
tm.tm_sec = sec_i;
long ns = std::lround((seconds - sec_i) * 10000000000) / 10;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually truncation logic. There just isn't an std::ltrunc, so I use std::lround and then truncate the last int digit. Added a comment.

Copy link
Contributor Author

@igorpeshansky igorpeshansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new implementation now passes the tests again. Let's figure out which one's more readable and use that as the baseline -- we can tweak it going forward.

src/time.cc Outdated
if (zone <= end + 1 || *zone != 'Z' || *(zone+1) != '\0') {
// TODO
double seconds = std::strtod(end, &zone);
if (sec_i < 0 || sec_i != static_cast<long>(seconds)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Managed to get it to work. Back to the question of which one's more readable.

@igorpeshansky
Copy link
Contributor Author

I've updated the new implementation to actually use the full time spec as well -- my one concern with the earlier implementation was that we cut off the seconds and parse them separately, which is now fixed. @bmoyles0117 @supriyagarg PTAL.

Copy link
Contributor

@supriyagarg supriyagarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - this looks much better! The commit title was a bit confusing, but using the full time spec sounds good.

(void)std::strtol(end + 1, &zone, 10);
if (zone <= end + 1) {
// TODO: Missing nanoseconds.
return std::chrono::system_clock::time_point();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not safe to assume 0 for nanoseconds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on whether we treat a trailing decimal point with no digits after as valid input or an error. As I read the spec, a decimal point must be followed by at least one digit (search for time-fraction).

// TODO: Internal error.
return std::chrono::system_clock::time_point();
}
static_assert(sizeof(long) == 8, "long is too small");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example with a 32-bit compiler. I'd rather fail compilation if someone ever tries to build for Amazon's 32-bit AMI or Windows than silently overflow.

Copy link
Contributor Author

@igorpeshansky igorpeshansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL.

(void)std::strtol(end + 1, &zone, 10);
if (zone <= end + 1) {
// TODO: Missing nanoseconds.
return std::chrono::system_clock::time_point();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on whether we treat a trailing decimal point with no digits after as valid input or an error. As I read the spec, a decimal point must be followed by at least one digit (search for time-fraction).

// TODO: Internal error.
return std::chrono::system_clock::time_point();
}
static_assert(sizeof(long) == 8, "long is too small");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example with a 32-bit compiler. I'd rather fail compilation if someone ever tries to build for Amazon's 32-bit AMI or Windows than silently overflow.

Copy link
Contributor

@bmoyles0117 bmoyles0117 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@igorpeshansky igorpeshansky force-pushed the igorp-fix-timestamp-parsing branch from bd8316f to e70bb6c Compare March 13, 2018 15:04
@igorpeshansky
Copy link
Contributor Author

Rebased off master. Merging.

@igorpeshansky igorpeshansky merged commit 1e29f74 into master Mar 13, 2018
@igorpeshansky igorpeshansky deleted the igorp-fix-timestamp-parsing branch March 13, 2018 15:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants