Speed up timestamptz decoding #338

cbandy · 2015-02-10T07:55:53Z

Before:

BenchmarkDecodeTimestamptz       1000000              1196 ns/op              64 B/op          2 allocs/op
BenchmarkDecodeTimestamptzMultiThread    1000000              1223 ns/op              64 B/op          2 allocs/op

After:

BenchmarkDecodeTimestamptz       3000000               473 ns/op              32 B/op          1 allocs/op
BenchmarkDecodeTimestamptzMultiThread    3000000               481 ns/op              32 B/op          1 allocs/op

This expects the input to be correctly formatted from the backend and asserts nothing. I'm not sure how I feel about that... Thoughts?

cbandy · 2015-02-11T05:18:19Z

I've included assertions as they are certainly necessary. This implementation runs in half the time with half as many allocations.

Before this PR:

BenchmarkDecodeTimestamptz       1000000              1202 ns/op              64 B/op          2 allocs/op
BenchmarkDecodeTimestamptzMultiThread    1000000              1191 ns/op              64 B/op          2 allocs/op

Without assertions:

BenchmarkDecodeTimestamptz       3000000               466 ns/op              32 B/op          1 allocs/op
BenchmarkDecodeTimestamptzMultiThread    3000000               484 ns/op              32 B/op          1 allocs/op

After this PR:

BenchmarkDecodeTimestamptz       3000000               519 ns/op              32 B/op          1 allocs/op
BenchmarkDecodeTimestamptzMultiThread    3000000               527 ns/op              32 B/op          1 allocs/op

msakrejda · 2015-02-11T09:42:53Z

This looks good. It's unfortunate that this would lead to two very similar code paths for parsing time, but maybe that's worth it. My original code certainly wasn't optimized for performance. I'll merge this if there are no objections.

(oh, and I checked Travis and it's the same tip / 9.4 issue as described in #334)

johto · 2015-02-11T10:01:36Z

How much can we speed up parseTs() if we avoid the second allocation? Where is it coming from in the first place? It's not obvious from just looking at the code.

I'm a bit worried about having this much code to maintain just for parsing timestamps.

cbandy · 2015-02-11T12:26:36Z

How much can we speed up parseTs()

Since parseTs() is used only for timestamp and date, code there around offset/location can probably be jettisoned.

the second allocation? Where is it coming from in the first place?

Perhaps the conversion to string? This version operates on the byte slice. We also called out to strconv.Atoi, which uses the generic ParseUint.

having this much code to maintain just for parsing timestamps

In general, or because I added a function instead of touching parseTs? I tried using time.Parse more heavily, but it was slower than parseTs.

two very similar code paths for parsing time

Parsing other time-ish types is being debated in #329. I focused on timestamptz since it seemed to be immune to the debate.

cbandy · 2015-02-15T22:56:59Z

I've added one more commit that I think makes this function easier to read.

HEAD is now at 08ad791... Assert expected bytes as they are processed
BenchmarkDecodeTimestamptz       3000000               514 ns/op              32 B/op          1 allocs/op
BenchmarkDecodeTimestamptzMultiThread    3000000               528 ns/op              32 B/op          1 allocs/op

HEAD is now at 013da70... Reduce some duplication when decoding timestamptz
BenchmarkDecodeTimestamptz       3000000               492 ns/op              32 B/op          1 allocs/op
BenchmarkDecodeTimestamptzMultiThread    3000000               521 ns/op              32 B/op          1 allocs/op

johto · 2015-02-17T21:36:38Z

having this much code to maintain just for parsing timestamps

In general, or because I added a function instead of touching parseTs? I tried using time.Parse more heavily, but it was slower than parseTs.

I don't mind not using time.Parse, but I wouldn't want two distinct hand-written routines for parsing time-related data. Naively thinking, we'd have one function which:

Parses a date in ISO format. If the data type is "date", the end of string must be nearby as well.
Parses a time. If the data type is "without time zone", the end of string must be there.
Parses the UTC offset information. This must be the end of string.

If we can make that function fast while supporting all three types, things ought to be pretty good.

cbandy · 2015-02-17T23:36:38Z

I wouldn't want two distinct hand-written routines for parsing time-related data.

I agree. This PR should probably wait for a resolution to #329.

cbandy added 2 commits February 10, 2015 07:40

Function to decode timestamptz to time.Time

fbb87a3

Assert expected bytes as they are processed

08ad791

Reduce some duplication when decoding timestamptz

013da70

arp242 added enhancement needs-work labels Jan 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Speed up timestamptz decoding #338

Speed up timestamptz decoding #338

Uh oh!

cbandy commented Feb 10, 2015

Uh oh!

cbandy commented Feb 11, 2015

Uh oh!

msakrejda commented Feb 11, 2015

Uh oh!

johto commented Feb 11, 2015

Uh oh!

cbandy commented Feb 11, 2015

Uh oh!

cbandy commented Feb 15, 2015

Uh oh!

johto commented Feb 17, 2015

Uh oh!

cbandy commented Feb 17, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Speed up timestamptz decoding #338

Are you sure you want to change the base?

Speed up timestamptz decoding #338

Uh oh!

Conversation

cbandy commented Feb 10, 2015

Uh oh!

cbandy commented Feb 11, 2015

Uh oh!

msakrejda commented Feb 11, 2015

Uh oh!

johto commented Feb 11, 2015

Uh oh!

cbandy commented Feb 11, 2015

Uh oh!

cbandy commented Feb 15, 2015

Uh oh!

johto commented Feb 17, 2015

Uh oh!

cbandy commented Feb 17, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants