MINOR: Add Timer to simplify timeout bookkeeping #5087

Merged
hachikuji merged 6 commits into apache:trunk from hachikuji:minor-add-common-timer
Aug 4, 2018

@hachikuji
Contributor

This is an attempt to find a better pattern for blocking methods with a timeout. We currently do a lot of bookkeeping for timeouts which is both error-prone and distracting.
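As a rough illustration of the bookkeeping being described, the sketch below contrasts a hand-rolled timeout loop with a loop driven by a small timer object. `SketchTimer`, `BlockingCallSketch`, and the injected `LongSupplier` clock are hypothetical names invented for this example; they are not the classes added by this PR.

```java
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

/** Hypothetical, simplified timer for illustration; not the class added by this PR. */
final class SketchTimer {
    private final LongSupplier clockMs; // injected clock, e.g. System::currentTimeMillis
    private final long deadlineMs;
    private long currentTimeMs;         // cached so that checks do not hit the clock

    SketchTimer(LongSupplier clockMs, long timeoutMs) {
        this.clockMs = clockMs;
        this.currentTimeMs = clockMs.getAsLong();
        long deadline = currentTimeMs + timeoutMs;
        // Saturate on overflow; assumes a non-negative millisecond clock.
        this.deadlineMs = deadline < 0 ? Long.MAX_VALUE : deadline;
    }

    /** Re-read the clock once; later checks reuse the cached value. */
    void update() {
        currentTimeMs = Math.max(currentTimeMs, clockMs.getAsLong());
    }

    boolean isExpired() {
        return currentTimeMs >= deadlineMs;
    }

    long remainingMs() {
        return Math.max(0L, deadlineMs - currentTimeMs);
    }
}

final class BlockingCallSketch {
    /** Manual bookkeeping: start, elapsed, and remaining time are tracked by hand. */
    static int attemptsManual(LongSupplier clockMs, long timeoutMs, LongConsumer step) {
        long startMs = clockMs.getAsLong();
        long elapsedMs = 0L;
        int attempts = 0;
        while (elapsedMs < timeoutMs) {
            step.accept(timeoutMs - elapsedMs); // blocking step gets the remaining time
            attempts++;
            elapsedMs = clockMs.getAsLong() - startMs;
        }
        return attempts;
    }

    /** Same loop, with the arithmetic moved into the timer. */
    static int attemptsWithTimer(LongSupplier clockMs, long timeoutMs, LongConsumer step) {
        SketchTimer timer = new SketchTimer(clockMs, timeoutMs);
        int attempts = 0;
        while (!timer.isExpired()) {
            step.accept(timer.remainingMs());
            attempts++;
            timer.update();
        }
        return attempts;
    }
}
```

Both loops read the clock once per iteration; the difference is only in where the start/elapsed/remaining arithmetic lives.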

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

hachikuji force-pushed the minor-add-common-timer branch 2 times, most recently from d9a68fc to 627d418 on May 28, 2018 07:04
hachikuji changed the title from "MINOR: [WIP] Add Timer to simplify timeout bookkeeping" to "MINOR: Add Timer to simplify timeout bookkeeping" on May 28, 2018
@guozhangwang
Contributor

retest this please

@hachikuji
Contributor Author

One of the tests is failing and causing the controller-event-thread to be left around. I will investigate.

Member

Always verifying that your reviewers are paying attention. ;)

Contributor Author

Just keeping you on your toes 😉.

Contributor Author

oops. This should still be volatile.

@hachikuji
Contributor Author

retest this please

hachikuji force-pushed the minor-add-common-timer branch 2 times, most recently from 643f378 to 6688004 on May 31, 2018 07:53
@hachikuji
Contributor Author

retest this please

@lindong28
Member

@hachikuji This is a very nice improvement! If needed, I can help with the first round of review once it is ready.

Member

ijuma left a comment

Thanks for the PR. A few initial comments/questions before doing a more thorough review.

Member

Maybe we can add a default method for hiResClockMs as well.

Member

Also, is this method worth it? It seems like it saves us new and a comma.

time.timer(5000)
new Timer(time, 5000)

Contributor Author

Ack on the first point. For the second, I had initially expected it would be useful to be able to override the method in testing to provide an alternative implementation. I haven't needed it yet, but I'm slightly inclined to keep the option open and avoid the direct dependence on the Timer implementation. I also thought it might make Timer a bit more discoverable since most are familiar with the Time interface.
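A minimal sketch of the trade-off discussed here, assuming a clock interface with a default factory method. `ClockSketch`, `DeadlineSketch`, and `ExpiredClockSketch` are hypothetical stand-ins and do not mirror the real `Time` or `Timer` signatures.

```java
/** Hypothetical stand-in for the Time interface; names are assumptions. */
interface ClockSketch {
    long milliseconds();

    /** Factory method, so callers never name the concrete timer class. */
    default DeadlineSketch timer(long timeoutMs) {
        return new DeadlineSketch(this, timeoutMs);
    }
}

/** Hypothetical minimal timer: expired once the clock reaches the deadline. */
class DeadlineSketch {
    private final ClockSketch clock;
    private final long deadlineMs;

    DeadlineSketch(ClockSketch clock, long timeoutMs) {
        this.clock = clock;
        this.deadlineMs = clock.milliseconds() + timeoutMs;
    }

    boolean isExpired() {
        return clock.milliseconds() >= deadlineMs;
    }
}

/** Test clock that overrides the factory to hand out already-expired timers. */
final class ExpiredClockSketch implements ClockSketch {
    @Override
    public long milliseconds() {
        return 0L;
    }

    @Override
    public DeadlineSketch timer(long timeoutMs) {
        return new DeadlineSketch(this, 0L); // ignore the requested timeout
    }
}
```

Code written against `clock.timer(5000)` picks up the override without changes, whereas code that calls the constructor directly is tied to one implementation.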

Member

Shall we add javadoc to both of these methods?

Member

Should we have a method that can be used without negation? isValid(), isActive(), something else?

Contributor Author

We could. I had considered adding a notExpired method since words like "active" and "valid" are a little ambiguous.

Contributor

+1

Member

It would be better to build something like this on top of time.nanoseconds() since it is monotonic. The downside is that it can't be used in places where a real date/time is expected. Would the latter be an issue?

Contributor Author

At the moment, I've only updated the consumer code to use Timer. Lower-level code like NetworkClient still has APIs which depend on the current time in milliseconds. In a future patch, I think we can change this code to use Timer as well. The ultimate goal would be to have a Timer interface which only exposes durations and does not depend directly on wall-clock time. Then we could safely use nanoseconds internally.
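One way to picture the "durations only" goal is a timer that never exposes an absolute timestamp, which leaves it free to use a monotonic nanosecond source internally. The sketch below is an assumption-laden illustration: `NanoTimerSketch` and its methods are hypothetical, and only `System.nanoTime()` and `java.time.Duration` are real JDK APIs.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical duration-only timer backed by a monotonic nanosecond clock. */
final class NanoTimerSketch {
    private final LongSupplier nanoClock;
    private final long startNanos;
    private final long timeoutNanos;

    NanoTimerSketch(LongSupplier nanoClock, Duration timeout) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
        this.timeoutNanos = timeout.toNanos();
    }

    /** Convenience factory using the JVM's monotonic clock. */
    static NanoTimerSketch system(Duration timeout) {
        return new NanoTimerSketch(System::nanoTime, timeout);
    }

    /** Time left before expiry; never negative and never an absolute timestamp. */
    Duration remaining() {
        // Subtracting two readings is safe even if the raw values are negative.
        long elapsedNanos = nanoClock.getAsLong() - startNanos;
        return Duration.ofNanos(Math.max(0L, timeoutNanos - elapsedNanos));
    }

    boolean isExpired() {
        return remaining().isZero();
    }
}
```

Because callers only ever see `Duration` values, nothing in this API can be mistaken for a wall-clock date, which is the property the comment above is after.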

Contributor

time.nanoseconds() may be more costly than milliseconds on some OSes, though.

Member

Is this based on real data or hypothetical? :) I shared a link somewhere that shows that this is not true for environments we care about.

Contributor

guozhangwang commented Aug 3, 2018

Well, I did not do any experiments myself, so I'd say it's hypothetical :P It's mainly from some articles I've read before.

Member

Contributor

@ijuma thanks!

Member

I started reviewing the code and another high-level question occurred to me: should the Timer accept Duration instances?

Contributor Author

Yeah, we can do that. I thought about it at one point, but everything was already so millisecond-centric.

Contributor

A meta comment: this may be related to #5183. We should let the contributor know about this and whether that PR can be refactored or dropped because of it.

@ijuma
Member

ijuma commented Jun 29, 2018

Sorry, can you please fix the merge conflicts?

Member

Seems like we need to update the comment above. Also, why are we replacing the two calls by a single call everywhere in this test?

Contributor Author

I removed them because we could get the same effect with a single call and a non-zero timeout. Ack on removing the comment.

hachikuji force-pushed the minor-add-common-timer branch from cf9919e to 1913326 on June 29, 2018 16:33
Contributor

guozhangwang left a comment

Made a pass over non-testing code.

Could we trigger a streams simple benchmark (/tests/kafkatest/benchmarks/streams) just to validate that there are no perf regressions?

Contributor

Thank you Java8 :)

Contributor

It is still preserved, right?

Contributor Author

Let me clarify the comment. I was trying to say that the argument is ignored if it is not a monotonic update.
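The behaviour described in this reply, where a caller-supplied time is ignored unless it moves the timer forward, could look roughly like the following. `MonotonicUpdateSketch` is a hypothetical class written for this example, not the code under review.

```java
/** Hypothetical timer whose cached time only ever moves forward. */
final class MonotonicUpdateSketch {
    private final long deadlineMs;
    private long currentTimeMs;

    MonotonicUpdateSketch(long startMs, long timeoutMs) {
        this.currentTimeMs = startMs;
        this.deadlineMs = startMs + timeoutMs;
    }

    /**
     * Accept a time the caller already has (saving a clock read), but ignore it
     * if it is older than the value the timer has already seen.
     */
    void update(long nowMs) {
        currentTimeMs = Math.max(currentTimeMs, nowMs);
    }

    long currentTimeMs() {
        return currentTimeMs;
    }

    long remainingMs() {
        return Math.max(0L, deadlineMs - currentTimeMs);
    }
}
```

With this shape, a stale `now` passed down from an outer call cannot make the remaining time grow again.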

Contributor

Ah, okay :)

Contributor

This is a bit overkill, but I agree it is cleaner.

Contributor

Why change the if to a while, and also add a client.poll()?

Contributor Author

I think the point of this check was to allow some time for pending async commits to return, but the previous code seemed a little bizarre. What was the point of ensuring the coordinator is ready and then immediately closing? It made more sense to turn this into a loop and call poll so that we could give the OffsetCommit responses a chance to be delivered.
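The reasoning above can be sketched as a loop that keeps polling until either no commits are pending or the close timeout runs out. The `Client` interface, `hasPendingCommits()`, and `awaitPendingCommits` are hypothetical names for this illustration and are not the consumer's actual internals.

```java
import java.util.function.LongSupplier;

final class CloseLoopSketch {
    /** Hypothetical stand-in for a network client whose poll() delivers responses. */
    interface Client {
        void poll(long timeoutMs);

        boolean hasPendingCommits();
    }

    /**
     * Poll until all pending commits have completed or the timeout elapses.
     * Returns true if nothing is pending when the method returns.
     */
    static boolean awaitPendingCommits(Client client, LongSupplier clockMs, long timeoutMs) {
        long deadlineMs = clockMs.getAsLong() + timeoutMs;
        while (client.hasPendingCommits()) {
            long remainingMs = deadlineMs - clockMs.getAsLong();
            if (remainingMs <= 0) {
                return false; // out of time; some commits never got a response
            }
            client.poll(remainingMs); // gives in-flight responses a chance to arrive
        }
        return true;
    }
}
```

A single `if` check followed by an immediate close would skip the `poll` calls, so responses already on the wire would have no opportunity to be processed.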

Contributor

makes sense.

Contributor

I'm a bit concerned about the perf implication here: we now make an additional time.milliseconds() call on every consumer.poll(). Could we still pass in now and add a reset function that accepts the current time, like the overloaded update?

Contributor Author

I think we're probably still ahead in terms of total calls to time.milliseconds, but I can try to remove this one also.

Contributor

Nice catch :)

Contributor

Why move this map into the while loop?

Contributor Author

Good catch.

hachikuji force-pushed the minor-add-common-timer branch from 1913326 to e6dca32 on August 3, 2018 16:48
@guozhangwang
Contributor

@hachikuji The updates LGTM. Please feel free to merge after Jenkins passes.

@guozhangwang
Contributor

Triggered https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1891 for simple streams benchmark.

hachikuji force-pushed the minor-add-common-timer branch from c4f2b58 to 0ef4698 on August 3, 2018 20:47
hachikuji merged commit fc5f6b0 into apache:trunk on Aug 4, 2018
if (!parts.isEmpty())
    return parts;

Timer timer = time.timer(requestTimeoutMs);
Contributor

@hachikuji I found an issue that was overlooked during the review: it should be timeout here, not requestTimeoutMs.

smccauliff pushed a commit to smccauliff/kafka that referenced this pull request Apr 24, 2019
MINOR: Add Timer to simplify timeout bookkeeping and use it in the consumer (apache#5087)

We currently do a lot of bookkeeping for timeouts which is both error-prone and distracting. This patch adds a new `Timer` class to simplify this logic and control unnecessary calls to system time. In particular, this helps with nested timeout operations. The consumer has been updated to use the new class.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
smccauliff added a commit to linkedin/kafka that referenced this pull request Apr 24, 2019
* cherry pick of: fc5f6b0

MINOR: Add Timer to simplify timeout bookkeeping and use it in the consumer (apache#5087)

We currently do a lot of bookkeeping for timeouts which is both error-prone and distracting. This patch adds a new `Timer` class to simplify this logic and control unnecessary calls to system time. In particular, this helps with nested timeout operations. The consumer has been updated to use the new class.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>

* Restore sensor functionality that was removed from patch.