Skip to content

Make timeout behavior consistent to document#4134

Merged
gianm merged 10 commits intoapache:masterfrom
jihoonson:fix-timeout
Apr 19, 2017
Merged

Make timeout behavior consistent to document#4134
gianm merged 10 commits intoapache:masterfrom
jihoonson:fix-timeout

Conversation

@jihoonson
Copy link
Copy Markdown
Contributor

@jihoonson jihoonson commented Mar 30, 2017

Reported here.

  • The default timeout is 0 which was previously maxIdleTime.
  • 0 timeout means no timeout which previously means no wait.

This change is Reviewable

@fjy fjy added the Bug label Mar 30, 2017
@fjy fjy added this to the 0.10.1 milestone Mar 30, 2017
@jihoonson
Copy link
Copy Markdown
Contributor Author

Travis failure seems not related to this patch. All tests passed in my local machine.

timeoutAt - (System.currentTimeMillis() - start)
);
if (hasTimeout) {
responseContext.put(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI #4112

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for letting me know. This behavior looks weird, so I'd like to fix it in this pr if it doesn't require so large change. Since I don't know exactly how the responseContext is supposed to be used, let me make sure what the correct solution is. Is it ok to just move CTX_TIMEOUT_AT to QueryContext instead of ResponseContext?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not that easy to fix this, because CTX_TIMEOUT_AT and CTX_COUNT are used in some non-obvious way in ScanQueryEngine

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I'll look at after this pr.

final Number queryTimeout = query.getContextValue(QueryContextKeys.TIMEOUT, null);
final long timeoutAt = (queryTimeout == null || queryTimeout.longValue() == 0L)
? JodaUtils.MAX_INSTANT : System.currentTimeMillis() + queryTimeout.longValue();
final long timeoutAt = System.currentTimeMillis() + QueryContexts.getTimeout(query);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here timeoutAt is set to System.currentTimeMillis() if timeout is not specified, is that what you wanted to do?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment in code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Number timeout = query.getContextValue(QueryContextKeys.TIMEOUT);
if (timeout == null) {
final long timeout = QueryContexts.getTimeout(query);
if (timeout == 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QueryContexts.DEFAULT_TIMEOUT? Or add a method QueryContext.isNoTimeout(timeout)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return new MergeIterable<>(
ordering.nullsFirst(),
timeout == null ?
timeout == 0 ?
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

final Number timeout = query.getContextValue(QueryContextKeys.TIMEOUT, (Number) null);
if (timeout == null) {
final long timeout = QueryContexts.getTimeout(query);
if (timeout == 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

public Sequence<T> run(final Query<T> query, Map<String, Object> responseContext)
{
if (BaseQuery.getContextBySegment(query, false)) {
if (QueryContexts.isBySegment(query, false)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isBySegment is always called with false as the default argument, maybe remove the parameter

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

throw new TimeoutException();
}
} else {
mergeBufferHolder = mergeBufferPool.take(-1);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spirit of blocking APIs from JDK to have a method that takes a timeout and timeUnit, and a method without parameters for indefinite blocking, rather than encoding this by negative argument.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

final long timeout = QueryContexts.getTimeout(query);
final ResourceHolder<List<ByteBuffer>> mergeBufferHolders = mergeBufferPool.takeBatch(
requiredMergeBufferNum, timeout.longValue()
requiredMergeBufferNum, timeout
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

takeBatch() javadoc says that "negative means no timeout", here timeout is QueryContext.DEFAULT_TIMEOUT, that is 0.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

final Number timeout = query.getContextValue(QueryContextKeys.TIMEOUT, (Number) null);
return timeout == null ? future.get() : future.get(timeout.longValue(), TimeUnit.MILLISECONDS);
final long timeout = QueryContexts.getTimeout(query);
return timeout == 0 ? future.get() : future.get(timeout, TimeUnit.MILLISECONDS);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

)
);

if (QueryContexts.getTimeout(query) < 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check could be moved into QueryContexts.getTimeout()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved.

*
* @return a resource, or null if the timeout was reached
*/
public ReferenceCountingResourceHolder<T> take(final long timeout)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't want to add a time unit parameter (which in my choice even if the method is always called with the same time unit argument, for readability on the caller side), please rename the parameter to "timeoutMs".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

}
}
);
return wrapObject(timeout > 0 ? pollObject(timeout) : pollObject());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is a dedicated no-timeout method, in this method I suggest to check if timeout is positive and throw an exception otherwise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a check.

queryWatcher.registerQuery(query, future);
final long timeout = QueryContexts.getTimeout(query);
if (timeout == 0) {
if (QueryContexts.isNoTimeout(timeout)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's clearer here:

if (QueryContext.hasTimeout(query)) {
  future.get(QueryContext.getTimeout(query), TimUnit.MILLISECONDS);
} else {
  future.get();
}

This is also true for other places where isNoTimeout() is used, except the one in ChainedExecutionQueryRunner.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about this?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jihoonson please comment on this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry I missed this comment. Yeah, it looks good. I changed including ChainedExecutionQueryRunner. It also looks simple and good.

final Number queryTimeout = query.getContextValue(QueryContextKeys.TIMEOUT, null);
final long timeoutAt = (queryTimeout == null || queryTimeout.longValue() == 0L)
? JodaUtils.MAX_INSTANT : System.currentTimeMillis() + queryTimeout.longValue();
final long timeoutAt = System.currentTimeMillis() + QueryContexts.getTimeout(query);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment in code

timeoutAt - (System.currentTimeMillis() - start)
);
if (hasTimeout) {
responseContext.put(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not that easy to fix this, because CTX_TIMEOUT_AT and CTX_COUNT are used in some non-obvious way in ScanQueryEngine

@jihoonson
Copy link
Copy Markdown
Contributor Author

@leventov, thanks. I've addressed your comments.

* @return a resource, or null if the timeout was reached
*/
public ReferenceCountingResourceHolder<T> take(final long timeout)
public ReferenceCountingResourceHolder<T> take(final long timeoutMs)
Copy link
Copy Markdown
Member

@leventov leventov Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still has non-obvious undocumented corner case behaviour if timeoutMs=0, than it waits indefinitely, rather than not waiting at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment.. If timeoutMs = 0, then take() immediately returns an object or null without waiting at all. Actually, negative timeoutMss have the same meaning.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my oversight

queryWatcher.registerQuery(query, future);
final long timeout = QueryContexts.getTimeout(query);
if (timeout == 0) {
if (QueryContexts.isNoTimeout(timeout)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about this?

Copy link
Copy Markdown
Member

@leventov leventov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now isNoTimeout() seems to be unused

@jihoonson
Copy link
Copy Markdown
Contributor Author

@leventov I first thought that it could be used in the future, so kept it. But, yeah, it's simple and can add again if necessary. I removed now.

@jihoonson
Copy link
Copy Markdown
Contributor Author

Would anyone please review this patch?

@gianm
Copy link
Copy Markdown
Contributor

gianm commented Apr 18, 2017

Taking a look.

Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than comment about how the default query timeout config is dealt with.

queryId = UUID.randomUUID().toString();
query = query.withId(queryId);
}
if (query.getContextValue(QueryContextKeys.TIMEOUT) == null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the default behavior, how about keeping the default behavior the same, but fixing the docs and making it more configurable? So I'm suggesting:

  • Add a druid.server.http.defaultQueryTimeout property and document that in configuration/broker.md and configuration/historical.md. Use that to set the default timeout here, and default that property to PT5M.
  • Update query-context.md to reflect that the default isn't 0, it's the value of druid.server.http.defaultQueryTimeout.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I changed the default query timeout to be configurable.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @jihoonson, I wasn't clear. For backwards compatibility reasons the query context parameter should stay milliseconds. Since druid.server.http.defaultQueryTimeout is new it could either be milliseconds or a period.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I changed both of them to millis.

Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@gianm gianm merged commit 5b69f2e into apache:master Apr 19, 2017

Query<T> withDataSource(DataSource dataSource);

Query<T> withDefaultTimeout(long defaultTimeout);
Copy link
Copy Markdown
Member

@leventov leventov Apr 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with #4131 was that it breaks compatibility with extension queries, doesn't this PR break it already? @gianm @cheddar

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I missed this and we need to fix it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch in #4185.

@jihoonson jihoonson deleted the fix-timeout branch December 5, 2018 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants