Skip to content

add manual laning strategy, integration test#9492

Merged
clintropolis merged 10 commits intoapache:masterfrom
clintropolis:manual-query-laning
Mar 14, 2020
Merged

add manual laning strategy, integration test#9492
clintropolis merged 10 commits intoapache:masterfrom
clintropolis:manual-query-laning

Conversation

@clintropolis
Copy link
Copy Markdown
Member

@clintropolis clintropolis commented Mar 10, 2020

Description

This PR is part of #6993, and a follow-up to #9407 which adds a ManualLaningStrategy that in addition to being useful to make it easy to add an integration test for the query laning functionality, is also well suited for cases where one or more external applications which query Druid are able to manually decide which lane a query belongs to by adding a lane parameter to the query context.

example config:

druid.query.scheduler.laning.strategy=manual
druid.query.scheduler.laning.isLimitPercent=true
druid.query.scheduler.laning.lanes.ten=10
druid.query.scheduler.laning.lanes.twenty=20
druid.query.scheduler.laning.lanes.fifty=50

The example defines 3 lanes, "ten", "twenty", and "fifty", which if the query context contains that lane name will be allowed to use up to 10%, 20%, or 50% of the total capacity.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Copy Markdown
Contributor

@maytasm maytasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some comments about possible improvement in tests

StatusResponseHolder status = future.get();
if (status.getStatus().getCode() == QueryCapacityExceededException.STATUS_CODE) {
limited++;
Assert.assertTrue(status.getContent().contains("one"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert this against StringUtils.format(QueryCapacityExceededException.ERROR_MESSAGE_TEMPLATE, "one") instead?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Assert.assertTrue(success > 0);
Assert.assertTrue(limited > 0);

// test another to make sure we can still issue one query at a time
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be separate into another @test method. Will make understanding each test case easier.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is testing a sequence of events though, so it belongs in the same test

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the top part is test where we exceed and expect rejection. The bottom part is test where we are always under the limit and never exceed hence expect all to be successful.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a separate test which runs a bunch of queries the same manner that do not belong to a lane to make sure they are not limited

final QueryScheduler scheduler = provider.get().get().get();
Assert.assertEquals(10, scheduler.getTotalAvailableCapacity());
Assert.assertEquals(1, scheduler.getLaneAvailableCapacity("one"));
Assert.assertEquals(2, scheduler.getLaneAvailableCapacity("two"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also test negative case (like lane name "three") and see we get NO_CAPACITY?

Copy link
Copy Markdown
Member Author

@clintropolis clintropolis Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't actually true currently, non-existent lanes are currently treated the same as not having a lane. I did this in #9407 because it seemed friendlier, but an argument can probably made for making undefined lanes that are present be treated as having 0 capacity which would make what you are suggesting true. If we wanted to make this change, we could either do it across the board in QueryScheduler, or leave the decision to the QueryLaningStrategy which is a bit more flexible.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I maybe interpreted this as something deeper, suggesting that we should treat queries that specify non-existent lanes as a 0 capacity lane, because of my poorly named NO_CAPACITY variable which I am going to rename as UNAVAILABLE pending this discussion. If you're just suggesting that we check that non-existent lanes return -1 I can add that test.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im just suggesting to add test that check that non-existent lanes return -1

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some asserts to existing tests to show that non-existent lane is unavailable

final QueryScheduler scheduler = provider.get().get().get();
Assert.assertEquals(10, scheduler.getTotalAvailableCapacity());
Assert.assertEquals(1, scheduler.getLaneAvailableCapacity("one"));
Assert.assertEquals(2, scheduler.getLaneAvailableCapacity("twenty"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also test negative case (like lane name "three") and see we get NO_CAPACITY?

}

@Override
public <T> Optional<String> computeLane(QueryPlus<T> query, Set<SegmentServerSelector> segments)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering..why does computeLane needs SegmentServerSelector? Doesn't laning or not should just depend on the query context that was given with the query?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QueryLaningStrategy are intended to compute the lane on the query context, which the QueryScheduler can then use to enforce limits. This laning strategy is a bit of a special case because it only preserves what was already there, but other strategies could consider all sorts of information to allow them to make informed decisions about lane assignment based on details of the query.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for the clarification

}

@Test
public void testPercentLaneLimitsMustBeAboveZero()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testPercentLaneLimitsMustBeBelowOneHundred?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more tests

queryHelper.testQueriesFromFile(WIKIPEDIA_QUERIES_RESOURCE, 2);
}

@Test
Copy link
Copy Markdown
Contributor

@maytasm maytasm Mar 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also have a test for when the Manual Lane From Context That Arent In Map is given? For example, if lane in query context is "some-unknown-lane"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess they will always be rejected since we give NO_CAPACITY right?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ update typo above. Meant to say "also" have a test for case when the Manual Lane From Context That Arent In Map is given

Comment thread docs/configuration/index.md Outdated

|Property|Description|Default|
|--------|-----------|-------|
|`druid.query.scheduler.laning.lanes.{name}`|Maximum percent or exact limit of queries that can concurrently run in the defined lanes. Any number of lanes may be defined like this.|No default, must define at least one lane with a limit above 0. If `druid.query.scheduler.laning.isLimitPercent` is set to `true`, numbers must be in the range of 1 to 100.|
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numbers must be in the range of 1 to 100.

Should it say the number type is the integer?

// the broker is configured with 2 manually defined query lanes, 'one' with limit 1, and 'two' with limit 'two'
// -Ddruid.query.scheduler.laning.type=manual
// -Ddruid.query.scheduler.laning.lanes.one=1
// -Ddruid.query.scheduler.laning.lanes.two=2
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the lane 'two' is not used. Perhaps better to remove it for now.

this.isLimitPercent = isLimitPercent != null ? isLimitPercent : false;
Preconditions.checkArgument(lanes.size() > 0, "lanes must define at least one lane");
Preconditions.checkArgument(
lanes.values().stream().allMatch(x -> this.isLimitPercent ? 0 < x && x <= 100 : x > 0),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it check the sum of percents of lanes is less than or equal to 100 when isLimitPercent = true?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it is necessary to enforce that since lane limits are currently an 'at most' guarantee instead of an 'at least'; so it seems possibly legitimate to say configure two lanes and allow either of them to use 75% of total capacity.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I see. Then should this be documented?

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@clintropolis
Copy link
Copy Markdown
Member Author

thanks for review @maytasm3 and @jihoonson

@clintropolis clintropolis merged commit 69af760 into apache:master Mar 14, 2020
@clintropolis clintropolis deleted the manual-query-laning branch March 14, 2020 03:07
@jihoonson jihoonson added this to the 0.18.0 milestone Mar 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants