Skip to content

Make supervisor API similar to submit task API#8810

Merged
jihoonson merged 22 commits intoapache:masterfrom
surekhasaharan:submit-task-api
Nov 20, 2019
Merged

Make supervisor API similar to submit task API#8810
jihoonson merged 22 commits intoapache:masterfrom
surekhasaharan:submit-task-api

Conversation

@surekhasaharan
Copy link
Copy Markdown

@surekhasaharan surekhasaharan commented Nov 2, 2019

Fixes #8662

Description

Make submit task and supervisor API's similar in structure. supervisor API accepts both forms.

New form:

{
  "type": "kafka",
  "spec": {
    "dataSchema": { ... },
    "tuningConfig": { ... },
    "ioConfig": { ... }
  }
}

and old form:

{
  "type": "kafka",
   "dataSchema": { ... },
   "tuningConfig": { ... },
   "ioConfig": { ... }
}

The supervisor spec would allow both formats, but the serialization is done to the old format to allow for backwards compatibility.

This PR has:

  • been self-reviewed.
  • added unit tests or modified existing tests to cover new code paths.
  • been tested in a test Druid cluster.

@surekhasaharan surekhasaharan changed the title accept spec or dataSchema, tuningConfig, ioConfig while submitting Make submit task API similar to supervisor API Nov 2, 2019
@lgtm-com
Copy link
Copy Markdown

lgtm-com Bot commented Nov 6, 2019

This pull request introduces 3 alerts when merging c34c5cf into 3b602da - view on LGTM.com

new alerts:

  • 3 for Dereferenced variable may be null

@lgtm-com
Copy link
Copy Markdown

lgtm-com Bot commented Nov 6, 2019

This pull request introduces 1 alert when merging 8d0c0a6 into 6f7fbeb - view on LGTM.com

new alerts:

  • 1 for Dereferenced variable may be null

@jihoonson
Copy link
Copy Markdown
Contributor

Please fix LGTM alert.

@surekhasaharan
Copy link
Copy Markdown
Author

Please fix LGTM alert.

Yeah, i looked at that, it thinks dataSchema can be null here because of test cases, where null is passed for dataSchema, though those tests pass a valid ingestionSchema, and it's already checked in here. Not sure how to make lgtm ignore that or fix otherwise.

@vogievetsky
Copy link
Copy Markdown
Contributor

I am a strong 👍 on the design (I am biased since I filed the original issue).
This will allow us to take a nasty hack out of the web console.

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, i looked at that, it thinks dataSchema can be null here because of test cases, where null is passed for dataSchema, though those tests pass a valid ingestionSchema, and it's already checked in here. Not sure how to make lgtm ignore that or fix otherwise.

You can fix it by making a non-null IndexIngestionSpec first and then using it in the constructor.

Also, this PR changes only IndexTask. Are you going to change ParallelIndexSupervisorTask in a follow-up pr?

Comment thread docs/tutorials/tutorial-batch.md Outdated
{
"type" : "index",
"spec" : {
"type" : "index",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong indentation. Should be 2 spaces.

)
{
return makeGroupId(ingestionSchema.ioConfig.appendToExisting, ingestionSchema.dataSchema.getDataSource());
final boolean isValid = (ingestionSchema != null) ^ (dataSchema != null
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check should be in the constructor rather than here since it's a method to make a groupId.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makeGroupId is called from the constructor inside this, so can't add another check before calling this

&& ioConfig != null
&& tuningConfig != null);
if (!isValid) {
throw new ISE("invalid spec input");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make the error message more user friendly. It should say what's missing or what's duplicate.

}
if (ingestionSchema == null) {
assert (ioConfig != null);
assert (dataSchema != null);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these asserts? Seems duplicate.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these were added because of lgtm warnings

@@ -174,6 +191,9 @@ public IndexTask(
@JsonProperty("id") final String id,
@JsonProperty("resource") final TaskResource taskResource,
@JsonProperty("spec") final IndexIngestionSpec ingestionSchema,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be @Deprecated.

false
),
null,
null,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add another constructor to IndexTask which is compatible to the old constructor so that you can minimize the change of this PR. The constructor should be @VisibleForTesting and @Deprecated.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the original constructor back


final String json = jsonMapper.writeValueAsString(task);

Thread.sleep(100); // Just want to run the clock a bit to make sure the task id doesn't change
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment. What does it do?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy-paste of another test case :). Tried removing the sleep, test case still passes.

@Test(expected = ISE.class)
public void testIndexTaskInvalidSpecSerde()
{
new IndexTask(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a serde test. The test should deserialize from a json string.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved this to IndexTaskTest

@surekhasaharan
Copy link
Copy Markdown
Author

Yeah, i looked at that, it thinks dataSchema can be null here because of test cases, where null is passed for dataSchema, though those tests pass a valid ingestionSchema, and it's already checked in here. Not sure how to make lgtm ignore that or fix otherwise.

You can fix it by making a non-null IndexIngestionSpec first and then using it in the constructor.

Also, this PR changes only IndexTask. Are you going to change ParallelIndexSupervisorTask in a follow-up pr?

Yes, want to change ParallelIndexSupervisorTask as well in this PR, working on it.

@surekhasaharan surekhasaharan changed the title Make submit task API similar to supervisor API Make supervisor API similar to submit task API Nov 18, 2019
@surekhasaharan
Copy link
Copy Markdown
Author

Thanks @jihoonson for review and suggestion, I have updated this PR to allow supervisor schema to take in spec (the old format is still valid) similar to index, index_parallel and hadoop submit task APIs

@jihoonson
Copy link
Copy Markdown
Contributor

@surekhasaharan thanks for updating the PR. Would you please merge master so that LGTM can be passed?


private static KafkaSupervisorTuningConfig getTuningConfig(
KafkaSupervisorIngestionSpec ingestionSchema,
KafkaSupervisorTuningConfig tuningConfig
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please annotate both @Nullable.


@JsonCreator
public KafkaSupervisorSpec(
@JsonProperty("spec") KafkaSupervisorIngestionSpec ingestionSchema,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please annotate ingestionSchema, dataSchema, tuningConfig, and ioConfig with @Nullable.

@JsonProperty
public DataSchema getDataSchema()
{
return super.getDataSchema();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unnecessary.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And getDataSchema() should be deprecated.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed from subclass and deprecated in base class


@Override
@JsonProperty
public KafkaSupervisorTuningConfig getTuningConfig()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and getIoConfig() should be deprecated now.

@JsonProperty
public KafkaSupervisorIngestionSpec getSpec()
{
return (KafkaSupervisorIngestionSpec) super.getSpec();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Nullable

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm sorry now it's not nullable.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, yeah removed

protected final ObjectMapper mapper;
protected final RowIngestionMetersFactory rowIngestionMetersFactory;
private final SeekableStreamSupervisorIngestionSpec ingestionSchema;
private final DataSchema dataSchema;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class now has duplicate information between ingestionSchema and dataSchema, ioConfig, and tuningConfig. I suggest to remove dataSchema, ioConfig, and tuningConfig and keep only ingestionSchema.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will still be required in subclasses KafkaSupervisorSpec and KinesisSupervisorSpec for backwards compatibility, so there will be some duplication.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to be duplicate. You can always create ingestionSchema and get things from it.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok right, removed the extra configs from this class.

@JsonProperty
public KafkaSupervisorIngestionSpec getSpec()
{
return (KafkaSupervisorIngestionSpec) super.getSpec();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm sorry now it's not nullable.

}

@Override
@Nullable
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. It's not nullable now.

@jihoonson
Copy link
Copy Markdown
Contributor

Thanks. I left a couple of trivial comments. LGTM after removing wrong nullable annotations.

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 after CI. Thanks!

@jihoonson jihoonson merged commit d628beb into apache:master Nov 20, 2019
@surekhasaharan surekhasaharan deleted the submit-task-api branch November 20, 2019 18:43
jon-wei pushed a commit to jon-wei/druid that referenced this pull request Nov 26, 2019
* accept spec or dataSchema, tuningConfig, ioConfig while submitting task json

* fix test

* update docs

* lgtm warning

* Add original constructor back to IndexTask to minimize changes

* fix indentation in docs

* Allow spec to be specified in supervisor schema

* undo IndexTask spec changes

* update docs

* Add Nullable and deprecated annotations

* remove deprecated configs from SeekableStreamSupervisorSpec

* remove nullable annotation
@jon-wei jon-wei added this to the 0.17.0 milestone Dec 17, 2019
@jon-wei jon-wei mentioned this pull request Dec 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Normalize the submit task API to be more like the supervisor API

4 participants