Skip to content

add DataSchema.Builder to tidy stuff up a bit#17065

Merged
gianm merged 7 commits intoapache:masterfrom
clintropolis:dataschema-builder
Sep 15, 2024
Merged

add DataSchema.Builder to tidy stuff up a bit#17065
gianm merged 7 commits intoapache:masterfrom
clintropolis:dataschema-builder

Conversation

@clintropolis
Copy link
Copy Markdown
Member

changes:

  • DataSchema now only has a single constructor (the JSON creator)
  • Tests transitioned to use DataSchema.builder() to simplify things a bit, as well as insulate against changes to the JSON schema

@github-actions github-actions Bot added Area - Batch Ingestion Area - Streaming Ingestion Kubernetes Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Sep 14, 2024
Comment on lines +329 to +337
new StringInputRowParser(
new JSONParseSpec(
new TimestampSpec(null, null, null),
DimensionsSpec.EMPTY,
null,
null,
null
)
),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [StringInputRowParser.StringInputRowParser](1) should be avoided because it has been deprecated.
Comment on lines +112 to +128
new StringInputRowParser(
new JSONParseSpec(
new TimestampSpec("timestamp", "iso", null),
new DimensionsSpec(
Arrays.asList(
new StringDimensionSchema("dim1"),
new StringDimensionSchema("dim1t"),
new StringDimensionSchema("dim2"),
new LongDimensionSchema("dimLong"),
new FloatDimensionSchema("dimFloat")
)
),
new JSONPathSpec(true, ImmutableList.of()),
ImmutableMap.of(),
false
)
),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [StringInputRowParser.StringInputRowParser](1) should be avoided because it has been deprecated.
Comment on lines +444 to +456
DataSchema schema = DataSchema.builder()
.withDataSource(IdUtilsTest.VALID_ID_CHARS)
.withParserMap(parser)
.withAggregators(
new DoubleSumAggregatorFactory("metric1", "col1"),
new DoubleSumAggregatorFactory("metric2", "col2"),
new DoubleSumAggregatorFactory("metric1", "col3"),
new DoubleSumAggregatorFactory("metric3", "col4"),
new DoubleSumAggregatorFactory("metric3", "col5")
)
.withGranularity(ARBITRARY_GRANULARITY)
.withObjectMapper(jsonMapper)
.build();

Check notice

Code scanning / CodeQL

Unread local variable

Variable 'DataSchema schema' is never read.
Copy link
Copy Markdown
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left some thoughts that came to mind.

.withDimensions(dimensionsAndAggregators.lhs)
.withAggregators(dimensionsAndAggregators.rhs.toArray(new AggregatorFactory[0]))
.withGranularity(makeGranularitySpecForIngestion(querySpec.getQuery(), querySpec.getColumnMappings(), isRollupQuery, jsonMapper))
.withTransform(new TransformSpec(null, Collections.emptyList()))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could leave this line out, if the default is a no-op TransformSpec.

return this;
}

public Builder withParserMap(Map<String, Object> parserMap)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include @Deprecated to match the annotation in the constructor?

return this;
}

public Builder withObjectMapper(ObjectMapper objectMapper)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include @Deprecated, since this is only used with the parserMap, which is itself deprecated?

Copy link
Copy Markdown

@github-advanced-security github-advanced-security AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@gianm gianm merged commit aa6336c into apache:master Sep 15, 2024
@clintropolis clintropolis deleted the dataschema-builder branch September 15, 2024 21:50
pranavbhole pushed a commit to pranavbhole/druid that referenced this pull request Sep 17, 2024
* add DataSchema.Builder to tidy stuff up a bit

* fixes

* fixes

* more style fixes

* review stuff
clintropolis added a commit to clintropolis/druid that referenced this pull request Oct 5, 2024
* add DataSchema.Builder to tidy stuff up a bit

* fixes

* fixes

* more style fixes

* review stuff
abhishekagarwal87 pushed a commit that referenced this pull request Oct 5, 2024
* abstract `IncrementalIndex` cursor stuff to prepare for using different "views" of the data based on the cursor build spec (#17064)

* abstract `IncrementalIndex` cursor stuff to prepare to allow for possibility of using different "views" of the data based on the cursor build spec
changes:
* introduce `IncrementalIndexRowSelector` interface to capture how `IncrementalIndexCursor` and `IncrementalIndexColumnSelectorFactory` read data
* `IncrementalIndex` implements `IncrementalIndexRowSelector`
* move `FactsHolder` interface to separate file
* other minor refactorings

* add DataSchema.Builder to tidy stuff up a bit (#17065)

* add DataSchema.Builder to tidy stuff up a bit

* fixes

* fixes

* more style fixes

* review stuff

* Projections prototype (#17214)
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 10, 2024
@kfaraz
Copy link
Copy Markdown
Contributor

kfaraz commented Oct 10, 2024

This has already been backported in #17257 , so added it to the milestone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - Batch Ingestion Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Area - Streaming Ingestion Kubernetes Refactoring

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants