Conversation
techdocsmith
left a comment
This is looking good. All my suggestions are stylistic.
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
cecemei
left a comment
Requesting some changes regarding the incremental metadata cache and supervisor-based compaction. We made several changes in the same release, so some of the notes are outdated.
> [#18923](https://github.com/apache/druid/pull/18923)
> ### Auto-compaction with compaction supervisors
Can you add #19252 here as well? Overlord-based compaction supervisors are now the recommended and default approach for automatic compaction.
> #### Segment locking
>
> Segment locking and `NumberedOverwriteShardSpec` are deprecated and will be removed in a future release. Use time chunk locking instead. To ensure only time chunk locking is used, set `druid.indexer.tasklock.forceTimeChunkLock` to `true`.
Maybe worth mentioning that `druid.indexer.tasklock.forceTimeChunkLock` is `true` by default. I guess we should have deprecated segment locking long ago. cc: @clintropolis
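If the docs want to show the setting explicitly, a sketch of the relevant runtime property might look like this (the property name comes from the note above; where it lives in your deployment's config files may vary):

```properties
# Force time chunk locking so segment locks are never taken.
# Reportedly already the default value.
druid.indexer.tasklock.forceTimeChunkLock=true
```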
cecemei
left a comment
Overall LGTM; left a few minor comments.
> #### Changed storage column displays
>
> - Improved the compaction config view:
>   - Renamed **Current size** to **Assigned size**.
>   - Renamed **Max size** to **Effective size**. It now displays the smaller value between `max_size` and `storage_size`. The max size is still shown as a tooltip.
>   - Changed the usage calculation to use `effective_size`.
>
> [#19007](https://github.com/apache/druid/pull/19007)
This part doesn't read smoothly to me. I believe this is part of showing storage metrics for data nodes, and this change is also related to VSF. Do you mind taking a look at this, @clintropolis?
Yeah, this is related to VSF mode, where the actual disk size is separate from the amount of data the node is responsible for, so we updated this UI to show the ratio of assigned segment size to actual disk size. No real change other than labels if not in VSF mode.
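As I read the change, the console now computes usage against the smaller of the configured max size and the actual disk size. A minimal sketch of that logic (names are illustrative, not the console's actual code):

```python
def effective_size(max_size: int, storage_size: int) -> int:
    """Effective size is the smaller of the configured max size and the
    actual disk (storage) size, which can differ in VSF mode."""
    return min(max_size, storage_size)


def usage_ratio(assigned_size: int, max_size: int, storage_size: int) -> float:
    """Usage is the assigned segment size divided by the effective size."""
    return assigned_size / effective_size(max_size, storage_size)


# In VSF mode the actual disk can be smaller than the configured max:
print(usage_ratio(assigned_size=80, max_size=200, storage_size=100))  # 0.8
```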
> ### Query blocklist
>
> You can now use the using the `/druid/coordinator/v1/config/broker` API to create a query blocklist to dynamically block queries by datasource, query type, or query context. The blocklist takes effect without restarting Druid. Block rules use `AND` logic, which means all criteria must match.
> You can now use the using the ...

This seems to be missing something.
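If the docs want an example, posting a rule to the Coordinator's broker dynamic config endpoint might look like the sketch below. The endpoint path is from the note above, but the payload shape (field names like `blockLists`, `dataSource`, `queryType`) is my guess from the description, not a verified schema:

```python
import json
from urllib import request

# Hypothetical blocklist rule: block topN queries against "wikipedia".
# Field names are illustrative; check the dynamic config docs for the
# actual schema. Criteria within a rule combine with AND logic.
config = {
    "blockLists": [
        {"dataSource": "wikipedia", "queryType": "topN"},
    ]
}

payload = json.dumps(config).encode("utf-8")
req = request.Request(
    "http://coordinator:8081/druid/coordinator/v1/config/broker",
    data=payload,
    headers={"Content-Type": "application/json"},
)
# request.urlopen(req) would apply the config without restarting Druid.
print(payload.decode("utf-8"))
```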
> ### Thrift input format
>
> As part of the Thrift contributor extension, Druid now supports Thrift-encoded data for Kafka and Kinesis streaming ingestion.
Technically we already supported this before, but only through the deprecated parser/parseSpec mechanism, which has been removed in this release. What's new is support through the modern InputFormat interface.
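If an example would help, the supervisor spec's `inputFormat` entry might look roughly like this. Treat it as a sketch: the `"type": "thrift"` value and the class-name field are assumptions about the new InputFormat, not verified field names:

```json
{
  "type": "kafka",
  "ioConfig": {
    "topic": "events",
    "inputFormat": {
      "type": "thrift",
      "thriftClassName": "com.example.EventRecord"
    }
  }
}
```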
> - Added the `druid.storage.transfer.asyncHttpClientType` config that specifies which async HTTP client to use for S3 transfers: `crt` for Amazon CRT or `netty` for Netty NIO [#19249](https://github.com/apache/druid/pull/19249)
> - Added a mechanism to automatically clean up intermediary files on HDFS storage [#19187](https://github.com/apache/druid/pull/19187)
> - Changed the default value for `druid.indexing.formats.maxStringLength` from 0 to null [#19198](https://github.com/apache/druid/pull/19198)
This probably doesn't need to be documented separately from `columnFormatSpec` on the string column, since it's related to that.
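For the S3 transfer bullet above, a config example could be as simple as the following (property name and both values come straight from the bullet):

```properties
# Async HTTP client for S3 transfers: "crt" for the Amazon CRT client,
# or "netty" for the Netty NIO client.
druid.storage.transfer.asyncHttpClientType=crt
```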
> ### Ingestion
>
> - Added the `maxStringLength` configuration for string dimensions that truncates values exceeding the specified length during ingestion. You can set the length globally using `druid.indexing.formats.maxStringLength` or per-dimension in the ingestion spec [#19146](https://github.com/apache/druid/pull/19146)
This was refactored into `columnFormatSpec` on the string column in #19258; this and the other related bullet should be combined.
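A per-dimension example for the combined bullet might look roughly like this. Given the refactor into `columnFormatSpec`, the exact placement and field name below are assumptions to be checked against the merged code, not a verified spec:

```json
{
  "dimensionsSpec": {
    "dimensions": [
      {
        "type": "string",
        "name": "user_agent",
        "maxStringLength": 1024
      }
    ]
  }
}
```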
> ### Incompatible changes
>
> #### Removed `ParseSpec` and deprecated parsers
This should be combined with, or at least placed next to, the Parsers section since it's related (also related to the Hadoop removal, since Hadoop ingestion only supported parser/parseSpec, which is why we had to drag them around for so long).
> ### Tombstones
>
> Tombstones for JSON-based native batch ingestion (the `dropExisting` flag for `ioConfig`) are now generally available.
This should probably indicate that this was more an oversight of not updating the docs than anything actually changing, since these have been used in production for quite a long time and aren't even optional in MSQ.
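For context, `dropExisting` sits on the native batch `ioConfig`; a minimal sketch (the `inputSource` values are placeholders, not from the release notes):

```json
{
  "ioConfig": {
    "type": "index_parallel",
    "inputSource": {
      "type": "local",
      "baseDir": "/data",
      "filter": "*.json"
    },
    "dropExisting": true
  }
}
```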
> - `bytebuddy` from `1.18.3` to `1.18.5` [#19145](https://github.com/apache/druid/pull/19145)
> - Added `objenesis` `3.5` [#19145](https://github.com/apache/druid/pull/19145)
> - `org.apache.zookeeper` from `3.8.4` to `3.8.6` [#19135](https://github.com/apache/druid/pull/19135)
> - Added AWS SDK `2.40.0` [#18891](https://github.com/apache/druid/pull/18891)
This is a pretty big update to the AWS SDK libraries, going from 1.x to 2.x; maybe it deserves more of a callout.

+1. IMHO, we should call this out in the upgrade notes.
> [#18966](https://github.com/apache/druid/pull/18966)

> #### Other metrics and monitoring improvements
I think this is worth calling out as an experimental perf feature for task init speedup: #19022
> Added `segment/schemaCache/rowSignature/changed` and `segment/schemaCache/rowSignature/column/count` metrics to expose events when the Broker initializes and updates the row signature in the segment metadata cache for each datasource.
>
> [#18966](https://github.com/apache/druid/pull/18966)
Bug fix #19162 appears to be missing from the release notes. I can't recall whether both a "Release note" section in the PR summary and a release note GitHub label are required for changes to appear in the release notes docs.
We typically don't include bug fixes in the release notes unless requested. I'm not sure of the historical reason, but that's been the process since I've worked on Druid.
> #### Other cluster management improvements
>
> - Added a `ReadOnly` authorizer that allows all READ operations but denies any other operation, such as WRITE [#19243](https://github.com/apache/druid/pull/19243)
Co-authored-by: Cece Mei <yingqian.mei@gmail.com>
Co-authored-by: Abhishek Radhakrishnan <abhishek.rb19@gmail.com>
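If it helps readers, wiring up the new authorizer might look like the following. The `druid.auth.authorizers` pattern follows Druid's usual authorizer configuration; the `type` value `readonly` is my assumption and should be verified against the PR:

```properties
# Register an authorizer instance named "ReadOnly".
druid.auth.authorizers=["ReadOnly"]
# Hypothetical type name; verify against #19243 before publishing.
druid.auth.authorizer.ReadOnly.type=readonly
```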
> #### Filtering metrics
>
> Operators can set `druid.emitter.logging.shouldFilterMetrics=true` to limit which metrics the logging emitter writes. Optionally, they can set `druid.emitter.logging.allowedMetricsPath` to a JSON object file where the keys are metric names. A missing custom file results in a warning and use of the bundled `defaultMetrics.json`. Alerts and other non-metric events are always logged.
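Putting the description above together, the emitter properties might look like this sketch (property names follow the text; the file path is a placeholder):

```properties
druid.emitter=logging
druid.emitter.logging.shouldFilterMetrics=true
# Optional: path to a JSON object file whose keys are the metric names
# to allow. If the file is missing, Druid warns and falls back to the
# bundled defaultMetrics.json. Alerts and other non-metric events are
# always logged regardless of this filter.
druid.emitter.logging.allowedMetricsPath=/opt/druid/conf/allowedMetrics.json
```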