[Backport] Logic adjustments to SeekableStreamIndexTaskRunner. by clintropolis · Pull Request #7271 · apache/druid

clintropolis · 2019-03-15T17:12:04Z

Backport of #7267 to 0.14.0-incubating.

* Add null checks in DruidSchema * Add unit tests * Add VisibleForTesting annotation * PR comments * unused import

…ache#6938) (apache#7022) * Improper equals override is fixed to prevent NullPointerException * Fixed curly brace indentation. * Test method is added for equals method of TaskLockPosse class.

…es().size (apache#7000) (apache#7013) * Instead of using keyCount, changing it to check the size of objectSummaries. For issue: apache#6980 apache#6980 (comment) * Changing another usage of keyCount with size of objectSummaries. * Adding some comments to explain why using keyCount is not working as expected.

* Improper getter value is fixed. * Test class is added.

…pache#7023) (apache#7040) * Fix filterSegments for TimeBoundary and DataSourceMetadata queries * add javadoc * fix build

apache#7015) (apache#7045) * Fix: 1. hadoop-common dependency for druid-hdfs and druid-kerberos extensions Refactoring: 2. Hadoop config call in the inner static class to avoid class path conflicts for stopGracefully kill * Fix: 1. hadoop-common test dependency * Fix: 1. Avoid issue of kill command once the job is actually completed

* Fix the bug with num_rows in sys.segments * Fix segmentMetadataInfo update in DruidSchema * Add numRows to SegmentMetadataHolder builder's constructor, so it's not overwritten * Rename SegSegmentSignature to setSegmentMetadataHolder and fix it so nested map is appended instead of recreated * Replace Map<String, Set<String>> segmentServerMap with Set<String> for num_replica * Remove unnecessary code and update test * Add unit test for num_rows * PR comments * change access modifier to default package level * minor changes to comments * PR comments

…pache#6990) (apache#7055) * fix kafka index task doesn't resume when recieve duplicate request * add unit test

- moving description of coordinator isLeader endpoint

* document middle manager api * re-arrange * correction * document more missing overlord api calls, minor re-arrange of some code i was referencing * fix it * this will fix it * fixup * link to other docs

* Add an api to get all lookup specs * add doc

…7044) (apache#7103) * Add doc for Hadoop-based ingestion vs Native batch ingestion * add links * add links

…ache#7104)

) (apache#7101) * add doc * change docs * PR comments * few more changes

…pache#7046) (apache#7113) * index_parallel: support !appendToExisting with no explicit intervals This enables ParallelIndexSupervisorTask to dynamically request locks at runtime if it is run without explicit intervals in the granularity spec and with appendToExisting set to false. Previously, it behaved as if appendToExisting was set to true, which was undocumented and inconsistent with IndexTask and Hadoop indexing. Also, when ParallelIndexSupervisorTask allocates segments in the explicit interval case, fail if its locks on the interval have been revoked. Also make a few other additions/clarifications to native ingestion docs. Fixes apache#6989. * Review feedback. PR description on GitHub updated to match. * Make native batch ingestion partitions start at 0 * Fix to previous commit * Unit test. Verified to fail without the other commits on this branch. * Another round of review * Slightly scarier warning

…#6937) (apache#7140) * [apache#1332] Fix - select failing if milis used for idx. * Formating correction. * Address comment: throw original exception. * Using constant values in tests - Try converting to Integer and then multiply by 1000L to achieve milis. - If not successful try converting to Long or rethrow original exception. * DateTime#of has to support "2011-01-01T00:00:00" - in addition to seconds and milisecs, this method currently supports even a date string. * Handle only milisec timestamps and ISO8601 strings

…tGranularity = true (apache#7079) (apache#7163) * Improve compaction tutorial to demonstrate compaction with keepSegmentGranularity = true * typo * add a warning

…#7162)

* Add web consoles doc page * PR comments * Remove 'unified' * PR comments * Fix TOC * PR comments * More revisions * GUI -> UI * Update router docs * Reword router doc

…e#7129) (apache#7166) * Fix exception when the scheme is missing in endpointUrl for S3 * add null check

…a Apache 2.0 license (apache#7139) (apache#7141) * revert bp * fix tests * move @types/hjson to dev dep * removed all the package upgrades

…ncurrentModificationException (apache#6690) (apache#7165) * bugfix: when building materialized-view, if taskCount >1, may cause ConcurrentModificationException * remove entry after iteration instead of using ConcurrentMap, and add unit test * small change * modify unit test for coverage * remove unused method

…e#7146) * segment metadata fallback analysis if no bitmaps * remove accidental line * remove nonsense size estimation * less ternary * fix it * do the thing

… (apache#7177) * Fix supported file formats * address comment

* add license checker to web-console to ensure npm dependencies are apache license compatible * add generate licenses file * update check to remove excludes due to blueprintjs downgrade

* Fix two SeekableStream serde issues. 1) Fix backwards-compatibility serde for SeekableStreamPartitions. It is needed for split 0.13 / 0.14 clusters to work properly during a rolling update. 2) Abstract classes don't need JsonCreator constructors; remove them. * Comment fixes.

* Update LICENSE and NOTICE files * Update react-table version

* Improve doc for auto compaction * fix doc * address comments

…e#7181) (apache#7200) * Reduce # of max subTasks to 2 * fix typo and add more doc * add more doc and link * change default and add warning * fix doc * add test * fix it test

* Densify swapped hll buffer * Make test loop limit pre-increment * Reformat * Fix test comments

* rename maintenance mode to decommission * review changes * missed one * fix straggler, add doc about decommissioning stalling if no active servers * fix missed typo, docs * refine docs * doc changes, replace generals * add explicit comment to mention suppressed stats for balanceTier * rename decommissioningVelocity to decommissioningMaxSegmentsToMovePercent and update docs * fix precondition check * decommissioningMaxPercentOfMaxSegmentsToMove * fix test * fix test * fixes

* Exclude node_modules from src assembly * Remove git.version exclusion * Include binary LICENSE/NOTICE in source assembly

…ache#7261) * Fix record validation in SeekableStreamIndexTaskRunner * fix validation

* wip * fix tests, stop reading if we are at end offset * fix build * remove restore at end offsets fix in favor of a separate PR * use typereference from method for serialization too

* Fix KafkaRecordSupplier assign * TeamCity fix

…xTaskTest (apache#7264) (apache#7270) * Fix testIncrementalHandOffReadsThroughEndOffsets in Kafka/KinesisIndexTaskTest * revert unnecessary change * fix test * remove debug log

* Logic adjustments to SeekableStreamIndexTaskRunner. A mix of simplifications and bug fixes. They are intermingled because some of the bugs were made difficult to fix, and also more likely to happen in the first place, by how the code was structured. I tried to keep restructuring to a minimum. The changes are: - Remove "initialOffsetsSnapshot", which was used to determine when to skip start offsets. Replace it with "lastReadOffsets", which I hope is more intuitive. (There is a connection: start offsets must be skipped if and only if they have already been read, either by a previous task or by a previous sequence in the same task, post-restoring.) - Remove "isStartingSequenceOffsetsExclusive", because it should always be the opposite of isEndOffsetExclusive. The reason is that starts are exclusive exactly when the prior ends are inclusive: they must match up in that way for adjacent reads to link up properly. - Don't call "seekToStartingSequence" after the initial seek. There is no reason to, since we expect to read continuous message streams throughout the task. And calling it makes offset-tracking logic trickier, so better to avoid the need for trickiness. I believe the call being here was causing a bug in Kinesis ingestion where a message might get double-read. - Remove the "continue" calls in the main read loop. They are bad because they prevent keeping currOffsets and lastReadOffsets up to date, and prevent us from detecting that we have finished reading. - Rework "verifyInitialRecordAndSkipExclusivePartition" into "verifyRecordInRange". It no longer has side effects. It does a sanity check on the message offset and also makes sure that it is not past the endOffsets. - Rework "assignPartitions" to replace inline comparisons with "isRecordAlreadyRead" and "isMoreToReadBeforeReadingRecord" calls. I believe this fixes an off-by-one error with Kinesis where the last record would not get read. It also makes the logic easier to read. - When doing the final publish, only adjust end offsets of the final sequence, rather than potentially adjusting any unpublished sequence. Adjusting sequences other than the last one is a mistake since it will extend their endOffsets beyond what they actually read. (I'm not sure if this was an issue in practice, since I'm not sure if real world situations would have more than one unpublished sequence.) - Rename "isEndSequenceOffsetsExclusive" to "isEndOffsetExclusive". It's shorter and more clear, I think. - Add equals/hashCode/toString methods to OrderedSequenceNumber. Kafka test changes: - Added a Kafka "testRestoreAtEndOffset" test to verify that restores at the very end of the task lifecycle still work properly. Kinesis test changes: - Renamed "testRunOnNothing" to "testRunOnSingletonRange". I think that given Kinesis semantics, the right behavior when start offset equals end offset (and there aren't exclusive partitions set) is to read that single offset. This is because they are both meant to be treated as inclusive. - Adjusted "testRestoreAfterPersistingSequences" to expect one more message read. I believe the old test was wrong; it expected the task not to read message number 5. - Adjusted "testRunContextSequenceAheadOfStartingOffsets" to use a checkpoint starting from 1 rather than 2. I believe the old test was wrong here too; it was expecting the task to start reading from the checkpointed offset, but it actually should have started reading from one past the checkpointed offset. - Adjusted "testIncrementalHandOffReadsThroughEndOffsets" to expect 11 messages read instead of 12. It's starting at message 0 and reading up to 10, which should be 11 messages. * Changes from code review.

clintropolis and others added 30 commits February 6, 2019 15:25

Add null checks in DruidSchema (apache#6830) (apache#7018)

4cb631d

* Add null checks in DruidSchema * Add unit tests * Add VisibleForTesting annotation * PR comments * unused import

Improper equals override is fixed to prevent NullPointerException (ap…

6894c9f

…ache#6938) (apache#7022) * Improper equals override is fixed to prevent NullPointerException * Fixed curly brace indentation. * Test method is added for equals method of TaskLockPosse class.

Improper getter value is fixed. (apache#6930) (apache#7033)

da03057

* Improper getter value is fixed. * Test class is added.

Fix filterSegments for TimeBoundary and DataSourceMetadata queries (a…

91ecb84

…pache#7023) (apache#7040) * Fix filterSegments for TimeBoundary and DataSourceMetadata queries * add javadoc * fix build

fix kafka index task doesn't resume when recieve duplicate request (a…

b8a3af7

…pache#6990) (apache#7055) * fix kafka index task doesn't resume when recieve duplicate request * add unit test

Add router conf to assembly.xml (apache#7051) (apache#7060)

c9fc44f

Update api-reference.md doc (apache#7065) (apache#7078)

65227c6

- moving description of coordinator isLeader endpoint

Missing Overlord and MiddleManager api docs (apache#7042) (apache#7106)

2f4acf5

* document middle manager api * re-arrange * correction * document more missing overlord api calls, minor re-arrange of some code i was referencing * fix it * this will fix it * fixup * link to other docs

Add an api to get all lookup specs (apache#7025) (apache#7102)

bd8fb92

* Add an api to get all lookup specs * add doc

Add doc for Hadoop-based ingestion vs Native batch ingestion (apache#…

510de41

…7044) (apache#7103) * Add doc for Hadoop-based ingestion vs Native batch ingestion * add links * add links

Fix and improve doc for partitioning of local index (apache#7064) (ap…

aae0f10

…ache#7104)

add note on consistency of results for sys.segments queries (apache#7034

9990e04

) (apache#7101) * add doc * change docs * PR comments * few more changes

Improve compaction tutorial to demonstrate compaction with keepSegmen…

7ea2552

…tGranularity = true (apache#7079) (apache#7163) * Improve compaction tutorial to demonstrate compaction with keepSegmentGranularity = true * typo * add a warning

Exclude hadoop-lzo from thrift-extensions build (apache#7151) (apache…

4004328

…#7162)

Add web consoles doc page (apache#7123) (apache#7168)

43fe265

* Add web consoles doc page * PR comments * Remove 'unified' * PR comments * Fix TOC * PR comments * More revisions * GUI -> UI * Update router docs * Reword router doc

Update tutorials for 0.14.0-incubating (apache#7157) (apache#7167)

d9ded51

Fix exception when the scheme is missing in endpointUrl for S3 (apach…

0c44b8f

…e#7129) (apache#7166) * Fix exception when the scheme is missing in endpointUrl for S3 * add null check

Downgrade blueprintjs version in the web console to one with a vanill…

6d8ed7a

…a Apache 2.0 license (apache#7139) (apache#7141) * revert bp * fix tests * move @types/hjson to dev dep * removed all the package upgrades

Update handlebars dep to patch vulnerability (apache#7083) (apache#7164)

8ee9747

segment metadata fallback analysis if no bitmaps (apache#7116) (apach…

9290574

…e#7146) * segment metadata fallback analysis if no bitmaps * remove accidental line * remove nonsense size estimation * less ternary * fix it * do the thing

downgrade react-table (apache#7170) (apache#7173)

161176d

Fix supported file formats for Hadoop vs Native batch doc (apache#7069)…

7944bfc

… (apache#7177) * Fix supported file formats * address comment

Reword 'node' to 'process' (apache#7172) (apache#7178)

bb45003

add license checker to web-console (apache#7028) (apache#7191)

d9e9f74

* add license checker to web-console to ensure npm dependencies are apache license compatible * add generate licenses file * update check to remove excludes due to blueprintjs downgrade

jon-wei and others added 15 commits March 4, 2019 20:47

Update LICENSE and NOTICE files (apache#7026) (apache#7194)

a6abd04

* Update LICENSE and NOTICE files * Update react-table version

Improve doc for auto compaction (apache#7117) (apache#7190)

cb4227d

* Improve doc for auto compaction * fix doc * address comments

Add more approximate algorithm docs (apache#7195) (apache#7196)

62a655a

Reduce default max # of subTasks to 1 for native parallel task (apach…

9058e68

…e#7181) (apache#7200) * Reduce # of max subTasks to 2 * fix typo and add more doc * add more doc and link * change default and add warning * fix doc * add test * fix it test

Densify swapped hll buffer (apache#6865) (apache#7204)

8ed284d

* Densify swapped hll buffer * Make test loop limit pre-increment * Reformat * Fix test comments

[maven-release-plugin] prepare release druid-0.14.0-incubating

18c6fb0

[maven-release-plugin] prepare for next development iteration

d7eade4

Exclude non-source files from src assembly (apache#7235) (apache#7237)

3994dcf

* Exclude node_modules from src assembly * Remove git.version exclusion * Include binary LICENSE/NOTICE in source assembly

[Backport] Fix record validation in SeekableStreamIndexTaskRunner (ap…

ca4ee7a

…ache#7261) * Fix record validation in SeekableStreamIndexTaskRunner * fix validation

fix SequenceMetadata deserialization (apache#7256) (apache#7262)

49796a9

* wip * fix tests, stop reading if we are at end offset * fix build * remove restore at end offsets fix in favor of a separate PR * use typereference from method for serialization too

Fix KafkaRecordSupplier assign (apache#7260) (apache#7265)

acdda58

* Fix KafkaRecordSupplier assign * TeamCity fix

Fix testIncrementalHandOffReadsThroughEndOffsets in Kafka/KinesisInde…

3e47fa3

…xTaskTest (apache#7264) (apache#7270) * Fix testIncrementalHandOffReadsThroughEndOffsets in Kafka/KinesisIndexTaskTest * revert unnecessary change * fix test * remove debug log

clintropolis closed this Mar 15, 2019

clintropolis deleted the backport-7267-to-0.14.0-incubating branch April 5, 2019 01:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Backport] Logic adjustments to SeekableStreamIndexTaskRunner.#7271

[Backport] Logic adjustments to SeekableStreamIndexTaskRunner.#7271
clintropolis wants to merge 45 commits intoapache:masterfrom
clintropolis:backport-7267-to-0.14.0-incubating

clintropolis commented Mar 15, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

clintropolis commented Mar 15, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants