can reset kafka Supervisor to offset by specified time #10745

Closed
kaijianding wants to merge 2 commits into apache:master from kaijianding:resetToTime

@kaijianding
Contributor

Description

Currently, Druid can only reset a supervisor to the earliest or latest offset. Sometimes users want to read from a specified time, such as the start of today.

A new optional parameter, timestamp, is added to the API: POST /druid/indexer/v1/supervisor/<supervisorId>/reset?timestamp=<timestamp in milliseconds>
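
For illustration, a minimal sketch of how a client might compute "the start of today" in epoch milliseconds and build the reset URL described above. The class name `ResetUrlSketch`, the helper names, the Overlord address `http://localhost:8090`, and the supervisor id `my-supervisor` are assumptions made for this example and are not part of the PR. The sketch only builds the URL and does not send the request.

```java
import java.net.URI;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class ResetUrlSketch {
    // Epoch milliseconds for midnight at the start of the given date in the given zone.
    static long startOfDayMillis(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    // Builds the reset URL with the optional timestamp query parameter proposed in this PR.
    // overlordBase and supervisorId are placeholders supplied by the caller.
    static URI buildResetUri(String overlordBase, String supervisorId, long timestampMillis) {
        return URI.create(overlordBase
            + "/druid/indexer/v1/supervisor/" + supervisorId
            + "/reset?timestamp=" + timestampMillis);
    }

    public static void main(String[] args) {
        // Assumption: "start of today" is interpreted in UTC; pick the zone that matches your data.
        long ts = startOfDayMillis(LocalDate.now(ZoneOffset.UTC), ZoneOffset.UTC);
        System.out.println("POST " + buildResetUri("http://localhost:8090", "my-supervisor", ts));
    }
}
```

The time zone matters here: midnight in local time and midnight in UTC are different timestamps, so the caller should choose the zone deliberately.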

Add Map<PartitionIdType, SequenceOffsetType> getPositionFromTime(long offsetTime); to get the offsets that correspond to a given time.
Only Kafka supports this feature.
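
To make the intended lookup concrete, here is a small in-memory sketch of a time-to-offset lookup. It is not the PR's implementation. It mirrors the documented semantics of Kafka's `KafkaConsumer.offsetsForTimes`, which returns, per partition, the earliest offset whose timestamp is greater than or equal to the requested time, and no offset when no such record exists. The class `PositionFromTimeSketch` and the `long[]` log model (array index = offset, value = record timestamp) are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class PositionFromTimeSketch {
    // Hypothetical in-memory stand-in for a topic: partition id -> record timestamps,
    // where the array index plays the role of the record's offset.
    static Map<Integer, Long> getPositionFromTime(Map<Integer, long[]> log, long offsetTime) {
        Map<Integer, Long> positions = new HashMap<>();
        for (Map.Entry<Integer, long[]> partition : log.entrySet()) {
            long[] timestamps = partition.getValue();
            // Scan in offset order; timestamps are not assumed to be sorted.
            for (int offset = 0; offset < timestamps.length; offset++) {
                if (timestamps[offset] >= offsetTime) {
                    positions.put(partition.getKey(), (long) offset);
                    break;
                }
            }
            // Partitions with no record at or after offsetTime are left out of the result.
        }
        return positions;
    }

    public static void main(String[] args) {
        Map<Integer, long[]> log = new HashMap<>();
        log.put(0, new long[] {100L, 200L, 300L});
        log.put(1, new long[] {50L, 60L});
        System.out.println(getPositionFromTime(log, 200L));
    }
}
```

A caller has to decide what to do with partitions that are missing from the result, for example falling back to the latest offset; how the PR handles that case is not stated in this description.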

Add OverrideNotice to replace the whole DataSourceMetadata in the metadata store and kill the running tasks, so that they restart and read from the new DataSourceMetadata (the same logic as ResetNotice).


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Key changed/added classes in this PR
  • KafkaRecordSupplier
  • IndexerSQLMetadataStorageCoordinator
  • SeekableStreamSupervisor.java

@lgellis

lgellis commented Jan 26, 2021

This would be a very helpful feature for our use case as well!

@asdf2014
Member

Hi, @kaijianding. This PR looks great! Could you please help fix the CI problem? Thanks a lot 👍

@kaijianding
Contributor Author

@asdf2014 sure, will add some tests to pass CI

@stale

stale Bot commented Apr 28, 2022

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale

stale Bot commented Mar 25, 2023

This pull request/issue is no longer marked as stale.

@stale stale Bot removed the stale label Mar 25, 2023
@abhishekrb19
Contributor

@kaijianding, thank you for this patch! Would you be interested in picking this up again and getting it through the finish line? If not, do you mind if someone else picks it up?

A few high-level comments:

  1. The proposed POST API with an optional query parameter seems a bit odd: /druid/indexer/v1/supervisor/<supervisorId>/reset?timestamp=<timestamp in milliseconds>. Can we perhaps add a new endpoint for this operation that accepts a body instead of a query parameter? This would also be in line with the reset offsets supervisor API that was recently added.
  2. Kinesis supports reading records from the AT_TIMESTAMP starting position, so the reset-from-timestamp operation should be possible for Kinesis streaming as well. This can be done in a follow-up change.
  3. Similar to per-partition offsets, I wonder if a per-partition timestamp would be useful, especially for streams with non-uniform or late-arriving traffic patterns on a few partitions? Let me know what you think.

@github-actions

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions Bot added the stale label Oct 24, 2023
@github-actions

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions Bot closed this Nov 21, 2023