[QTL] Move kafka-extraction-namespace to the Lookup framework. #2800
drcrallen merged 12 commits into apache:master
Conversation
actual problem, fixing

Force-pushed from a922358 to 90f4e4d
private final Object startStopLock = new Object();
private final ListeningExecutorService executorService;
private final AtomicLong doubleEventCount = new AtomicLong(0L);
Why do we maintain doubleEventCount instead of eventCount?

I was trying to have a simple way to minimize race conditions and locking; I could use a read/write lock if this isn't good enough.
Basically, the count is incremented both before and after the critical section (as opposed to just before or just after).
The crux is answering "what was the state of the map that produced this result?" when computing the cache key, which is a little tricky for a continuously mutating map.
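The before-and-after increment described above is essentially a seqlock-style version counter. Here is a minimal, hypothetical sketch of the idea (class and method names are illustrative, not the actual Druid code): a reader observing an even, unchanged counter knows no writer was mid-update, so that counter value is safe to fold into a cache key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class VersionedMap
{
  // Incremented twice per write: odd value == a write is in flight.
  private final AtomicLong doubleEventCount = new AtomicLong(0L);
  private final Map<String, String> map = new ConcurrentHashMap<>();

  public void put(String key, String value)
  {
    doubleEventCount.incrementAndGet(); // before the critical section
    map.put(key, value);
    doubleEventCount.incrementAndGet(); // after the critical section
  }

  // Returns a version usable in a cache key, or -1 if a writer was active
  // (odd count, or the count changed between the two reads).
  public long stableVersion()
  {
    final long before = doubleEventCount.get();
    final long after = doubleEventCount.get();
    return (before == after && (before & 1L) == 0L) ? before : -1L;
  }
}
```

Two reads of the counter bracket the version check, mirroring how the two increments bracket the write.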
Could you summarize all the changes in this PR in the description?
|`druid.query.rename.kafka.properties`|A JSON map of Kafka consumer properties. See below for special properties.|See below|

The following describes the handling of Kafka consumer properties in `druid.query.rename.kafka.properties`.
The consumer properties `group.id` and `auto.offset.reset` CANNOT be set in `kafkaProperties`, as they are set by the extension to `UUID.randomUUID().toString()` and `smallest` respectively.
I am curious, what is the implication of this constraint?

`auto.offset.reset` set to `smallest` means "read all the data available in the topic"; otherwise two different servers could replay different changelogs.
A random `group.id` means every instance is a unique consumer, so they should be accounted for as different consumers.
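The constraint above can be sketched as a small helper that overwrites the two reserved properties no matter what the user supplied. This is a hypothetical illustration (the builder class and method are not the actual extension code; the two property names and values are from the discussion above):

```java
import java.util.Properties;
import java.util.UUID;

public class KafkaConsumerPropertiesBuilder
{
  public static Properties build(Properties userProperties)
  {
    final Properties props = new Properties();
    props.putAll(userProperties);
    // Every instance gets a unique consumer group, so each server reads the
    // full topic instead of sharing partitions with other servers.
    props.setProperty("group.id", UUID.randomUUID().toString());
    // Replay the entire changelog from the beginning so all servers
    // converge on the same map state.
    props.setProperty("auto.offset.reset", "smallest");
    return props;
  }
}
```

Any user-supplied values for these two keys are silently replaced, which is why documenting the constraint matters.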
Merge problems from introspection PR, fixing
Force-pushed from b9b1685 to 4ca0c26
@b-slim I'm trying to hammer out some tests to prevent raciness, but otherwise this should be done.
private final String factoryId = UUID.randomUUID().toString();
private final AtomicReference<Map<String, String>> mapRef = new AtomicReference<>(null);
private AtomicBoolean started = new AtomicBoolean(false);

Can we use this for the startStop lock as well?

That is possible, yes; changed.
@drcrallen can you explain how the caching mechanism works?

@b-slim this impl fills the cache with all entries without eviction.

@b-slim this does not change the caching mechanism from the prior impl; I don't understand why caching would be a blocker for this PR.

@drcrallen I know that; my comment is to help the user understand how things work by adding it to the docs. IMHO it is a serious limitation worth mentioning.

@drcrallen I am not blocking it; just update the docs to reflect the limitation.
👍 after docs and squash.

Will add docs very shortly

@b-slim added

https://travis-ci.org/druid-io/druid/jobs/126506259 died in

👍, LGTM
The major changes in this PR are as follows:
The kafka extraction namespace now works off the lookup extractor factory framework. It still uses the caching mechanisms in the main lookup extension, but uses a random UUID as the cache key instead of trying to keep some sort of relationship to the lookup name (which is enforced by the Lookup framework in core).
This means the kafka-extraction-namespace must go through the main lookup framework.
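The UUID-keyed cache scheme described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual Druid lookup extension code: the point is that each factory instance registers under a random `factoryId`, so the shared cache never has to track lookup names (which core's Lookup framework owns).

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class UuidKeyedCache
{
  // Entries are keyed by a random per-factory UUID, not by lookup name.
  private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();

  // Registers a map of lookup entries and returns the generated cache key.
  public String register(Map<String, String> entries)
  {
    final String factoryId = UUID.randomUUID().toString();
    cache.put(factoryId, entries);
    return factoryId;
  }

  public Map<String, String> get(String factoryId)
  {
    return cache.get(factoryId);
  }

  // Called when the factory is closed; the entry simply disappears.
  public void unregister(String factoryId)
  {
    cache.remove(factoryId);
  }
}
```

Because the key is random, renaming or replacing a lookup just creates a new factory with a fresh UUID; the old entry is dropped when the old factory is unregistered.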