update docs for kafka lookup extension to specify correct extension loading requirements #16929

Closed
georgew5656 wants to merge 6 commits into apache:master from georgew5656:updateKafkaLookupDocs

@georgew5656
Contributor

It's not necessary to load both druid-lookups-cached-global and druid-kafka-extraction-namespace to use Kafka lookups. In fact, if you load both in the wrong order (global lookups before Kafka lookups), Kafka lookups break (see: #3538).

Background
The druid-kafka-extraction-namespace extension has a dependency on druid-lookups-cached-global, so loading the extension includes the modules from both.

So when druid-kafka-extraction-namespace is included as an extension, both modules are loaded and the features of both extensions are available (Kafka lookups and globally cached lookups).

There is logic in ExtensionsLoader.tryAdd that checks whether a module has already been loaded during initialization and skips it if it already has been loaded.

This is a problem when both druid-kafka-extraction-namespace and druid-lookups-cached-global are specified because they both load NamespaceExtractionModule.

If druid-kafka-extraction-namespace is specified first, both NamespaceExtractionModule and KafkaExtractionNamespaceModule are loaded by the druid-kafka-extraction-namespace classloader, and the druid-lookups-cached-global classloader doesn't load anything, since NamespaceExtractionModule was already loaded. This is fine because the features of druid-lookups-cached-global are served by the NamespaceExtractionModule loaded from druid-kafka-extraction-namespace. (This is essentially the same behavior as loading only druid-kafka-extraction-namespace, which is why loading both extensions in this order works.)

If druid-lookups-cached-global is specified first, NamespaceExtractionModule is loaded by the druid-lookups-cached-global classloader. The druid-kafka-extraction-namespace classloader then loads only KafkaExtractionNamespaceModule, because NamespaceExtractionModule has already been loaded. This is a problem because Kafka lookups rely on classes bound in NamespaceExtractionModule that they can't access (NamespaceExtractionModule is bound only in the druid-lookups-cached-global classloader). This is the cause of the linked bug.
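
The two load orders described above can be modeled with a small, self-contained Java sketch. This is not Druid's actual `ExtensionsLoader` code: the class `ExtensionLoadOrderSketch`, the `load` and `kafkaLookupsUsable` helpers, and the module table are all hypothetical, and the "usable" check is a simplification that assumes Kafka lookups work only when both modules end up owned by the same extension classloader.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the behavior described in this PR.
// It is NOT Druid's ExtensionsLoader; it only mimics "skip a module that
// has already been loaded" and tracks which extension loaded each module.
final class ExtensionLoadOrderSketch {
    static final String KAFKA_EXT = "druid-kafka-extraction-namespace";
    static final String GLOBAL_EXT = "druid-lookups-cached-global";

    // Assumed module contents of each extension, taken from the PR description.
    static final Map<String, List<String>> MODULES_BY_EXTENSION = Map.of(
        KAFKA_EXT, List.of("NamespaceExtractionModule", "KafkaExtractionNamespaceModule"),
        GLOBAL_EXT, List.of("NamespaceExtractionModule"));

    // Walks the load list in order and records the first extension that
    // provides each module; later duplicates are skipped.
    static Map<String, String> load(List<String> loadList) {
        Map<String, String> ownerByModule = new LinkedHashMap<>();
        for (String extension : loadList) {
            for (String module : MODULES_BY_EXTENSION.getOrDefault(extension, List.of())) {
                ownerByModule.putIfAbsent(module, extension);
            }
        }
        return ownerByModule;
    }

    // Simplifying assumption: Kafka lookups work only when the Kafka module and
    // NamespaceExtractionModule were loaded by the same extension classloader.
    static boolean kafkaLookupsUsable(Map<String, String> ownerByModule) {
        String kafkaOwner = ownerByModule.get("KafkaExtractionNamespaceModule");
        return kafkaOwner != null
            && kafkaOwner.equals(ownerByModule.get("NamespaceExtractionModule"));
    }

    public static void main(String[] args) {
        System.out.println(kafkaLookupsUsable(load(List.of(KAFKA_EXT))));
        System.out.println(kafkaLookupsUsable(load(List.of(KAFKA_EXT, GLOBAL_EXT))));
        System.out.println(kafkaLookupsUsable(load(List.of(GLOBAL_EXT, KAFKA_EXT))));
    }
}
```

In this model, only the last ordering (global lookups before Kafka lookups) leaves the two modules owned by different extensions, which corresponds to the failing case in the linked bug.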

Description

IMO the current behavior is a little weird (loading one extension essentially loads two extensions), but to fix this we would have to either move the shared code into Druid core or allow first-class extension dependencies. For now I think it makes sense to update the documentation to specify that you should only ever load one of the two extensions.

I have tested this behavior in a Druid cluster (loading just the Kafka extension and trying to create both a Kafka lookup and a cachedNamespace lookup).

Release note

Update documentation for enabling kafka and globally cached lookups

Key changed/added classes in this PR
  • docs/querying/kafka-extraction-namespace.md
  • docs/querying/lookups-cached-global.md

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@arunramani
Contributor

Is there a known-issues section of the Druid docs? You could also document the error and provide the same explainer there.

Contributor

@techdocsmith techdocsmith left a comment

style changes.

Comment thread docs/querying/kafka-extraction-namespace.md Outdated
Comment thread docs/querying/lookups-cached-global.md Outdated
georgew5656 and others added 3 commits August 22, 2024 12:40
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
@georgew5656
Contributor Author

I think I found a better way to fix this bug, so I'm gonna close this docs PR.
