Skip to content

Fix NPE while loading lookups from empty JDBC source#16307

Merged
kfaraz merged 3 commits intoapache:masterfrom
Akshat-Jain:fix-npe-for-empty-jdbc-lookup-source
Apr 18, 2024
Merged

Fix NPE while loading lookups from empty JDBC source#16307
kfaraz merged 3 commits intoapache:masterfrom
Akshat-Jain:fix-npe-for-empty-jdbc-lookup-source

Conversation

@Akshat-Jain
Copy link
Copy Markdown
Contributor

Description

Currently, when we add a lookup with the lookupExtractorFactory's type as cachedNamespace configured with a JDBC source, we run into NPE while loading the lookup if the configured JDBC source table has no rows.

Sample NPE logs:

2024-04-17T09:44:32,936 ERROR [NamespaceExtractionCacheManager-0] org.apache.druid.server.lookup.namespace.cache.CacheScheduler - Failed to update namespace [JdbcExtractionNamespace{connectorConfig=DbConnectorConfig{createTables=true, connectURI='jdbc:postgresql://localhost:5432/druid', user='druid', passwordProvider=org.apache.druid.metadata.DefaultPasswordProvider, dbcpProperties=null}, table='pglookuptable1', keyColumn='country_id', valueColumn='country_name', tsColumn='timeColumn', filter='null', pollPeriod=PT1M, jitterSeconds=0, loadTimeoutSeconds=120, maxHeapPercentage=10}] : org.apache.druid.server.lookup.namespace.cache.CacheScheduler$EntryImpl@64c4522
java.lang.NullPointerException: null
	at org.apache.druid.server.lookup.namespace.JdbcCacheGenerator.lastUpdates(JdbcCacheGenerator.java:219) ~[?:?]
	at org.apache.druid.server.lookup.namespace.JdbcCacheGenerator.generateCache(JdbcCacheGenerator.java:72) ~[?:?]
	at org.apache.druid.server.lookup.namespace.JdbcCacheGenerator.generateCache(JdbcCacheGenerator.java:48) ~[?:?]
	at org.apache.druid.server.lookup.namespace.cache.CacheScheduler$EntryImpl.tryUpdateCache(CacheScheduler.java:236) ~[?:?]
	at org.apache.druid.server.lookup.namespace.cache.CacheScheduler$EntryImpl.updateCache(CacheScheduler.java:208) ~[?:?]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.base/java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305) [?:?]
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java) [?:?]
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:829) [?:?]

This PR handles the NPE, and adds a test for validating the corresponding flow.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Copy Markdown
Contributor

@cryptoe cryptoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes lgtm.
Thanks for your first pr in apache druid.
🚀

.first()
);
if (update == null) {
LOG.info("Lookup table is empty. No rows returned for the query: %s", query);
Copy link
Copy Markdown
Contributor

@cryptoe cryptoe Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets add the look up name here.

Copy link
Copy Markdown
Contributor Author

@Akshat-Jain Akshat-Jain Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cryptoe The cache scheduler layer doesn't seem to have lookup name information. It only has the JdbcExtractionNamespace level information - which is what gets logged everywhere else in the cache scheduler layer as well.

Technically we can add an extra method param to send lookup name info all the way down to the cache layer through the whole method chain, but that seems unnecessary to me. That would also then require us to change logging everywhere else in the cache layer, just to ensure consistency - so it would end up as a much wider change.

Thoughts?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do this:

Suggested change
LOG.info("Lookup table is empty. No rows returned for the query: %s", query);
LOG.info("Lookup table[%s] is empty. No rows returned for the query[%s].", table, query);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kfaraz Have made the suggested change.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't INFO logging be too noisy, given that this will log every time we run a check (to see whether to update the table or not)? (By default, it is set to not poll, so the default behaviour is cool)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LakshSingla This is going to be logged only if the table is empty, so I think it should be fine.

@kfaraz kfaraz merged commit 79e48c6 into apache:master Apr 18, 2024
@kfaraz
Copy link
Copy Markdown
Contributor

kfaraz commented Apr 18, 2024

Thanks for the fix, @Akshat-Jain !

@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants