KAFKA-8340, KAFKA-8819: Use PluginClassLoader while statically initializing plugins#7315
KAFKA-8340, KAFKA-8819: Use PluginClassLoader while statically initializing plugins#7315rhauch merged 36 commits intoapache:trunkfrom
Conversation
* Add test/resources/test-plugins with source for test plugins * Add TestPlugins for building these test plugins for unit tests * Change Plugins PluginsTest and DelegatingClassLoaderTest to include TestPlugins * Add test cases for addressing isolation bugs * Change PluginsTest to isolate the Plugins instance Signed-off-by: Greg Harris <gregh@confluent.io>
1cfe9a7 to
a9790be
Compare
Signed-off-by: Greg Harris <gregh@confluent.io>
a9790be to
76fa4f5
Compare
|
cc @rhauch @kkonstantine @C0urante for review |
C0urante
left a comment
There was a problem hiding this comment.
Great work, @gharris1727! Don't let the number of comments fool you, this looks pretty good. Mostly have questions/suggestions about testing methodology.
|
Also, can you update the description to include a list of feature branches that this change should be backported to? |
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
|
This patch breaks the build entirely, it can't even complete most of the times: |
I don't know what branches this can be backported to, how can I tell? |
Which AK versions are affected by the bug that this PR aims to address? If I recall correctly, it's been present since plugin isolation was added to Kafka; if that's the case, then we'll want to backport these changes to every feature branch that includes plugin isolation. |
Signed-off-by: Greg Harris <gregh@confluent.io>
…plugin Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
C0urante
left a comment
There was a problem hiding this comment.
Few more comments on the recent changes
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
This reverts commit bfeb20f.
|
@gharris1727 none of the failures with the 51c6f57 build came from Connect; it's pretty unlikely the |
C0urante
left a comment
There was a problem hiding this comment.
Looking really good, just one more comment from the latest changes.
|
I think I've contributed as much as I can as a reviewer and, apart from the following open items, LGTM:
Nice job, @gharris1727! |
kkonstantine
left a comment
There was a problem hiding this comment.
Thanks @gharris1727
I did a first pass during which I focused only on the actual code (no testing yet) and I've commented on a few things on the periphery of your fix. Addressing those first might help focusing on the actual changes.
I'll follow up with a second pass on the core of your fix and I'll also review the new tests, that seem quite valuable overall.
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
…ted. Signed-off-by: Greg Harris <gregh@confluent.io>
Signed-off-by: Greg Harris <gregh@confluent.io>
|
I found another bug where all of the plugins that are found with a I think this brings us up to 6 additional call sites where we swap out the |
kkonstantine
left a comment
There was a problem hiding this comment.
Lot's of commits after my first reviews. I'm retracting my approval and will have to take another look
|
@gharris1727 wrote:
I agree. Let's not refactor this and focus on a minimal-sized fix that we can backport. Also note that we started using Java 8 in AK 2.0, so we need to avoid using lambdas and function refs in the PR unless we want to create an alternative PR for 1.x and 0.x branches. |
kkonstantine
left a comment
There was a problem hiding this comment.
Performed another pass after the new commits were added.
This LGTM now.
rhauch
left a comment
There was a problem hiding this comment.
Thanks, @gharris1727. This is good stuff and quite thorough. I have a few suggestions for this PR for merging into trunk and 2.4, but I think it's going to be challenging to backport this PR as-is beyond 2.0 because of some of the features that were introduced since 2.0 and because we use Java 8 only starting with 2.0.
Having said that, let's focus on the things that we can fix immediately, try to backport this as far back as feasible, and then deal with earlier branches via separate PRs targeted to specific branches (which we'll try to backport as far back as possible, etc.).
| log.debug("Configuring the header converter with configuration keys:{}{}", System.lineSeparator(), converterConfig.keySet()); | ||
| plugin.configure(converterConfig); | ||
|
|
||
| HeaderConverter plugin; |
There was a problem hiding this comment.
HeaderConverter and this method don't exist prior to AK 1.1.
| Map<String, Object> configProviderConfig = config.originalsWithPrefix(configPrefix); | ||
| plugin.configure(configProviderConfig); | ||
|
|
||
| ConfigProvider plugin; |
There was a problem hiding this comment.
ConfigProvider was introduced in AK 2.0, so this change can't be backported to earlier branches.
…lass Signed-off-by: Greg Harris <gregh@confluent.io>
rhauch
left a comment
There was a problem hiding this comment.
LGTM. Thanks for the quick turnaround, @gharris1727! This is an excellent PR.
…lizing plugins (#7315) Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed. KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues. Author: Greg Harris <gregh@confluent.io> Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
…lizing plugins (#7315) Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed. KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues. Author: Greg Harris <gregh@confluent.io> Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
…lizing plugins (#7315) Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed. KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues. Author: Greg Harris <gregh@confluent.io> Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
…lizing plugins (#7315) Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed. KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues. Author: Greg Harris <gregh@confluent.io> Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
…lizing plugins (#7315) Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed. KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues. Author: Greg Harris <gregh@confluent.io> Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
* KAFKA-8649: send latest commonly supported version in assignment (apache#7425) Reviewers: Matthias J. Sax <matthias@confluent.io> * KAFKA-8262, KAFKA-8263: Fix flaky test `MetricsIntegrationTest` (apache#6922) (apache#7457) - Timeout occurred due to initial slow rebalancing. - Added code to wait until `KafkaStreams` instance is in state RUNNING to check registration of metrics and in state NOT_RUNNING to check deregistration of metrics. - I removed all other wait conditions, because they are not needed if `KafkaStreams` instance is in the right state. Reviewers: Guozhang Wang <wangguoz@gmail.com> * MINOR: clarify wording around fault-tolerant state stores (apache#7510) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io> * MINOR: Port changes from trunk for test stability to 2.3 branch (apache#7424) Reviewer: Matthias J. Sax <matthias@confluent.io> * KAFKA-9014: Fix AssertionError when SourceTask.poll returns an empty list (apache#7491) Author: Konstantine Karantasis <konstantine@confluent.io> Reviewer: Randall Hauch <rhauch@gmail.com> * KAFKA-8945/KAFKA-8947: Fix bugs in Connect REST extension API (apache#7392) Fix bug in Connect REST extension API caused by invalid constructor parameter validation, and update integration test to play nicely with Jenkins Fix instantiation of TaskState objects by Connect framework. Author: Chris Egerton <chrise@confluent.io> Reviewers: Magesh Nandakumar <mageshn@confluent.io>, Randall Hauch <rhauch@gmail.com> * KAFKA-8340, KAFKA-8819: Use PluginClassLoader while statically initializing plugins (apache#7315) Added plugin isolation unit tests for various scenarios, with a `TestPlugins` class that compiles and builds multiple test plugins without them being on the classpath and verifies that the Plugins and DelegatingClassLoader behave properly. These initially failed for several cases, but now pass since the issues have been fixed. KAFKA-8340 and KAFKA-8819 are closely related, and this fix corrects the problems reported in both issues. Author: Greg Harris <gregh@confluent.io> Reviewers: Chris Egerton <chrise@confluent.io>, Magesh Nandakumar <mageshn@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com> * MINOR: log reason for fatal error in locking state dir (apache#7534) Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io> * KAKFA-8950: Fix KafkaConsumer Fetcher breaking on concurrent disconnect (apache#7511) The KafkaConsumer Fetcher can sometimes get into an invalid state where it believes that there are ongoing fetch requests, but in fact there are none. This may be caused by the heartbeat thread concurrently handling a disconnection event just after the fetcher thread submits a request which would cause the Fetcher to enter an invalid state where it believes it has ongoing requests to the disconnected node but in fact it does not. This is due to a thread safety issue in the Fetcher where it was possible for the ordering of the modifications to the nodesWithPendingFetchRequests to be incorrect - the Fetcher was adding it after the listener had already been invoked, which would mean that pending node never gets removed again. This PR addresses that thread safety issue by ensuring that the pending node is added to the nodesWithPendingFetchRequests before the listener is added to the future, ensuring the finally block is called after the node is added. Reviewers: Tom Lee, Jason Gustafson <jason@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com> * Fix bug in AssignmentInfo#encode and add additional logging (apache#7545) Same as apache#7537 but targeted at 2.3 for cherry-pick Reviewers: Bill Bejeck <bbejeck@gmail.com> * Bump version to 2.3.1 * Update versions to 2.3.2-SNAPSHOT * HOTFIX: fix bug in VP test where it greps for the wrong log message (apache#7643) Reviewers: Guozhang Wang <wangguoz@gmail.com>
During plugin class loading, instance creation, and configuration
change the current thread's context ClassLoader
to the PluginClassLoader. This prevents issues where
static initialization loads the wrong resources, or
fails to find resources that are inside the plugin.
This change impacts the following classes:
Plugins, DelegatingClassLoader, Worker.
In order to verify the original bug and verify the fix,
a group of test plugins for collecting data are included in this commit:
AlwaysThrowExceptionshows that exceptions are capturedSamplingis for basic initialization and field aliasingSamplingConvertertakes samples during converter method callsSamplingHeaderConvertertakes samples during header converter method callsSamplingConfigProvidertakes samples during config provider method callsSamplingConfigurabletakes samples during configurable method callsServiceLoaderPluginsamples the active classloader for service loaded classesThese plugins are stored in the resources directory.
Each plugin has it's own source tree, and can have arbitrary code.
The plugins are compiled into class files in a separate temporary directory.
They are then packaged into a jar in a temporary file
which is deleted after the process exits.
This jar is then exposed for use in
plugin.path, and testscan access to the plugin class names through constants.
These changes will be backported to all versions of AK which have plugin isolation (introduced in
0.11.0.0):0.11.0.41.0.31.1.22.0.22.1.22.2.22.3.12.4Committer Checklist (excluded from commit message)