@lhotari
Member

@lhotari lhotari commented Feb 19, 2021

Fixes #9572

Motivation

See #9572 for the problem description. The total size of the Pulsar IO files is currently 1952MB. The changes in #9246 made the problem worse, but #9246 is not the root cause: the Pulsar IO connectors already contained unnecessary files before that change. This PR doesn't intend to fix the leaked dependencies introduced by #9246. There's a separate issue, #9640, for handling the regression that #9246 caused.

It is safe to exclude all dependencies that are part of Pulsar Functions Worker's system classloader. The reason for this is that classloaders use parent-first lookups (by default, and also in Pulsar Functions Worker). The simplest way to do this exclusion is to use the provided scope when applicable.
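
To illustrate the parent-first lookup mentioned above, here is a minimal Java sketch. It is not Pulsar code, and the class name `ParentFirstDemo` is made up for illustration. A child `URLClassLoader` is handed the same code location as its parent (when that location is known), yet the class still resolves through the parent. A copy bundled in the child would therefore go unused under this delegation model.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.security.CodeSource;

public class ParentFirstDemo {

    /** Returns true if a class visible to both loaders is resolved by the parent. */
    static boolean resolvedByParent() {
        ClassLoader parent = ParentFirstDemo.class.getClassLoader();
        // Give the child the location this class was loaded from (if known),
        // so the child could in principle define its own copy.
        CodeSource source = ParentFirstDemo.class.getProtectionDomain().getCodeSource();
        URL[] urls = (source != null && source.getLocation() != null)
                ? new URL[] {source.getLocation()}
                : new URL[0];
        try (URLClassLoader child = new URLClassLoader(urls, parent)) {
            Class<?> loaded = child.loadClass(ParentFirstDemo.class.getName());
            // With parent-first delegation this is the Class object the parent defined.
            return loaded == ParentFirstDemo.class;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("resolved by parent: " + resolvedByParent());
    }
}
```

The sketch only covers the default delegation of `java.net.URLClassLoader`; whether a given Pulsar runtime actually places a dependency in the parent is a separate question.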

Reducing the size of the Pulsar IO files is necessary to shrink the pulsar-test-latest-version docker image so that it becomes feasible to transfer the image over the network. The goal is to build the image once in the Pulsar CI GitHub Actions workflow and reuse it in the integration tests that depend on it.

Modifications

  • Use provided scope when applicable to exclude the dependencies which are part of Pulsar Functions Worker's system classloader.

Contributor

@eolivelli eolivelli left a comment

Looks good to me as far as CI passes

@merlimat
Contributor

merlimat commented Feb 19, 2021

Reducing the size of the Pulsar IO files is necessary to shrink the pulsar-test-latest-version docker image so that it becomes feasible to transfer the image over the network. The goal is to build the image once in the Pulsar CI GitHub Actions workflow and reuse it in the integration tests that depend on it.

Notwithstanding that reducing size is good, one thing we discussed yesterday, around moving connectors to a separate repo, was to have the integration tests just build a minimal image (not like pulsar-all) and use that for tests (as you already did in #9625). That image can be cached within GH Actions because it won't contain all the connectors.

@lhotari
Member Author

lhotari commented Feb 22, 2021

/pulsarbot run-failure-checks

Contributor

@freeznet freeznet left a comment

LGTM

@lhotari
Member Author

lhotari commented Feb 23, 2021

/pulsarbot run-failure-checks

@lhotari lhotari force-pushed the lh-reduce-pulsar-io-connector-size branch from 7d27d52 to 4cf685e Compare February 23, 2021 17:17
@lhotari
Member Author

lhotari commented Feb 23, 2021

/pulsarbot run-failure-checks

Contributor

@jerrypeng jerrypeng left a comment

It is safe to exclude all dependencies that are part of Pulsar Functions Worker's system classloader.

This is true for ThreadRuntime currently but not for ProcessRuntime / Kubernetes runtime. ThreadRuntime is also not doing the right thing, since ideally user-provided code is isolated in a completely separate classloader that only shares common interfaces with the framework classloader.

While I am all for decreasing the sizes of connectors, this will have other consequences.

@lhotari
Member Author

lhotari commented Feb 24, 2021

/pulsarbot run-failure-checks

@lhotari
Member Author

lhotari commented Feb 24, 2021

This is true for ThreadRuntime currently but not for ProcessRuntime / Kubernetes runtime. ThreadRuntime is also not doing the right thing, since ideally user-provided code is isolated in a completely separate classloader that only shares common interfaces with the framework classloader.

@jerrypeng Thanks for pointing this out. I spent a few hours investigating this to get a better understanding of how the classloaders are configured in Pulsar Functions.

I found the documentation of the classloader hierarchy for Process Runtime in JavaInstanceMain class:

* This is the initial class that gets called when starting a Java Function instance.
* Multiple class loaders are used to separate function instance dependencies from user code dependencies
* This class will create three classloaders:
* 1. The root classloader that will share interfaces between the function instance
* classloader and user code classloader. This classloader will contain the following dependencies
* - pulsar-functions-api
* - pulsar-client-api
* - log4j-slf4j-impl
* - slf4j-api
* - log4j-core
* - log4j-api
*
* 2. The Function instance classloader, a child of the root classloader, that loads all pulsar broker/worker dependencies
* 3. The user code classloader, a child of the root classloader, that loads all user code dependencies

There's also logic that impacts how the classpaths get configured:

if (StringUtils.isNotEmpty(functionInstanceClassPath)) {
    args.add(String.format("-D%s=%s", FUNCTIONS_INSTANCE_CLASSPATH, functionInstanceClassPath));
} else {
    // add complete classpath for broker/worker so that the function instance can load
    // the functions instance dependencies separately from user code dependencies
    String systemFunctionInstanceClasspath = System.getProperty(FUNCTIONS_INSTANCE_CLASSPATH);
    if (systemFunctionInstanceClasspath == null) {
        log.warn("Property {} is not set. Falling back to using classpath of current JVM", FUNCTIONS_INSTANCE_CLASSPATH);
        systemFunctionInstanceClasspath = System.getProperty("java.class.path");
    }
    args.add(String.format("-D%s=%s", FUNCTIONS_INSTANCE_CLASSPATH, systemFunctionInstanceClasspath));
}

By default functionInstanceClassPath is always null/empty, for both Process Runtime and also for Kubernetes Runtime. Therefore pulsar.functions.instance.classpath will always be set to System.getProperty("java.class.path") by default.

Because pulsar.functions.instance.classpath will always contain the classpath of the process that launches the function processes, the behavior is the same for the thread, process and kubernetes runtimes: the Pulsar provided dependencies will always get loaded from the parent classloader of the "user classloader". I don't see any actual classloading differences between the thread runtime, process runtime and kubernetes runtime when the default configuration is used (functionInstanceClassPath is left unconfigured in conf/functions_worker.yml). I might be missing something, so I'd be happy to learn more about this area in Pulsar.
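
The fallback order in the snippet quoted above can be condensed into a small standalone sketch. This is not the worker's actual code: the class `InstanceClasspathResolver` and its `resolve` method are hypothetical, and only the property name `pulsar.functions.instance.classpath` and the order of the fallbacks are taken from the discussion.

```java
import java.util.Properties;

public class InstanceClasspathResolver {

    // Property name as mentioned in the discussion above.
    static final String FUNCTIONS_INSTANCE_CLASSPATH = "pulsar.functions.instance.classpath";

    /**
     * Hypothetical helper mirroring the fallback order:
     * 1. an explicitly configured functionInstanceClassPath,
     * 2. the pulsar.functions.instance.classpath system property,
     * 3. java.class.path of the launching JVM.
     */
    static String resolve(String functionInstanceClassPath, Properties systemProperties) {
        if (functionInstanceClassPath != null && !functionInstanceClassPath.isEmpty()) {
            return functionInstanceClassPath;
        }
        String fromProperty = systemProperties.getProperty(FUNCTIONS_INSTANCE_CLASSPATH);
        if (fromProperty != null) {
            return fromProperty;
        }
        return systemProperties.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        // With nothing configured, the result depends on the launching JVM's properties.
        System.out.println(resolve(null, System.getProperties()));
    }
}
```

Passing the properties in as a parameter keeps the sketch easy to exercise without touching the real JVM settings.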

While I am all for decreasing the sizes of connectors, this will have other consequences.

There are integration tests which test both the process runtime and the thread runtime for some connectors. It seems like a major gap that not all connectors are tested with the thread runtime and the process runtime. You are right that there will be unknown consequences as long as there isn't test coverage for both scenarios.

As explained in the previous observations of the thread vs. process/k8s runtime (no actual difference in classloading with default settings), the changes made in this PR are safe in general.
@jerrypeng WDYT?

Use provided scope when applicable
org.apache.logging.slf4j.Log4jLogger cannot be cast to ch.qos.logback.classic.Logger
@lhotari lhotari force-pushed the lh-reduce-pulsar-io-connector-size branch from 4cf685e to 04f674c Compare February 24, 2021 18:04
@lhotari
Member Author

lhotari commented Feb 24, 2021

btw. I noticed in some local tests that the pulsar-io/flume connector is most likely broken. There aren't any integration tests to verify its behavior. The Flume libraries depend on an older Avro version which isn't compatible with the one used in Pulsar. That's why I reverted some changes I had made in pulsar-io/flume/pom.xml to mark Avro dependencies as provided.

@lhotari
Member Author

lhotari commented Feb 25, 2021

/pulsarbot run-failure-checks

@jerrypeng
Contributor

jerrypeng commented Mar 4, 2021

@lhotari functionInstanceClsLoader defined in JavaInstanceMain:

https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L87

Does NOT load the user's function JARs. It is supposed to load the Pulsar Function framework JARs, thus it does load all of the Pulsar platform dependencies.

The root classloader that contains only the interfaces through which the user-defined function interacts with the framework is defined here:

https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L99

and passed into the JavaInstanceStarter here

https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L99

The root classloader is subsequently passed into the ThreadRuntimeFactory here:

https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/JavaInstanceStarter.java#L213

The ThreadRuntime will use the root classloader, which only contains those few interfaces, as the parent classloader of the user code function classloader. Thus, the classloader that loads the user function JARs will not contain all the dependencies of Pulsar. By the way, ThreadRuntime is used by both ProcessRuntime and KubernetesRuntime underneath, but don't confuse this with actually configuring the worker to use ThreadRuntime.

When ThreadRuntime is configured as the runtime to be used by the worker, this root classloader will not be set and will default to Thread.currentThread().getContextClassLoader(), which contains all of Pulsar's dependencies:

https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/thread/ThreadRuntimeFactory.java#L91
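
The parent selection described here can be sketched as follows. This is an illustration under assumptions, not the `ThreadRuntimeFactory` source: the class `UserCodeLoaderParent` and both of its methods are hypothetical, and only the fallback to `Thread.currentThread().getContextClassLoader()` is taken from the comment above. A URL-less loader with the bootstrap loader as its parent stands in for the interface-only root classloader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;

public class UserCodeLoaderParent {

    /** Hypothetical helper: picks the parent for the user code classloader. */
    static ClassLoader chooseParent(ClassLoader rootClassLoader) {
        // A narrow root loader is used when one is supplied; otherwise the
        // context classloader (with everything on the worker classpath) is used.
        return rootClassLoader != null
                ? rootClassLoader
                : Thread.currentThread().getContextClassLoader();
    }

    /** Returns true if className is loadable through a URL-less child of the given parent. */
    static boolean visibleToUserCode(ClassLoader parent, String className) {
        try (URLClassLoader userCode = new URLClassLoader(new URL[0], parent)) {
            userCode.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String platformClass = UserCodeLoaderParent.class.getName();
        // Stand-in for the interface-only root loader: no URLs, bootstrap parent.
        ClassLoader narrowRoot = new URLClassLoader(new URL[0], null);
        System.out.println("narrow root sees platform class: "
                + visibleToUserCode(chooseParent(narrowRoot), platformClass));
        System.out.println("context loader sees platform class: "
                + visibleToUserCode(chooseParent(null), platformClass));
    }
}
```

With the narrow root, application classes are invisible to the child loader, so a connector would have to bundle whatever it needs. With the context classloader as parent, the same classes resolve from the parent.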

However, this is not the case for ProcessRuntime and KubernetesRuntime, and this is the difference between ThreadRuntime and the other two runtimes. Again, this is something that ThreadRuntime is doing incorrectly, but I / we haven't gotten around to fixing it.

In general, for a platform that supports third party plugins or executes user submitted code, it is best if classpaths are isolated and transitive dependencies are not shared across platform and user code. Sharing them will cause a lot of dependency versioning issues and limit which dependency versions user submitted code can use.

@lhotari I appreciate your effort to solve the issues with testing and to understand the Pulsar Function code. Perhaps we can find another solution here? Looking forward to working with you in the Pulsar community!

@lhotari
Member Author

lhotari commented Mar 4, 2021

@jerrypeng Thanks for your reply. I'll take a closer look tomorrow and go through the details.

As long as we find a solution to reduce the Pulsar IO Connectors size, I'm fine with that.
It seems that there should be a way to get rid of the dependencies that are Pulsar dependencies. The total size of Pulsar Connectors is about 2GB at the moment and that makes the pulsar-all image huge. StreamNative's streamnative/pulsar-all:2.8.0-rc-202103022223 is 2.92GB (compressed image size, I assume).
Do you have another solution in mind?

@jerrypeng
Contributor

jerrypeng commented Mar 4, 2021

Also because this PR:

#9246

That separated pulsar-client-admin into two modules, pulsar-client-admin and pulsar-client-admin-api. The pulsar-client-admin-api module pulls in a bunch of dependencies, which doesn't make sense for an "api" module. The java-instance that gets loaded by the root classloader should only contain those few interfaces but now contains

org.apache.pulsar:pulsar-functions-runtime-all:jar:2.7.0
[INFO] +- org.apache.pulsar:pulsar-io-core:jar:2.7.0:compile
[INFO] | - (org.apache.pulsar:pulsar-functions-api:jar:2.7.0:compile - omitted for duplicate)
[INFO] +- org.apache.pulsar:pulsar-functions-api:jar:2.7.0:compile
[INFO] | +- (org.slf4j:slf4j-api:jar:1.7.25:compile - version managed from 1.7.30; omitted for duplicate)
[INFO] | +- (org.apache.pulsar:pulsar-client-api:jar:2.7.0:compile - omitted for duplicate)
[INFO] | - org.apache.pulsar:pulsar-client-admin-api:jar:2.7.0:compile
[INFO] | +- org.apache.pulsar:pulsar-common:jar:2.7.0:compile
[INFO] | | +- (org.apache.pulsar:pulsar-client-api:jar:2.7.0:compile - omitted for duplicate)
[INFO] | | +- io.swagger:swagger-annotations:jar:1.6.2:compile
[INFO] | | +- (org.slf4j:slf4j-api:jar:1.7.25:compile - version managed from 1.7.30; omitted for duplicate)
[INFO] | | +- com.fasterxml.jackson.core:jackson-databind:jar:2.11.1:compile (version managed from 2.9.9.3)
[INFO] | | | +- com.fasterxml.jackson.core:jackson-annotations:jar:2.11.1:compile
[INFO] | | | - com.fasterxml.jackson.core:jackson-core:jar:2.11.1:compile (version managed from 2.9.9)
[INFO] | | +- com.google.guava:guava:jar:30.1-jre:compile (version managed from 29.0-android)
[INFO] | | | +- com.google.guava:failureaccess:jar:1.0.1:compile
[INFO] | | | +- com.google.guava:listenablefuture:jar:9999.0-empty-to-avoid-conflict-with-guava:compile
[INFO] | | | +- com.google.code.findbugs:jsr305:jar:3.0.2:compile
[INFO] | | | +- org.checkerframework:checker-qual:jar:3.5.0:compile
[INFO] | | | +- com.google.errorprone:error_prone_annotations:jar:2.3.4:compile
[INFO] | | | - com.google.j2objc:j2objc-annotations:jar:1.3:compile
[INFO] | | +- io.netty:netty-handler:jar:4.1.51.Final:compile
[INFO] | | | +- io.netty:netty-common:jar:4.1.51.Final:compile
[INFO] | | | +- io.netty:netty-resolver:jar:4.1.51.Final:compile
[INFO] | | | | - (io.netty:netty-common:jar:4.1.51.Final:compile - omitted for duplicate)
[INFO] | | | +- io.netty:netty-buffer:jar:4.1.51.Final:compile
[INFO] | | | | - (io.netty:netty-common:jar:4.1.51.Final:compile - omitted for duplicate)
[INFO] | | | +- io.netty:netty-transport:jar:4.1.51.Final:compile
[INFO] | | | | +- (io.netty:netty-common:jar:4.1.51.Final:compile - omitted for duplicate)
[INFO] | | | | +- (io.netty:netty-buffer:jar:4.1.51.Final:compile - omitted for duplicate)
...

This is also a problem. @lhotari this is why the tests pass for the connectors even though dependencies are not packaged explicitly. @sijie that PR has become very problematic for many reasons now.

@jerrypeng
Contributor

@lhotari if the primary purpose is to reduce the test image size, perhaps a simple solution would be to build separate pulsar images with different connectors or sets of connectors, and the integration tests can be modified to use the correct one when testing a connector.

@lhotari
Member Author

lhotari commented Mar 4, 2021

@lhotari if the primary purpose is to reduce the test image size, perhaps a simple solution would be to build separate pulsar images with different connectors or sets of connectors, and the integration tests can be modified to use the correct one when testing a connector.

For my CI case it's the primary purpose. @merlimat suggested writing a spec as a GH issue for what you are proposing. However that doesn't resolve the problem that Pulsar users have. The pulsar-all image size is huge: atm 3GB, but in 2.7 it's 2.25GB, which is also too large. IIRC, there's about 400MB coming from duplicate connector jars (which are Pulsar core deps) in the 2.7 pulsar-all docker image as well.

Isn't there a simple solution to get rid of the unnecessary jar files in the .nar files?

@lhotari
Member Author

lhotari commented Mar 4, 2021

Also because this PR:

#9246

...

This is also a problem. @lhotari this is why the tests pass for the connectors even though dependencies are not packaged explicitly. @sijie that PR has become very problematic for many reasons now.

#9640 has been reported about this issue.

@jerrypeng
Contributor

@lhotari thanks for filing the issue.

However that doesn't resolve the problem that Pulsar users have.

Why do we have to package all the connectors into one image even for users? I doubt a user will need to use all connectors Pulsar has to offer. Perhaps we can have a strategy in which a user can choose which connectors he or she actually needs, and those get baked into the image. And for users that just want to use pub sub, I think there is already an image that doesn't contain any connectors.

Isn't there a simple solution to get rid of the unnecessary jar files in the .nar files?

The thing is NARs are supposed to be self contained, which means all the transitive dependencies of the connector need to be packaged in it as well. There are advantages to this, as I mentioned before; the downside is that it will duplicate dependencies across all connectors. However, this allows connectors to evolve more in a vacuum rather than depend on the versions of libraries used by other connectors or Pulsar.

@lhotari
Member Author

lhotari commented Mar 5, 2021

Why do we have to package all the connectors into one image even for users? I doubt a user will need to use all connectors Pulsar has to offer. Perhaps we can have a strategy in which a user can choose which connectors he or she actually needs, and those get baked into the image. And for users that just want to use pub sub, I think there is already an image that doesn't contain any connectors.

I agree on this point.

However, because of the existing user base of pulsar-all image, it would be a breaking change from the user's perspective if support for pulsar-all is suddenly stopped without a migration path.

There is a lot of usage for pulsar-all image. For example, the pulsar-helm-chart uses the pulsar-all image in the default configuration:
https://github.com/apache/pulsar-helm-chart/blob/67818a48cb41b3e4b6d685fb598332a9d9a320bb/charts/pulsar/values.yaml#L150-L172

The thing is NARs are supposed to be self contained, which means all the transitive dependencies of the connector need to be packaged in it as well.

NARs might be supposed to be self contained, but the technical implementation doesn't support this. It seems that the Pulsar dependencies are in the parent classloader of the NAR classloader in all configurations of Pulsar Functions: thread, process and k8s runtime. Because of this, any library that is part of Pulsar dependencies will get loaded from the parent classloader and not from the jar files embedded in the NAR file. That is the reason why adding any classes that are part of Pulsar dependencies is unnecessary duplication in the Pulsar connector .nar files.
This might not have been the original intention of the Pulsar Functions design.

I'll spend some more time to verify with experimenting and testing whether it is the case or not.

@jerrypeng
Contributor

jerrypeng commented Mar 5, 2021

It seems that the Pulsar dependencies are in the parent classloader of the NAR classloader in all configurations of Pulsar Functions: thread, process and k8s runtime. Because of this, any library that is part of Pulsar dependencies will get loaded from the parent classloader and not from the jar files embedded in the NAR file.

Again, this is not true, or not supposed to be true. I wrote and designed the classloading mechanism in functions, so at least I know the intention is for user code to be loaded in a separate classloader that doesn't contain all of Pulsar's dependencies. However, currently, because of issues such as #9246, the classloader for user code may be polluted with unnecessary deps.

@jerrypeng
Contributor

However, because of the existing user base of pulsar-all image, it would be a breaking change from the user's perspective if support for pulsar-all is suddenly stopped without a migration path.

There is a lot of usage for pulsar-all image. For example, the pulsar-helm-chart uses the pulsar-all image in the default configuration:
https://github.com/apache/pulsar-helm-chart/blob/67818a48cb41b3e4b6d685fb598332a9d9a320bb/charts/pulsar/values.yaml#L150-L172

We don't have to stop building the pulsar-all image that contains all the connectors. At the same time, we also don't need to keep using that image for our integration tests. We can also introduce different flavors of images that contain different sets of connectors.

@merlimat
Contributor

merlimat commented Mar 5, 2021

I already checked in #9807 and #9808, which include only the connectors that are actually tested and squeezed out some extra 100s of MB

@merlimat
Contributor

merlimat commented Mar 5, 2021

One more: #9817

Member

@sijie sijie left a comment

Apologies for being late in reviewing this!

@lhotari I liked the motivation. I am also good with making pulsar-io-core a provided dependency. However, I see you also turned some of the dependencies shared between the Pulsar runtime and connector implementation to the provided scope. I am not comfortable with that change, because even if they are the same dependency, they will be loaded via different class loaders. I am not sure turning them into provided will not break any connector implementation.

If we really want to go down this approach, we should make changes to one connector first and make sure we have extensive integration tests to make sure we don't break the connector, before we make changes to all connectors.

@lhotari lhotari marked this pull request as draft March 8, 2021 22:14
@lhotari
Member Author

lhotari commented Mar 9, 2021

Again, this is not true, or not supposed to be true. I wrote and designed the classloading mechanism in functions, so at least I know the intention is for user code to be loaded in a separate classloader that doesn't contain all of Pulsar's dependencies. However, currently, because of issues such as #9246, the classloader for user code may be polluted with unnecessary deps.

@jerrypeng You are right that the problem is caused by PR #9246 (open issue #9640). I ran some tests on a k8s cluster with Pulsar 2.7.0. There, the classloaders are properly isolated and contain only the minimum dependencies.

Some output from my experiment:

kubectl --namespace=pulsar exec -it pf-public-default-test-pulsar-inspector-0 -- jar tvf /pulsar/instances/java-instance.jar  |grep pom.xml
  1495 Tue Dec 01 11:25:22 UTC 2020 META-INF/maven/org.apache.pulsar/pulsar-io-core/pom.xml
  1793 Tue Dec 01 11:25:22 UTC 2020 META-INF/maven/org.apache.pulsar/pulsar-functions-api/pom.xml
  2689 Tue Dec 01 11:25:22 UTC 2020 META-INF/maven/org.apache.pulsar/pulsar-client-api/pom.xml
  3833 Thu Mar 16 16:53:42 UTC 2017 META-INF/maven/org.slf4j/slf4j-api/pom.xml
  7718 Sun Nov 19 00:48:12 UTC 2017 META-INF/maven/org.apache.logging.log4j/log4j-slf4j-impl/pom.xml
 12371 Sun Nov 19 00:48:10 UTC 2017 META-INF/maven/org.apache.logging.log4j/log4j-api/pom.xml
 20371 Sun Nov 19 00:48:10 UTC 2017 META-INF/maven/org.apache.logging.log4j/log4j-core/pom.xml
  3281 Tue Dec 01 11:25:22 UTC 2020 META-INF/maven/org.apache.pulsar/pulsar-functions-runtime-all/pom.xml
cat classloader_report.json|jq -r .classloader_report
  Classloader: org.apache.pulsar.common.nar.NarClassLoader[/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked]
  identityHashCode:1618560520
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/jackson-annotations-2.11.2.jar
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/jackson-core-2.11.2.jar
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/jackson-databind-2.11.2.jar
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/pulsar-client-api-2.7.0.jar
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/pulsar-functions-api-2.7.0.jar
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/pulsar-io-core-2.7.0.jar
  file:/tmp/pulsar-nar/pulsar-inspector-connector-1.0-SNAPSHOT.nar-unpacked/META-INF/bundled-dependencies/slf4j-api-1.7.25.jar
  Parent:
    Classloader: sun.misc.Launcher$AppClassLoader@7852e922
    identityHashCode:2018699554
    file:/pulsar/instances/java-instance.jar
    file:/pulsar/instances/deps/*
    Parent:
      Classloader: sun.misc.Launcher$ExtClassLoader@4ac68d3e
      identityHashCode:1254526270
      file:/usr/local/openjdk-8/jre/lib/ext/cldrdata.jar
      file:/usr/local/openjdk-8/jre/lib/ext/localedata.jar
      file:/usr/local/openjdk-8/jre/lib/ext/sunpkcs11.jar
      file:/usr/local/openjdk-8/jre/lib/ext/nashorn.jar
      file:/usr/local/openjdk-8/jre/lib/ext/zipfs.jar
      file:/usr/local/openjdk-8/jre/lib/ext/jaccess.jar
      file:/usr/local/openjdk-8/jre/lib/ext/dnsns.jar
      file:/usr/local/openjdk-8/jre/lib/ext/sunec.jar
      file:/usr/local/openjdk-8/jre/lib/ext/sunjce_provider.jar

(btw. I created a full-blown experiment in https://github.com/lhotari/pulsar-inspector-connector, which produced the classloader report above. I used it as a tool to learn how to debug Pulsar Functions in a local k8s cluster, since I didn't have the development environment set up for that previously.)
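
For readers who want a similar report without the full experiment, a rough sketch is shown below. It is not the code from pulsar-inspector-connector; the class `ClassLoaderReport` and its output layout are assumptions that only imitate the report format above. Search paths are printed only for `URLClassLoader` instances, so on Java 9 and later the application classloader would be listed without URLs.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderReport {

    /** Renders the loader and its ancestors, one indentation level per parent. */
    static String describe(ClassLoader loader) {
        StringBuilder out = new StringBuilder();
        String indent = "";
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            out.append(indent).append("Classloader: ").append(current).append('\n');
            out.append(indent).append("identityHashCode:")
               .append(System.identityHashCode(current)).append('\n');
            // Only URLClassLoader (and subclasses) expose their search path.
            if (current instanceof URLClassLoader) {
                for (URL url : ((URLClassLoader) current).getURLs()) {
                    out.append(indent).append(url).append('\n');
                }
            }
            if (current.getParent() != null) {
                out.append(indent).append("Parent:").append('\n');
            }
            indent += "  ";
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Thread.currentThread().getContextClassLoader()));
    }
}
```

Calling `describe` with the classloader of a connector class from inside a running function would show which jars sit in the parent chain for that runtime.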

On 2.8.0-SNAPSHOT, java-instance.jar contains:

jar tvf pulsar-functions/runtime-all/target/java-instance.jar |grep pom.xml
  2122 Wed Mar 03 14:02:44 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-io-core/pom.xml
  2625 Mon Feb 15 12:51:24 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-functions-api/pom.xml
  2590 Mon Feb 15 12:28:36 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-client-admin-api/pom.xml
  6882 Wed Mar 03 14:02:44 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-common/pom.xml
  6246 Wed Jul 01 11:07:32 EEST 2020 META-INF/maven/io.swagger/swagger-annotations/pom.xml
  7375 Thu Jun 25 12:44:54 EEST 2020 META-INF/maven/com.fasterxml.jackson.core/jackson-databind/pom.xml
  3544 Thu Jun 25 12:29:14 EEST 2020 META-INF/maven/com.fasterxml.jackson.core/jackson-annotations/pom.xml
  4860 Thu Jun 25 12:34:42 EEST 2020 META-INF/maven/com.fasterxml.jackson.core/jackson-core/pom.xml
 19342 Thu Jul 09 12:44:34 EEST 2020 META-INF/maven/io.netty/netty-transport-native-unix-common/pom.xml
  5208 Mon Oct 14 19:00:34 EEST 2019 META-INF/maven/io.airlift/aircompressor/pom.xml
  1820 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-codec-haproxy/pom.xml
  4084 Fri Nov 20 15:05:42 EET 2020 META-INF/maven/org.eclipse.jetty/jetty-util/pom.xml
  2278 Thu Jun 25 14:09:32 EEST 2020 META-INF/maven/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/pom.xml
 37848 Fri Feb 28 09:06:20 EET 2020 META-INF/maven/org.yaml/snakeyaml/pom.xml
 30198 Fri Aug 04 15:01:00 EEST 2017 META-INF/maven/javax.ws.rs/javax.ws.rs-api/pom.xml
  7155 Wed Mar 03 14:02:44 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-client-original/pom.xml
  2910 Wed Mar 03 14:02:44 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-transaction-common/pom.xml
  4775 Fri Feb 14 21:06:54 EET 2020 META-INF/maven/com.google.protobuf/protobuf-java-util/pom.xml
  2609 Mon Feb 15 12:26:12 EET 2021 META-INF/maven/org.apache.pulsar/bouncy-castle-bc/pom.xml
  2402 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-codec-http/pom.xml
  3000 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-resolver-dns/pom.xml
  2143 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-codec-dns/pom.xml
  2701 Wed Apr 08 07:38:40 EEST 2020 META-INF/maven/org.asynchttpclient/async-http-client/pom.xml
   757 Wed Apr 08 07:38:40 EEST 2020 META-INF/maven/org.asynchttpclient/async-http-client-netty-utils/pom.xml
  2150 Mon Nov 18 22:45:48 EET 2019 META-INF/maven/com.typesafe.netty/netty-reactive-streams/pom.xml
  6515 Fri Sep 01 16:13:04 EEST 2017 META-INF/maven/com.sun.activation/javax.activation/pom.xml
  2183 Sun Nov 20 15:37:30 EET 2016 META-INF/maven/com.yahoo.datasketches/sketches-core/pom.xml
   807 Sun Nov 20 15:37:30 EET 2016 META-INF/maven/com.yahoo.datasketches/memory/pom.xml
  2527 Fri Oct 04 11:54:30 EEST 2019 META-INF/maven/com.google.code.gson/gson/pom.xml
  6380 Wed Aug 28 09:16:18 EEST 2019 META-INF/maven/org.apache.avro/avro/pom.xml
 18227 Sat Aug 24 13:29:46 EEST 2019 META-INF/maven/org.apache.commons/commons-compress/pom.xml
  4057 Wed Aug 28 09:16:18 EEST 2019 META-INF/maven/org.apache.avro/avro-protobuf/pom.xml
  2949 Thu Jun 25 13:58:40 EEST 2020 META-INF/maven/com.fasterxml.jackson.module/jackson-module-jsonSchema/pom.xml
  7860 Wed Apr 10 15:02:26 EEST 2013 META-INF/maven/javax.validation/validation-api/pom.xml
  2673 Mon Feb 15 12:26:12 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-package-core/pom.xml
  2225 Mon Feb 15 12:26:12 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-client-api/pom.xml
  3833 Thu Mar 16 16:53:42 EET 2017 META-INF/maven/org.slf4j/slf4j-api/pom.xml
 11507 Fri Nov 06 14:02:20 EET 2020 META-INF/maven/org.apache.logging.log4j/log4j-slf4j-impl/pom.xml
 14045 Fri Nov 06 14:02:18 EET 2020 META-INF/maven/org.apache.logging.log4j/log4j-api/pom.xml
 23342 Fri Nov 06 14:02:18 EET 2020 META-INF/maven/org.apache.logging.log4j/log4j-core/pom.xml
 12208 Mon Dec 14 10:51:40 EET 2020 META-INF/maven/com.google.guava/guava/pom.xml
  2413 Mon Nov 19 12:30:00 EET 2018 META-INF/maven/com.google.guava/failureaccess/pom.xml
  2278 Tue Jan 01 03:00:00 EET 1980 META-INF/maven/com.google.guava/listenablefuture/pom.xml
  4286 Fri Mar 31 10:21:36 EEST 2017 META-INF/maven/com.google.code.findbugs/jsr305/pom.xml
  2111 Mon Dec 02 11:00:20 EET 2019 META-INF/maven/com.google.errorprone/error_prone_annotations/pom.xml
  2762 Wed Jan 18 15:09:46 EET 2017 META-INF/maven/com.google.j2objc/j2objc-annotations/pom.xml
  7575 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-common/pom.xml
  4193 Fri Jan 03 10:59:38 EET 2020 META-INF/maven/org.jctools/jctools-core/pom.xml
  2247 Thu Feb 18 00:45:14 EET 2021 META-INF/maven/org.apache.bookkeeper/bookkeeper-common-allocator/pom.xml
  1578 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-buffer/pom.xml
  5564 Fri Feb 14 21:06:54 EET 2020 META-INF/maven/com.google.protobuf/protobuf-java/pom.xml
 17494 Thu Jan 13 23:05:12 EET 2011 META-INF/maven/commons-lang/commons-lang/pom.xml
  3556 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-handler/pom.xml
  1585 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-resolver/pom.xml
  1924 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-transport/pom.xml
  3587 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-codec/pom.xml
 17788 Thu Jul 09 12:28:38 EEST 2020 META-INF/maven/io.netty/netty-transport-native-epoll/pom.xml
 46166 Tue Aug 18 13:10:16 EEST 2020 META-INF/maven/io.netty/netty-tcnative-boringssl-static/pom.xml
 11609 Wed Nov 05 23:26:00 EET 2014 META-INF/maven/commons-codec/commons-codec/pom.xml
 13290 Thu Apr 14 09:17:56 EEST 2016 META-INF/maven/commons-io/commons-io/pom.xml
 27495 Fri Jun 09 11:38:14 EEST 2017 META-INF/maven/org.apache.commons/commons-lang3/pom.xml
 19206 Sat Jul 05 20:11:36 EEST 2014 META-INF/maven/commons-logging/commons-logging/pom.xml
 20716 Thu Oct 24 01:20:20 EEST 2013 META-INF/maven/commons-configuration/commons-configuration/pom.xml
  3854 Mon Feb 15 12:26:12 EET 2021 META-INF/maven/org.apache.pulsar/pulsar-functions-runtime-all/pom.xml
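
The `jar tvf ... | grep pom.xml` listings above can also be produced programmatically, which makes it easier to diff two builds. The following is a minimal sketch using only `java.util.jar`; the class name `BundledArtifacts` is made up, and it assumes the bundled artifacts keep their `META-INF/maven/<groupId>/<artifactId>/pom.xml` entries, as they do in the listings.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class BundledArtifacts {

    private static final String PREFIX = "META-INF/maven/";
    private static final String SUFFIX = "/pom.xml";

    /** Lists "groupId/artifactId" for every Maven pom.xml entry found in the jar. */
    static List<String> list(Path jar) {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith(PREFIX) && name.endsWith(SUFFIX))
                    .map(name -> name.substring(PREFIX.length(), name.length() - SUFFIX.length()))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: BundledArtifacts <path-to-jar>");
            return;
        }
        list(Paths.get(args[0])).forEach(System.out::println);
    }
}
```

Running it against the java-instance.jar of two versions and comparing the sorted output would surface the additional artifacts at a glance.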

I hope #9640 gets addressed asap.

@lhotari
Member Author

lhotari commented Mar 9, 2021

I liked the motivation. I am also good with making pulsar-io-core a provided dependency. However, I see you also turned some of the dependencies shared between the Pulsar runtime and connector implementation to the provided scope. I am not comfortable with that change, because even if they are the same dependency, they will be loaded via different class loaders. I am not sure turning them into provided will not break any connector implementation.

@sijie Thank you for reviewing. Yes, you are right that it's not fine to change the dependencies. I verified some details and wrote a report about that in the previous comment. Indeed, #9246 (open issue #9640) is causing the issues, and it also pollutes the set of classes available in the current parent classloader of the function user code classloader.

It seems that it would be fine to mark as provided the dependencies that are part of pulsar-functions/runtime-all/pom.xml, since that module creates the java-instance.jar. The size of these dependencies is very low (in 2.7.0, it's about 2MB), so it wouldn't save much of the overall size, which was the goal of this PR.
Since that's the case, I'll close this PR.

I hope #9640 gets addressed asap since I think it could be considered as a blocker for the 2.8.0 release.
It seems that there's a PR #9842 in WIP state currently. Great!

@lhotari lhotari closed this Mar 9, 2021
@lhotari
Member Author

lhotari commented Mar 9, 2021

I already checked in #9807 and #9808, which include only the connectors that are actually tested and squeezed out some extra 100s of MB

One more: #9817

Thank you @merlimat for putting effort into this. Just what I needed. These will unblock me in the refactored Pulsar CI experiment.

Development

Successfully merging this pull request may close these issues.

Exclude Pulsar Functions Worker dependencies from Pulsar IO .nar files

6 participants