@slfan1989
Contributor

slfan1989 commented Dec 28, 2025

Why are the changes needed?

To fully address two recently disclosed CVEs in lz4-java:

  • CVE-2025-12183 (Out-of-bounds read in older versions ≤1.8.0, leading to DoS or info leak). Fixed starting from the community fork (at.yawk.lz4) in 1.8.1 and carried forward.
  • CVE-2025-66566 (High severity: Information leak in safe Java decompressor due to insufficient buffer clearing, affecting ≤1.10.0). Explicitly fixed in 1.10.1.

The current version in main (1.10.2) already includes the fix from 1.10.1 and subsequent improvements. This PR ensures we are on the latest patch release to eliminate any vulnerability scanner alerts and benefit from minor bug fixes and performance improvements.

References:

@slfan1989 force-pushed the bump-lz4-java-1.10.2 branch from 3906ecc to 805cd56 on December 28, 2025 13:30
@slfan1989
Contributor Author

@huaxingao Could you please help review this PR? Thank you very much!

@huaxingao
Contributor

I’m seeing org.lz4:lz4-java:1.8.0 still present on Spark 3.5/4.0 compileClasspath. Does this need to be fixed too?

@slfan1989
Contributor Author

> I’m seeing org.lz4:lz4-java:1.8.0 still present on Spark 3.5/4.0 compileClasspath. Does this need to be fixed too?

Thanks for the review!

You're right: even as a transitive dependency, the vulnerable org.lz4:lz4-java:1.8.0 should be fully excluded to avoid scanner alerts.

I'll update the PR to add a global exclusion (exclude group: 'org.lz4', module: 'lz4-java') to the root subprojects block, covering Spark, Flink, and Kafka Connect. This removes the old version from all classpaths while keeping the capability resolution rule as a safety net.

Pushing the change shortly. Thanks again!
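
For readers following along, here is a minimal sketch of how such an exclusion and a capability resolution rule might sit together in a root build.gradle. The PR's actual capability rule is not quoted in this thread, so the withCapability block is an assumption modeled on Gradle's documented capabilitiesResolution API, and it only takes effect if the at.yawk.lz4 fork publishes the org.lz4:lz4-java capability.

```groovy
import org.gradle.api.artifacts.component.ModuleComponentIdentifier

// Sketch only: the rule actually used in the PR may differ.
subprojects {
  configurations.all {
    // Drop the vulnerable artifact wherever it arrives transitively.
    exclude group: 'org.lz4', module: 'lz4-java'

    // Safety net (assumed shape): if several modules claim the
    // org.lz4:lz4-java capability, prefer the at.yawk.lz4 fork.
    resolutionStrategy.capabilitiesResolution.withCapability('org.lz4:lz4-java') {
      def fork = candidates.find {
        it.id instanceof ModuleComponentIdentifier && it.id.group == 'at.yawk.lz4'
      }
      if (fork != null) {
        select(fork)
      }
      because 'at.yawk.lz4:lz4-java carries the CVE fixes'
    }
  }
}
```

Note that an exclusion only removes org.lz4:lz4-java; the fork still has to be declared as a dependency somewhere for it to appear on the classpath.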

    project.name.startsWith('iceberg-kafka-connect')) {

  configurations.all {
    exclude group: 'org.lz4', module: 'lz4-java'
Contributor Author

This is a global exclusion rule. After applying it, the following command can be used to validate the result:

gradlew --parallel \
  -DsparkVersions=4.0 \
  -DscalaVersion=2.13 \
  :iceberg-spark:iceberg-spark-4.0_2.13:dependencies \
  :iceberg-spark:iceberg-spark-extensions-4.0_2.13:dependencies \
  :iceberg-spark:iceberg-spark-runtime-4.0_2.13:dependencies \
  --configuration compileClasspath \
  -Pquick=true \
  --refresh-dependencies | grep "lz4"

The result is as follows:

|    +--- at.yawk.lz4:lz4-java:1.10.2
|    +--- at.yawk.lz4:lz4-java:1.10.2
|    +--- at.yawk.lz4:lz4-java:1.10.2
|    +--- at.yawk.lz4:lz4-java:1.10.2
|    +--- at.yawk.lz4:lz4-java:1.10.2
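
As an alternative to grepping the dependency report, the same check could be expressed as a task that fails the build. This is a hypothetical sketch, not part of the PR: the task name verifyNoLegacyLz4 is invented for illustration, and it assumes the project applies a plugin (such as the Java plugin) that creates the compileClasspath configuration.

```groovy
// Hypothetical task (name invented for illustration); assumes compileClasspath
// exists in this project. Resolving a configuration inside doLast is simple
// but may not be compatible with Gradle's configuration cache.
tasks.register('verifyNoLegacyLz4') {
  doLast {
    def resolved = configurations.compileClasspath.incoming.resolutionResult
    // Collect every resolved component that still uses the old coordinates.
    def offenders = resolved.allComponents.findAll { component ->
      def id = component.moduleVersion
      id != null && id.group == 'org.lz4' && id.name == 'lz4-java'
    }
    if (!offenders.isEmpty()) {
      throw new GradleException(
          "org.lz4:lz4-java still on compileClasspath: ${offenders*.moduleVersion}")
    }
  }
}
```

Unlike the grep, which matches any line containing "lz4", this check looks only for the old org.lz4 coordinates and ignores the at.yawk.lz4 fork.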

@slfan1989
Contributor Author

@huaxingao Could you please help review this PR again? Thank you very much!

build.gradle Outdated
}
}

configurations.all {
Contributor

Is this change still needed?

Contributor Author

Thanks for the review and suggestions! I agree that this part can be removed. I'll make further improvements and update the PR soon.

Comment on lines +1238 to +1241
if (project.name.startsWith('iceberg-spark') ||
    project.name.startsWith('iceberg-flink') ||
    project.name.startsWith('iceberg-delta-lake') ||
    project.name.startsWith('iceberg-kafka-connect')) {
Contributor

Why only these modules? Shouldn't we do this for all of them?

Contributor Author

Currently, only a few modules depend on org.lz4:lz4-java. Applying this rule at the project level is safe and will make dependency management more consistent. I plan to implement this change.

Contributor

Is it better to keep this rule scoped (e.g., Spark/Flink/Kafka Connect) because the vulnerable org.lz4:lz4-java is coming from the Spark/Flink/Kafka dependency trees?

Contributor Author

@huaxingao Thank you very much for reviewing the code and for the helpful suggestions.

I also lean toward scoping this rule to the relevant components (e.g., Spark / Flink / Kafka Connect), since the vulnerable org.lz4:lz4-java is introduced primarily through transitive dependencies in the Spark/Flink/Kafka dependency trees. Scoping the rule would limit its impact on unrelated modules.

From my side: +1 to scoping the rule to Spark / Flink / Kafka Connect.
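
A minimal sketch of what the scoped form could look like, assuming the rule lives in the root build.gradle. The prefixes are copied from the condition under review; the variable name lz4ScopedPrefixes is made up for illustration, and the exclusion quoted earlier in this thread stands in for whichever rule the PR ends up applying.

```groovy
// Sketch only: lz4ScopedPrefixes is a hypothetical name; the prefixes come
// from the condition under review.
def lz4ScopedPrefixes = [
  'iceberg-spark',
  'iceberg-flink',
  'iceberg-delta-lake',
  'iceberg-kafka-connect'
]

subprojects {
  // Apply the rule only to modules whose dependency trees pull in the old artifact.
  if (lz4ScopedPrefixes.any { prefix -> project.name.startsWith(prefix) }) {
    configurations.all {
      // Stand-in for the scoped rule; replace with the rule the PR settles on.
      exclude group: 'org.lz4', module: 'lz4-java'
    }
  }
}
```

Keeping the prefixes in one list makes it easy to widen or narrow the scope later without touching the rule itself.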

@singhpk234 Do you agree with this change?

@slfan1989 force-pushed the bump-lz4-java-1.10.2 branch from 9a950df to 7a7eee9 on January 3, 2026 13:03