Fix the LICENSE/NOTICES files with the missing dependencies/versions #16043

Open
jbonofre wants to merge 1 commit into apache:1.10.x from jbonofre:gh-16013

Conversation

@jbonofre
Member

This closes #16013

Contributor

@mxm left a comment

Thanks for addressing this promptly @jbonofre!

Comment thread flink/v2.0/flink-runtime/LICENSE
Comment thread aws-bundle/LICENSE Outdated
Comment thread azure-bundle/LICENSE
Comment thread flink/v2.0/flink-runtime/LICENSE
@manuzhang
Member

I suppose a lot of dependencies are missing from open-api/LICENSE based on #15872

@jbonofre
Member Author

@manuzhang I updated the LICENSE/NOTICE in the bundles and flink/spark runtimes. I'm gonna check open-api.

@jbonofre
Member Author

@manuzhang I updated LICENSE and NOTICE in open-api.

@manuzhang
Member

@jbonofre

Codex found 61 resolved module names missing from open-api/LICENSE. Confirmed packaged examples include (I verified some as well):

  • software.amazon.s3.analyticsaccelerator:analyticsaccelerator-s3:1.3.0; the file ends after aws-s3-accessgrants-java-plugin at open-api/LICENSE:1873
  • com.google.api:api-common:2.52.0, missing near the GCP section before gax at open-api/LICENSE:638
  • com.google.protobuf:protobuf-java:4.29.4 and protobuf-java-util:4.29.4
  • io.grpc:* modules, io.opentelemetry:* modules, io.opencensus:*
  • org.junit.jupiter:*, org.junit.platform:*, org.opentest4j:opentest4j
  • org.checkerframework:checker-qual, org.roaringbitmap:RoaringBitmap

High: Several entries that are present still have stale versions compared with the same resolved runtime classpath. Examples:

  • Jackson is listed as 2.18.4 / 2.18.4.1 at open-api/LICENSE:596, but resolves to 2.21 / 2.21.2
  • gson is listed as 2.11.0 at open-api/LICENSE:852, but resolves to 2.12.1
  • error_prone_annotations is listed as 2.10.0 at open-api/LICENSE:857, but resolves to 2.38.0
  • commons-codec is listed as 1.17.1 at open-api/LICENSE:1025, but resolves to 1.19.0
  • commons-logging is listed as 1.2 at open-api/LICENSE:1043, but resolves to 1.3.0
  • org.apache.httpcomponents:httpclient is listed as 4.5.13 at open-api/LICENSE:1319, but resolves to 4.5.14

@jbonofre
Member Author

jbonofre commented Apr 22, 2026

@manuzhang did you check on the 1.10.x branch? I'm surprised: for instance, jackson is 2.18.x on the branch.

This PR is on the 1.10.x branch.

I forgot to push the new entries, let me do that.

@manuzhang
Member

@jbonofre I was the one who bumped the jackson version in #15847 for 1.10.x :)

@jbonofre
Member Author

@manuzhang then I suspect some other LICENSE files should be updated too. Let me verify.

@jbonofre
Member Author

@manuzhang open-api LICENSE and NOTICE files should be good now. I'm doing a complete final pass.

@manuzhang
Member

@jbonofre We are getting closer, but some entries are still missing and some are stale.

Missing license entries:

  • javax.servlet.jsp:jsp-api
  • org.bouncycastle:bcprov-jdk18on
  • org.eclipse.jetty.toolchain:jetty-jakarta-servlet-api
  • org.junit.jupiter:junit-jupiter
  • org.junit.jupiter:junit-jupiter-api
  • org.junit.jupiter:junit-jupiter-engine
  • org.junit.jupiter:junit-jupiter-params
  • org.junit.platform:junit-platform-commons
  • org.junit.platform:junit-platform-engine
  • org.opentest4j:opentest4j
  • org.roaringbitmap:RoaringBitmap

Stale license entries:

  • open-api/LICENSE:596: jackson-annotations lists 2.21.2, resolved is 2.21
  • open-api/LICENSE:908: gson lists 2.11.0, resolved is 2.12.1
  • open-api/LICENSE:913: error_prone_annotations lists 2.10.0, resolved is 2.38.0
  • open-api/LICENSE:929: failureaccess lists 1.0.3, resolved is 1.0.2; guava lists 33.4.8-jre, resolved is 33.4.0-jre
  • open-api/LICENSE:1195: commons-codec lists 1.17.1, resolved is 1.19.0
  • open-api/LICENSE:1213: commons-logging lists 1.2, resolved is 1.3.0
  • open-api/LICENSE:1339: Netty entries list 4.2.4.Final, but resolved entries are 4.1.124.Final or 4.1.118.Final/4.1.112.Final depending on module
  • open-api/LICENSE:1603: Arrow entries list 15.0.2, resolved is 17.0.0
  • open-api/LICENSE:1699: httpclient lists 4.5.13, resolved is 4.5.14
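
A comparison like the missing/stale lists above can be approximated with a short script. This is a minimal sketch, not the tooling used in this review. It assumes LICENSE entries follow the `Group: ... Name: ... Version: ...` form quoted elsewhere in this thread, and that the resolved classpath is supplied as `group:name:version` strings. The names `parse_license_entries` and `compare` are hypothetical, and `org.example:unused` is an invented coordinate used only for illustration.

```python
import re

# Assumed entry format: "Group: <group> Name: <name> Version: <version>".
ENTRY_RE = re.compile(r"Group:\s*(\S+)\s+Name:\s*(\S+)\s+Version:\s*(\S+)")


def parse_license_entries(license_text):
    """Map 'group:name' to the set of versions listed in the LICENSE text."""
    entries = {}
    for group, name, version in ENTRY_RE.findall(license_text):
        entries.setdefault(f"{group}:{name}", set()).add(version)
    return entries


def compare(license_text, resolved_coordinates):
    """Return (missing, stale, extra) relative to resolved 'group:name:version' strings."""
    listed = parse_license_entries(license_text)
    resolved = {}
    for coord in resolved_coordinates:
        # Assumes exactly group:name:version (no classifier segment).
        group, name, version = coord.split(":", 2)
        resolved.setdefault(f"{group}:{name}", set()).add(version)

    missing = sorted(set(resolved) - set(listed))  # resolved but not listed
    extra = sorted(set(listed) - set(resolved))    # listed but not resolved
    stale = {
        module: (sorted(listed[module]), sorted(resolved[module]))
        for module in sorted(set(listed) & set(resolved))
        if listed[module] != resolved[module]
    }
    return missing, stale, extra


if __name__ == "__main__":
    sample_license = (
        "Group: com.google.code.gson Name: gson Version: 2.11.0\n"
        "Group: commons-codec Name: commons-codec Version: 1.19.0\n"
        "Group: org.example Name: unused Version: 1.0\n"  # hypothetical entry
    )
    sample_resolved = [
        "com.google.code.gson:gson:2.12.1",
        "commons-codec:commons-codec:1.19.0",
        "com.google.protobuf:protobuf-java:4.29.4",
    ]
    missing, stale, extra = compare(sample_license, sample_resolved)
    print("missing:", missing)
    print("stale:", stale)
    print("extra:", extra)
```

The `extra` list covers the opposite case, where LICENSE names a module that the resolved classpath does not contain.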

@jbonofre
Member Author

@manuzhang good catch. I'm updating.

@jbonofre
Member Author

jbonofre commented Apr 22, 2026

@manuzhang here we are 😄 It should be good now.

@manuzhang
Member

@jbonofre One final comment

open-api/LICENSE:2000 includes org.junit.platform:junit-platform-launcher and the junit-platform-suite-* entries, but those are not in testFixturesRuntimeClasspath.

Comment thread open-api/LICENSE Outdated
@rdblue
Contributor

rdblue commented Apr 23, 2026

@jbonofre, this repo doesn't use the fix(...) convention for PR titles. Could you fix the summary?

@jbonofre
Member Author

jbonofre commented Apr 23, 2026

@rdblue absolutely. Thanks for the reminder!

@amogh-jahagirdar
Contributor

@jbonofre @rdblue I thought that before making all the LICENSE/NOTICE changes, we were first going to vet all the dependencies we're shipping and scope them down wherever applicable. Then we'd update LICENSE/NOTICE.
I guess it's an open question whether we want to do that work for a patch release, but since concern was expressed about what we're shipping in the runtime jars, I thought we'd do that first.

@jbonofre
Member Author

@amogh-jahagirdar my understanding is that we'd do that on main. The set of dependencies should not have changed on 1.10.x. But if you think it has, I'm happy to do an analysis.

@jbonofre changed the title from "fix(LICENSE/NOTICE): update license with the missing dependencies and update copyright year" to "chore: update license with the missing dependencies and update copyright year" on Apr 23, 2026
@jbonofre
Member Author

@rdblue PR title updated.

@amogh-jahagirdar
Contributor

Thanks @jbonofre. There were dependency upgrades to address CVEs (like the jackson upgrade that @manuzhang referenced earlier). Theoretically those patch releases shouldn't change anything, but my understanding was that any dependency upgrade needs to be scrutinized. Or was my understanding wrong?

Comment thread aws-bundle/LICENSE Outdated

--------------------------------------------------------------------------------

Group: com.github.ben-manes.caffeine Name: caffeine Version: 2.9.3
Contributor

Versions get stale and should not be included in the license files.

Member Author

That's the change I made on main, but for 1.10.x I'm keeping the existing format for consistency.

Contributor

Sounds reasonable to me, thanks for the context.

Comment thread NOTICE Outdated

  Apache Iceberg
- Copyright 2017-2025 The Apache Software Foundation
+ Copyright 2017-2026 The Apache Software Foundation
Contributor

This should be done in a separate PR. PRs should generally just do one thing.

Member Author

I consider it a fix to the NOTICE, so I included it here. But I'm fine with splitting it out.

Member Author

I'm going to split this into two PRs: this one will contain the minimal change needed to make the LICENSE/NOTICE files correct (the PR title will be updated accordingly). Let me create another PR specifically for the copyright year update. Thanks for the suggestion @rdblue!

Member Author

See #16099

Comment thread spark/v4.0/spark-runtime/LICENSE Outdated
  Copyright: 2014-2024 The Apache Software Foundation
  Home page: https://parquet.apache.org/
- License: http://www.apache.org/licenses/LICENSE-2.0
+ License: Apache License, Version 2.0 - http://www.apache.org/licenses/LICENSE-2.0.txt
Contributor

These changes are unnecessary and shouldn't be included here. We can talk about whether to have more information on this line, but it isn't related to the purpose of this PR, which is to update the files for correctness. Including unnecessary changes increases what needs to be reviewed (or in this case scrolled past) and that can cause things to be missed.

Member Author

It was a request from @manuzhang for consistency.

I'm fine to revert.

Contributor

These are large enough without other changes. Style updates like this should be in separate PRs. Thanks!

Member

Sorry, I feel the pain of going back and forth in such a large PR. Maybe we can just leave out the style issues for the patch release?

Member Author

That makes sense to me (and it was my initial change 😄 ). I propose to take a step back here with the following:

  1. Let me list the dependencies in each module (bundle and runtime artifacts), with a first pass to flag dependencies that need discussion (I'm thinking of Eclipse Collections, for instance).
  2. I'll update this PR with the minimal changes needed to make the LICENSE/NOTICE files correct.
  3. Depending on our findings from step 1, I will update the PR accordingly.

Does it work for you guys?

Thanks again for your help on this one!


--------------------------------------------------------------------------------

This binary artifact contains Eclipse Collections.
Contributor

How is this used? Is it pulled in transitively?

We don't use Eclipse collections, so if this is a new direct dependency, we should remove it.

Also, for Spark we want to check what Jars are already included in the Spark bundle and exclude them so that we don't ship duplicates or conflicts.

Member Author

It's directly bundled in the runtime jar:

....
  2913 Fri Feb 01 00:00:00 CET 1980 org/eclipse/collections/api/LazyIntIterable.class
  2961 Fri Feb 01 00:00:00 CET 1980 org/eclipse/collections/api/LazyFloatIterable.class
  2937 Fri Feb 01 00:00:00 CET 1980 org/eclipse/collections/api/LazyLongIterable.class
  2985 Fri Feb 01 00:00:00 CET 1980 org/eclipse/collections/api/LazyDoubleIterable.class
 13693 Fri Feb 01 00:00:00 CET 1980 org/eclipse/collections/api/ParallelIterable.class

This is coming from arrow-vector (transitive).

I believe it was not introduced in 1.10.2 but earlier.
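
A check like the jar listing above can be scripted. The following is a minimal sketch using Python's standard `zipfile` module. It counts `.class` entries under a package prefix, grouped by package directory. The function name `count_classes_by_package` is hypothetical, and the caller supplies the jar path, since the thread does not name the file.

```python
import zipfile
from collections import Counter


def count_classes_by_package(jar_path, prefix):
    """Count .class entries in the jar under `prefix`, grouped by package directory."""
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.startswith(prefix) and name.endswith(".class"):
                # Drop the class file name to keep only the package path.
                package = name.rsplit("/", 1)[0]
                counts[package] += 1
    return counts
```

Calling it with the prefix `org/eclipse/collections/` on a runtime jar would show whether, and in which packages, those classes are bundled. It does not show which dependency pulled them in; that still needs a look at the dependency tree.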

Member

Can we just fix license issues in the patch release and fix dependency issues on the main branch?

@jbonofre
Member Author

@amogh-jahagirdar I think that's reasonable indeed. I also think that several issues have existed for a while and probably need a cleanup (I'm thinking of Eclipse Collections, for instance).

@rdblue
Contributor

rdblue commented Apr 23, 2026

> @rdblue PR title updated.

Sorry, I should have been more clear. This project doesn't follow the "conventional commits" conventions for PR titles. I'd be fine having a discussion about this in the community, but I don't think that it is helpful to have conflicting conventions.

The convention that we follow is to list the high-level areas that are being changed, like "API" and "Core", or "Spark 4.1", followed by a colon, and then a short and direct description.
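
The difference between the two title styles can be sketched in a few lines. This is an illustration only, not a project rule set and not the workflow proposed in #16101. The list of conventional-commit types is an assumption, and the function names `uses_conventional_commits` and `split_areas` are hypothetical.

```python
import re

# Lowercase conventional-commit types; an illustrative list, not an exhaustive one.
CONVENTIONAL_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# Matches prefixes such as "chore: " or "fix(scope): ". The match is case-sensitive,
# so a capitalized area prefix such as "Docs: " is not flagged.
CONVENTIONAL_RE = re.compile(
    r"^(?:%s)(?:\([^)]*\))?!?:\s" % "|".join(CONVENTIONAL_TYPES)
)


def uses_conventional_commits(title):
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_RE.match(title) is not None


def split_areas(title):
    """Return the comma-separated areas before the first colon, or [] if there are none."""
    prefix, sep, description = title.partition(":")
    if not sep or not description.strip() or uses_conventional_commits(title):
        return []
    return [area.strip() for area in prefix.split(",") if area.strip()]
```

Under these assumptions, "chore: update license ..." and "fix(LICENSE/NOTICE): update license ..." are flagged, while "Core: fix the LICENSE/NOTICES files ..." yields the single area "Core". `split_areas` is a heuristic: it treats any text before the first colon as an area list and does not validate area names.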

@jbonofre
Member Author

@rdblue ok, good. I'm surprised it comes now because I saw several PRs (including mine) using conventional commits style (that's why I was confused, sorry about that). Fair enough, I'm updating to use the recommended format. Thanks!

@jbonofre changed the title from "chore: update license with the missing dependencies and update copyright year" to "Core: fix the LICENSE/NOTICES files with the missing dependencies and update copyright year" on Apr 24, 2026
@jbonofre changed the title from "Core: fix the LICENSE/NOTICES files with the missing dependencies and update copyright year" to "Fix the LICENSE/NOTICES files with the missing dependencies and update copyright year" on Apr 24, 2026
@jbonofre
Member Author

jbonofre commented Apr 24, 2026

@manuzhang @rdblue @amogh-jahagirdar I changed the PR to focus on the minimal changes needed to fix LICENSE and NOTICE files.

I'm preparing the dependencies list for each artifact (bundle and runtime), identifying the dependencies worth discussing (probably excluding unnecessary ones, and then updating the LICENSE and NOTICE files in a new iteration). I will do that for both 1.10.x and main branches.

@jbonofre changed the title from "Fix the LICENSE/NOTICES files with the missing dependencies and update copyright year" to "Fix the LICENSE/NOTICES files with the missing dependencies/versions" on Apr 24, 2026
@manuzhang
Member

> I'm surprised it comes now because I saw several PRs (including mine) using conventional commits style (that's why I was confused, sorry about that).

That might be because other Iceberg projects follow the conventional commits style. I created #16101 to add a PR title check workflow.

@manuzhang
Member

> I will do that for both 1.10.x and main branches.

I'd suggest fixing the dependency issues only on the main branch, since it could mean too many dependency changes for a bugfix release.
