NPE in AWS SDK v2 + S3 Access Grants when reading large Iceberg tables with S3FileIO (Spark, JDK17/JDK21) #14942

@manjum-a11y

Apache Iceberg version

1.7.2

Query engine

Spark

Please describe the bug 🐞

When reading a large Iceberg table from S3 using S3FileIO with S3 Access Grants enabled, Spark jobs intermittently fail with a NullPointerException inside the AWS SDK v2 AttributeMap$Builder.resolveValue, called from S3AccessGrantsIdentityProvider.resolveIdentity.

This only appears under high concurrency / large datasets (e.g., spark.read.table(...).count() over many files). Smaller tables or lower parallelism may run successfully, but increasing parallelism makes the failure reproducible.

The error message from the AWS SDK is:

Encountered a null value when resolving configuration attributes. This is commonly caused by concurrent modifications to non-thread-safe types. Ensure you're synchronizing access to all non-thread-safe types.
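As a generic illustration of the failure class this message describes (not a diagnosis of the SDK or plugin internals), here is a minimal stdlib-only sketch. `GuardedConfigBuilder` and `GuardedConfigDemo` are hypothetical names, and the class is a stand-in for any builder backed by a non-thread-safe map, not the SDK's `AttributeMap`. It shows the synchronization the message asks for: writers and `build()` share one lock, so a snapshot is never taken from a half-updated map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a builder backed by a plain HashMap.
// This is NOT the SDK's AttributeMap; it only illustrates the failure class.
final class GuardedConfigBuilder {
    private final Map<String, String> values = new HashMap<>();

    // Writers take the same lock as build(), so the map is never mutated mid-copy.
    synchronized void put(String key, String value) {
        values.put(key, value);
    }

    // Returns an immutable snapshot taken while holding the lock.
    synchronized Map<String, String> build() {
        return Map.copyOf(values);
    }
}

final class GuardedConfigDemo {
    // Each thread writes a key, builds a snapshot, and looks its own key up again.
    // Returns how many lookups came back null.
    static int countMissingLookups(int threads, int iterations) {
        GuardedConfigBuilder builder = new GuardedConfigBuilder();
        AtomicInteger missing = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    String key = "attr-" + id + "-" + i;
                    builder.put(key, "value");
                    if (builder.build().get(key) == null) {
                        missing.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return missing.get();
    }

    public static void main(String[] args) {
        System.out.println("missing lookups: " + countMissingLookups(8, 200));
    }
}
```

Removing the two `synchronized` keywords turns this into the unsafe pattern: a `HashMap` that is resized by one thread while another reads or copies it can expose missing or null entries, which is the kind of symptom the SDK message points at.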

From the Iceberg side we are using S3FileIO with S3 Access Grants configured according to the docs, and the S3 client is built via S3Client.builder() with S3FileIOProperties.applyS3AccessGrantsConfigurations(...) (or equivalent).

java.lang.NullPointerException: Cannot invoke "software.amazon.awssdk.utils.AttributeMap$Value.get(software.amazon.awssdk.utils.AttributeMap$LazyValueSource)" because "value" is null
    at software.amazon.awssdk.utils.AttributeMap$Builder.resolveValue(AttributeMap.java:396)
    at software.amazon.awssdk.utils.AttributeMap$Builder.buildResolvedMap(AttributeMap.java:371)
    at software.amazon.awssdk.utils.AttributeMap$Builder.build(AttributeMap.java:358)
    ...
    at software.amazon.awssdk.s3accessgrants.plugin.S3AccessGrantsIdentityProvider.resolveIdentity(S3AccessGrantsIdentityProvider.java:...)
    ...
    at software.amazon.awssdk.services.s3.S3Client.getObject(S3Client.java:...)
    ...
    at org.apache.iceberg.io.ResolvingFileIO.newInputFile(ResolvingFileIO.java:...)
    at org.apache.iceberg.io.FileIO.newInputFile(FileIO.java:...)
    ...
    at org.apache.iceberg.spark.source.BaseDataReader.next(BaseDataReader.java:...)
    at org.apache.iceberg.spark.source.SparkBatchScan$$anon$1.next(SparkBatchScan.scala:...)
    ...

We have already tried the following combinations; the NPE persists in all of them:

  • Iceberg versions: 1.7.2, then upgraded to 1.10.0 → NPE persists in both.
  • AWS SDK v2 versions: 2.24.6, 2.30.31, 2.32.1 → NPE persists across all.
  • S3 Access Grants plugin versions: 2.0.2 and 2.3.0 → NPE persists in both.
  • Spark / JDK combinations: Spark 3.5.6 with JDK17 and Spark 4.0.1 (JDK21 inside image) → same NPE in both.
  • Parallelism tuning: reducing spark.sql.shuffle.partitions / spark.default.parallelism changes the frequency but does not reliably remove the NPE on large tables.

Could you please help me understand the issue? Specifically:
1. Known issue?
Are you aware of any known concurrency problems between Iceberg’s S3FileIO S3 Access Grants integration and AWS SDK v2 / aws-s3-accessgrants-java-plugin that could cause AttributeMap$Builder.resolveValue to throw an NPE under high Spark parallelism?
2. Recommended version matrix?
Is there a recommended or validated combination of Iceberg version, AWS SDK v2 version, and aws-s3-accessgrants-java-plugin version for running S3 Access Grants with S3FileIO in a high‑concurrency Spark environment?
3. Client factory / configuration guidance?
From Iceberg’s side, is there any specific guidance on how the S3 client factory should be implemented (or additional S3FileIO / S3AG configuration) to avoid shared, non‑thread‑safe state that might trigger this NPE?
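To make question 3 concrete, the sketch below shows one common way to avoid touching shared builder state from many threads: build the client once behind a lock and share the finished instance. It is a stdlib-only illustration under stated assumptions, not Iceberg's or the plugin's implementation. `SharedClientHolder` and `SharedClientDemo` are hypothetical names, and the `Supplier` stands in for whatever code builds the real client (for example the `S3Client.builder()` call mentioned above). Whether this pattern avoids the NPE reported here is untested.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: runs the factory at most once and shares the result.
final class SharedClientHolder<C> {
    private final Supplier<C> factory;
    private volatile C client;

    SharedClientHolder(Supplier<C> factory) {
        this.factory = factory;
    }

    // Double-checked locking: the volatile read is the fast path,
    // and the lock ensures only one thread ever runs the factory.
    C get() {
        C local = client;
        if (local == null) {
            synchronized (this) {
                local = client;
                if (local == null) {
                    local = factory.get();
                    client = local;
                }
            }
        }
        return local;
    }
}

final class SharedClientDemo {
    // Starts many threads that request the client at the same moment
    // and returns how many times the factory actually ran.
    static int buildsUnderContention(int threads) {
        AtomicInteger builds = new AtomicInteger();
        SharedClientHolder<Object> holder = new SharedClientHolder<>(() -> {
            builds.incrementAndGet();
            return new Object(); // placeholder for a real client instance
        });
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                holder.get();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return builds.get();
    }

    public static void main(String[] args) {
        System.out.println("factory invocations: " + buildsUnderContention(16));
    }
}
```

The design choice is to keep every mutable builder confined to the single thread that holds the lock; only the finished client crosses thread boundaries. This helps only if the race is in client construction. If the shared state is mutated per request inside the identity provider, a holder like this would not be sufficient.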

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
