@ianton-ru

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix IN with Iceberg table

Documentation entry for user-facing changes

What is going on.
Consider a query like 'SELECT x FROM iceberg.table WHERE y IN (SELECT z FROM local.table)'.
During query planning, the planner creates an IcebergIterator.
IcebergIterator takes filters (filter_dag) to perform partition pruning, min/max pruning, etc.
In its constructor, IcebergIterator starts a background thread that builds the list of objects in parallel with other work:
https://github.com/Altinity/ClickHouse/blob/antalya-25.8/src/Storages/ObjectStorage/DataLakes/Iceberg/IcebergIterator.cpp#L280
But the filters are not ready yet: this separate thread finds that the set for the IN subquery still has to be built and tries to build it.
(SingleThreadIcebergKeysIterator::next -> ManifestFilesPruner::ManifestFilesPruner -> KeyCondition::KeyCondition -> KeyCondition::extractAtomFromTree -> FutureSetFromSubquery::buildOrderedSetInplace)
Meanwhile, the main thread continues to build the plan and at some point also tries to complete the same build.
(QueryPlanOptimizations::addStepsToBuildSets -> DelayedCreatingSetsStep::makePlansForSets -> FutureSetFromSubquery::build)
When both builds run at the same moment, we get a race condition.
This PR adds a mutex to prevent the build from running in different threads at the same time.
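
As a rough illustration of the pattern (not the actual ClickHouse code), the sketch below models a lazily built set that two threads may try to build at once. The names `LazySet`, `buildOrGet`, and `raceTwoBuilders` are hypothetical and exist only for this example; the real change is in FutureSetFromSubquery.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazily built IN-subquery set.
// This is NOT the real FutureSetFromSubquery, only a model of the locking idea.
class LazySet
{
public:
    explicit LazySet(std::vector<int> source_) : source(std::move(source_)) {}

    // Safe to call from several threads: the mutex serializes builders,
    // and a caller that arrives after the build just gets the finished set.
    std::shared_ptr<const std::set<int>> buildOrGet()
    {
        std::lock_guard<std::mutex> lock(build_mutex);
        if (!built)
        {
            ++build_count;
            built = std::make_shared<std::set<int>>(source.begin(), source.end());
        }
        return built;
    }

    int buildCount() const
    {
        std::lock_guard<std::mutex> lock(build_mutex);
        return build_count;
    }

private:
    std::vector<int> source;
    mutable std::mutex build_mutex;
    std::shared_ptr<const std::set<int>> built;
    int build_count = 0;
};

// Models the two competing callers: a background "iterator" thread
// and the main "planner" thread. Returns how many times the set was built.
inline int raceTwoBuilders(LazySet & set)
{
    std::shared_ptr<const std::set<int>> from_background;
    std::thread background([&] { from_background = set.buildOrGet(); });
    auto from_main = set.buildOrGet();
    background.join();
    assert(from_background == from_main); // both threads observe the same finished set
    return set.buildCount();
}
```

With the lock in place, `raceTwoBuilders` reports a single build regardless of which thread gets there first. Without it, both threads could pass the `!built` check and construct the set concurrently.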

A more correct fix would be to start the IcebergIterator thread only once the plan is fully built, but the planner blows my mind, so I made this simple workaround.
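
For comparison, the alternative mentioned above could look roughly like a two-phase object: the constructor only stores the work, and an explicit `start()` launches the background thread once planning is done. This is only a sketch under that assumption; `DeferredListing`, `start`, `wait`, and `listingSeesReadyFilters` are hypothetical names, and this PR does not implement this approach.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical two-phase wrapper: construction does not spawn a thread.
class DeferredListing
{
public:
    explicit DeferredListing(std::function<void()> work_) : work(std::move(work_)) {}

    DeferredListing(const DeferredListing &) = delete;
    DeferredListing & operator=(const DeferredListing &) = delete;

    ~DeferredListing() { wait(); }

    bool started() const { return was_started; }

    // Meant to be called once, by the planning thread, after the plan is complete.
    void start()
    {
        assert(!was_started);
        was_started = true;
        worker = std::thread(work);
    }

    // Blocks until the background work has finished (no-op if never started).
    void wait()
    {
        if (worker.joinable())
            worker.join();
    }

private:
    std::function<void()> work;
    std::thread worker;
    bool was_started = false;
};

// Small demo: the background work only ever runs after the "filters" are ready.
inline bool listingSeesReadyFilters()
{
    std::atomic<bool> filters_ready{false};
    std::atomic<bool> saw_ready{false};

    DeferredListing listing([&] { saw_ready = filters_ready.load(); });
    assert(!listing.started()); // nothing runs during "planning"

    filters_ready = true;       // planning finished, sets are built
    listing.start();
    listing.wait();
    return saw_ready.load();
}
```

The trade-off is that the object listing no longer overlaps with planning, which is presumably why the thread is started in the constructor today.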

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • Tiered Storage (2h)

@github-actions

github-actions bot commented Nov 25, 2025

Workflow [PR], commit [cc35886]

@mkmkme
Collaborator

mkmkme commented Nov 26, 2025

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!

@ianton-ru
Author

The failed tests look unrelated to this PR.

Collaborator

@zvonand zvonand left a comment

Will merge once tests are finished

@zvonand zvonand merged commit bba5b98 into antalya-25.8 Dec 9, 2025
269 of 280 checks passed
zvonand added a commit that referenced this pull request Dec 9, 2025
…_cluster_request

Fix IN with Iceberg table
@alsugiliazova
Member

Verification test: https://github.com/Altinity/clickhouse-regression/blob/main/iceberg/tests/iceberg_engine/iceberg_iterator_race_condition.py

This test reproduces a race condition when executing queries with IN subqueries.
The test verifies that queries with IN subqueries against Iceberg tables return consistent results by comparing the Iceberg results with the results from a MergeTree table.

What was tested:

  • Basic IN subqueries with s3, s3Cluster, iceberg, icebergS3Cluster, and an Iceberg table from a DataLakeCatalog database
  • A couple of complex IN subqueries with multiple conditions, aggregate functions, and nesting

The test creates a partitioned Iceberg table with 100 rows, populates a local MergeTree table
with a subset of the data, and then runs various IN subquery patterns multiple times to catch any
non-deterministic behavior. To obtain the expected results, the same queries are executed against another MergeTree table that has the same data and schema as the Iceberg table.

Verification:

  • On 25.8.9.20496.altinityantalya (latest antalya), all checks failed
  • With the PR build, all checks passed

@alsugiliazova alsugiliazova added the verified Verified by QA label Dec 12, 2025