Flink: Dynamic Iceberg Sink: Add HashKeyGenerator / RowDataEvolver / TableUpdateOperator #13277
Conversation
This change adds the following components for the Flink Dynamic Iceberg Sink:
*** HashKeyGenerator
A hash key generator which will be used in the DynamicIcebergSink class (next PR) to
implement one of Iceberg's DistributionModes (NONE, HASH, RANGE).
The HashKeyGenerator is responsible for creating the appropriate hash key for Flink's keyBy
operation. The hash key is generated depending on the user-provided DynamicRecord and the table
metadata. Under the hood, we maintain a set of Flink {@link KeySelector}s which implement the
appropriate Iceberg {@link DistributionMode}. For every table, we randomly select a consistent
subset of writer subtasks which receive data via their associated keys, depending on the chosen
DistributionMode.
Caching ensures that a new key selector is also created when the table metadata (e.g. schema,
spec) or the user-provided metadata changes (e.g. distribution mode, write
parallelism).
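The subtask-selection behavior described above can be sketched without any Flink dependencies. This is an illustrative, simplified model, not the PR's actual code: class and method names are made up, and the real implementation works with Flink KeySelectors rather than plain integers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Simplified sketch of picking a consistent subset of writer subtasks per
// table and routing records into it by hash. All names are illustrative.
public class HashKeySketch {

  // Pick a pseudo-random but consistent subset of writer subtasks for a table.
  // Seeding with the table name means every generator instance picks the same
  // subset for the same table.
  public static List<Integer> selectSubtasks(
      String tableName, int maxParallelism, int writeParallelism) {
    List<Integer> all = new ArrayList<>();
    for (int i = 0; i < maxParallelism; i++) {
      all.add(i);
    }
    Collections.shuffle(all, new Random(tableName.hashCode()));
    return all.subList(0, writeParallelism);
  }

  // HASH distribution: route by the record's partition-key hash into the subset.
  public static int subtaskFor(Object partitionKey, List<Integer> subtasks) {
    int h = Math.floorMod(partitionKey.hashCode(), subtasks.size());
    return subtasks.get(h);
  }
}
```

Because the shuffle is seeded deterministically, two independent calls for the same table resolve to the same subtask subset, which is what makes the keyBy routing stable across operator instances.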
*** RowDataEvolver
RowDataEvolver is responsible for changing the input RowData to make it compatible
with the target schema. This is needed when:
1. The input schema has fewer fields than the target schema.
2. The table types are wider than the input types.
3. The field order differs between the source and target schema.
The resolution is as follows:
In the first case, we add a null value for the missing field (if the field is optional).
In the second case, we convert the input field's data to the wider type, e.g. int (input type) => long (table type).
In the third case, we rearrange the input data to match the field order of the target table.
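The three cases can be illustrated with a dependency-free sketch. This is not the PR's RowDataEvolver (which operates on Flink RowData and Iceberg schemas); the names and the map-based record representation here are purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified sketch of the three evolution cases: null-filling a missing
// optional field, widening int -> long, and reordering to the target schema.
public class RowEvolveSketch {

  public static List<Object> evolve(
      Map<String, Object> input, List<String> targetFields, Map<String, Class<?>> targetTypes) {
    List<Object> out = new ArrayList<>();
    // Case 3: iterate in target-schema order, so output field order matches it.
    for (String field : targetFields) {
      Object value = input.get(field);
      if (value == null) {
        out.add(null); // case 1: missing (optional) field becomes null
      } else if (targetTypes.get(field) == Long.class && value instanceof Integer) {
        out.add(((Integer) value).longValue()); // case 2: widen int -> long
      } else {
        out.add(value);
      }
    }
    return out;
  }
}
```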
*** DynamicUpdateOperator
A dedicated operator for updating the schema / spec of the table associated with
a DynamicRecord.
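A rough model of the operator's role, with all names invented for illustration (the real operator works with Iceberg Schema / PartitionSpec objects and a TableUpdater, not integer ids):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: per incoming record, compare the record's schema id
// against the table's last-seen one and trigger an update only on mismatch.
public class TableUpdateSketch {
  private final Map<String, Integer> currentSchemaIds = new HashMap<>();

  // Returns true when the table metadata had to be updated for this record.
  public boolean maybeUpdate(String tableName, int recordSchemaId) {
    Integer current = currentSchemaIds.get(tableName);
    if (current != null && current == recordSchemaId) {
      return false; // table already matches the record's schema
    }
    currentSchemaIds.put(tableName, recordSchemaId); // apply the update
    return true;
  }
}
```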
private DynamicSinkUtil() {}

static List<Integer> getEqualityFieldIds(List<String> equalityFields, Schema schema) {
It is a bit strange to me to ignore the empty equalityFields.
Maybe a javadoc to highlight this?
What do you mean by ignoring empty equalityFields? There may be none defined.
If I understand correctly, if the equalityFields is empty, then we fall back to the Schema defined stuff.
I feel that if (equalityFields==null) then Schema is natural, but handling empty list as not set is a bit strange.
Minimally we need to document it
Can we have at least a javadoc for this?
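The fallback the reviewers discuss can be sketched without Iceberg dependencies. This is an illustrative stand-in, not the merged method: the real code resolves names against an Iceberg Schema, while here a plain map plays that role.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the discussed behavior: a null OR empty equalityFields list is
// treated as "not set" and falls back to the schema-defined identifier field
// ids. All names are illustrative.
public class EqualityFieldsSketch {

  public static Set<Integer> getEqualityFieldIds(
      List<String> equalityFields,
      Map<String, Integer> schemaFieldIds,
      Set<Integer> identifierFieldIds) {
    if (equalityFields == null || equalityFields.isEmpty()) {
      // Fallback: nothing requested, use the schema's identifier fields.
      return identifierFieldIds;
    }
    Set<Integer> ids = new LinkedHashSet<>();
    for (String name : equalityFields) {
      Integer id = schemaFieldIds.get(name);
      if (id == null) {
        throw new IllegalArgumentException("Field " + name + " not found in schema");
      }
      ids.add(id);
    }
    return Collections.unmodifiableSet(ids);
  }
}
```

Documenting exactly this empty-list branch in a javadoc is what the thread converges on.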
Catalog catalog = catalogLoader.loadCatalog();
this.updater =
    new TableUpdater(
        new TableMetadataCache(catalog, cacheMaximumSize, cacheRefreshMs), catalog);
Could we share the TableMetadataCache in a static way, so it is shared between the operator instances?
Yes, possible. Let me look into it.
I would like to defer this change to a follow-up. There are some challenges with regard to concurrent get / put operations on the cache.
  }
}

case RANGE:
Will the RANGE distribution mode work with the dyntable sink?
Range needed statistics collection and complicated infrastructure to work. It relied on Partitioners to direct the records to the correct writer.
For now we fall back to HASH. Adding support for RANGE in the Dynamic Sink will be a follow-up.
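The interim behavior described here (RANGE routing like HASH until dedicated range support lands) could look roughly like the following. The enum values mirror Iceberg's DistributionMode names, but this class itself is an illustrative sketch, not the PR's code.

```java
// Sketch of falling back from RANGE to HASH in the dynamic sink, since RANGE
// needs statistics collection and custom partitioners that are not wired up yet.
public class DistributionFallback {

  // Mirrors Iceberg's DistributionMode names; illustrative only.
  public enum Mode {
    NONE,
    HASH,
    RANGE
  }

  public static Mode effectiveMode(Mode requested) {
    if (requested == Mode.RANGE) {
      return Mode.HASH; // follow-up work will add true range distribution
    }
    return requested;
  }
}
```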
private final Integer specId;
private final Schema schema;
private final PartitionSpec spec;
private final List<String> equalityFields;
Perhaps. Need to double-check.
So far, we are always using List for equality fields (IcebergSink V1 / V2). It doesn't seem though that this is required. Updated everywhere. See 0412697.
There is only a single place where this touches a public API.
Maybe for backward compatibility we could use Collection there. Or we just deprecate the old methods, add a new one (only ones which are currently used)
Seems like a good change to me, but I would ask @stevenzwu. He might know more why the equalityFields are stored in a List instead of a Set.
  FileFormat format,
  Map<String, String> writeProperties,
- List<Integer> equalityFieldIds,
+ Set<Integer> equalityFieldIds,
This is public API, since we don't have annotation on this.
Ok. I've reverted this change in favor of an internal conversion. See 05fd580
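The "internal conversion" approach mentioned here (keep the List-based public signature, deduplicate into a Set internally) can be sketched as follows. This is an illustrative pattern, not the actual change in 05fd580:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch: preserve the List-based public API for compatibility while using a
// Set internally so duplicate equality field ids are collapsed once, up front.
public class EqualityIdsCompat {
  private final Set<Integer> equalityFieldIds;

  // Public API keeps List<Integer>; conversion happens at the boundary.
  public EqualityIdsCompat(List<Integer> equalityFieldIds) {
    this.equalityFieldIds =
        equalityFieldIds == null
            ? Collections.emptySet()
            : Collections.unmodifiableSet(new LinkedHashSet<>(equalityFieldIds));
  }

  public Set<Integer> equalityFieldIds() {
    return equalityFieldIds;
  }
}
```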
Merged to main.

Thank you for reviewing / merging @pvary 🙏

I'll prepare the backports.
…TableUpdateOperator (apache#13277)
…volver / TableUpdateOperator to Flink 1.19 / 1.20 (apache#13303) backports apache#13277