Conversation

@szehon-ho
Member

@szehon-ho szehon-ho commented Feb 16, 2021

-- Add flag "write.metadata.sorted.metrics.default" which defaults to "full"
-- Fix unrelated bug in configuring nested Orc Metrics
-- Welcome feedback how to avoid propagation of sortOrder through AppenderFactories

@szehon-ho
Member Author

szehon-ho commented Feb 16, 2021

Change Background:

Following a suggestion from @aokolnychyi: we find that users do not tend to fine-tune column metrics. The default truncate(16) does not do much for certain sortable types and reduces the benefits of predicate pruning. The cost of storing full metrics should be small, as the number of sorted columns will generally be small.

@aokolnychyi
Contributor

Thanks for the PR, @szehon-ho! Let me take a look now.

public static final String METRICS_MODE_COLUMN_CONF_PREFIX = "write.metadata.metrics.column.";
public static final String DEFAULT_WRITE_METRICS_MODE = "write.metadata.metrics.default";
public static final String DEFAULT_WRITE_METRICS_MODE_DEFAULT = "truncate(16)";
public static final String SORTED_COL_DEFAULT_METRICS_MODE = "write.metadata.sorted.metrics.default";
Contributor

@aokolnychyi aokolnychyi Feb 17, 2021

Actually, I think we should simplify it a bit. The use case I was talking about is when the user configures the default sort mode as none or counts but creates a table with a sort order. In that case, we should promote the metrics for sort columns to be at least truncate(16) unless the user sets a mode for sort columns explicitly. It is probably too dangerous to promote to full as the values may be too long. I guess truncate(16) is a reasonable default for sort columns and it will apply only to string and binary columns. Longs/integers won't be affected.

Internally, we may want to change the default value to counts instead of truncate(16) for tables with many columns as we have a lot of tables with 100+ columns and we don't want tons of unnecessary metadata. But I am not sure that the community wants to do the same.
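The promotion rule described above could be sketched as a small pure function. This is a hypothetical illustration only; the class, method, and mode names are stand-ins for Iceberg's actual MetricsConfig logic:

```java
import java.util.Map;
import java.util.Set;

public class MetricsPromotionSketch {
  // Modes weaker than truncate(16) for pruning purposes
  static final Set<String> WEAK_MODES = Set.of("none", "counts");

  // Effective metrics mode for a sort column: promote to truncate(16) only
  // when the table default is weaker and the user has not configured the
  // column explicitly.
  static String effectiveMode(String defaultMode, Map<String, String> columnModes, String column) {
    String explicit = columnModes.get(column);
    if (explicit != null) {
      return explicit; // explicit user config always wins
    }
    return WEAK_MODES.contains(defaultMode) ? "truncate(16)" : defaultMode;
  }
}
```

The explicit per-column setting taking precedence over promotion matches the "unless the user sets a mode for sort columns explicitly" condition in the comment above.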


Contributor

I'd also say the promotion should be implicit and we don't need extra table properties.

Member

I think I'm also in favor of just making the change without a flag to configure. Just to keep configuration one parameter smaller: do we really have many use cases where truncate(16) isn't the right thing to do? And if it isn't, we can let users manually reconfigure those columns individually as needed.

Contributor

@szehon-ho, could you update the PR to match the behavior in this comment, please?

Member Author

Yes I'll take a look, thanks for the feedback guys

Member Author

Recap, if I understand correctly: if the default mode is weaker than truncate(16), then sort columns are promoted to truncate(16). And no new table property (the user can override the sort column's metrics property itself).

Member

That’s just my suggestion, I can always be convinced otherwise too 😊

Member Author

No, you have a point: full is unnecessary if someone sorts by a huge text column, for example, and there's already a good way to override the config. Updated the PR and changed the title to reflect this.

Contributor

I agree with this behavior and I also would not add a new table property.

@szehon-ho szehon-ho changed the title Auto promote sorted column metrics to full Auto promote sorted column metrics to truncate(16) Feb 18, 2021
@aokolnychyi
Contributor

Let me do another round now.

@aokolnychyi
Contributor

I see that the current approach tries to promote the sort columns on read, which is one way of doing this. I think I'd prefer the approach we took with validateReferencedColumns that does its validation during table creation and in a few other relevant places. If we do it that way, we don't need to modify any ORC and Parquet code and the scope of the change will be really small.

Basically, we could modify TableMetadata#newTableMetadata as this:

    MetricsConfig metricsConfig = MetricsConfig.fromProperties(properties);
    metricsConfig.validateReferencedColumns(schema);
    MetricsConfig freshMetricsConfig = metricsConfig.copyWithPromotedSortColumns(freshSortOrder);

    Map<String, String> freshProperties = Maps.newHashMap(properties);
    freshProperties.putAll(freshMetricsConfig.toProperties());

That will need two new methods in MetricsConfig (let's not make them public):

  MetricsConfig copyWithPromotedSortColumns(SortOrder sortOrder) {
    // create a copy
    // check the default mode
    // if none or count -> iterate through the sort order fields and promote as needed if not set by the user
  }

  Map<String, String> toProperties() {
    // map into table props
  }

What are your thoughts, @RussellSpitzer @rdblue @holdenk @shardulm94 @pvary?
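A self-contained sketch of what these two helpers might look like, simplified to plain maps and column name strings. The real methods would take Iceberg's SortOrder and return a MetricsConfig; the class name, constants, and simplified signatures here are illustrative assumptions:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetricsConfigSketch {
  static final String DEFAULT_KEY = "write.metadata.metrics.default";
  static final String COLUMN_PREFIX = "write.metadata.metrics.column.";

  // Mirrors the proposed copyWithPromotedSortColumns + toProperties: if the
  // default mode is none/counts, set truncate(16) for each sort column that
  // is not configured explicitly, and return the updated table properties.
  static Map<String, String> promoteSortColumns(Map<String, String> props, List<String> sortColumns) {
    Map<String, String> result = new HashMap<>(props);
    String defaultMode = props.getOrDefault(DEFAULT_KEY, "truncate(16)");
    if (defaultMode.equals("none") || defaultMode.equals("counts")) {
      for (String column : sortColumns) {
        // putIfAbsent keeps any mode the user set explicitly
        result.putIfAbsent(COLUMN_PREFIX + column, "truncate(16)");
      }
    }
    return result;
  }
}
```

One consequence of writing the result back into table properties, discussed further below, is that a later reader cannot tell a promoted setting apart from one the user configured.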

@RussellSpitzer
Member

Would that also catch alters, @aokolnychyi? I don't think that's unreasonable, since it does make sense to keep it off the read path if possible ...

@aokolnychyi
Contributor

We will catch altering sort order separately just like we catch a few other places for validateReferencedColumns.

@rdblue
Contributor

rdblue commented Feb 26, 2021

I don't think I would approach the problem as @aokolnychyi is suggesting. I thought that this would promote when accessing the metrics. So basically MetricsConfig.fromProperties becomes MetricsConfig.forTable so that it can be based on both the properties and the sort order for a table. I think that would be a reasonably small change.

The advantage of that approach is that the sort order doesn't make changes to table properties. Those are always configured by users.

@aokolnychyi
Contributor

aokolnychyi commented Feb 26, 2021

So basically MetricsConfig.fromProperties becomes MetricsConfig.forTable so that it can be based on both the properties and the sort order for a table.

The problem is that we build MetricsConfig in places where we don't have access to the table object. We made a recent change to make BaseTable serializable, but we would need to modify quite a few places to pass it around. Also, I am not sure about a potential performance penalty. We do serialize things like the spec and props, so maybe it won't be a big deal to serialize the complete table object, but someone has to test that. We could pass the sort order around, but even that seemed like major work to redo in all places. Plus, it is easy to miss something. For example, the current change does not cover Flink or our import code in SparkTableUtil, yet all tests are green.

The advantage of that approach is that the sort order doesn't make changes to table properties. Those are always configured by users.

I think having something like toTableProperties on MetricsConfig will make it possible to migrate to column ids instead of names in the future. If we want to use column ids, then modifying table props seems inevitable to me.

@aokolnychyi
Contributor

aokolnychyi commented Feb 26, 2021

But I do get the concern about modifying table properties. However, it looks like there will be a few places where we will need to do that anyway (e.g. schema evolution).

@rdblue
Contributor

rdblue commented Feb 26, 2021

My concern isn't modifying table properties specifically. We need to do that so that we update the properties to reflect the user's intent. If we have a setting for a column, renaming the property to carry it through a column rename is preserving the user's intent. But if we update table properties then we are potentially losing what the user chose to do.

For example, what do we do when the sort field is removed and there is a metrics config property? We can't remove a truncate(16) setting because we don't know whether it was added when the field was added to the sort, or whether it was specifically set by the table owner. Similarly, we have cases where the owner may set a field to counts and then add it to the sort. I think the table owner's config should take precedence, so we don't replace the property; but then how does the owner signal that it should be defaulted instead? Last, how do we handle tables that currently have sort orders? If we base metrics on the current sort order, then everything starts working automatically. If we don't, then we would need to trigger a commit to update the properties.

Even if we don't pass the table around, I think we should definitely pass the sort order in to get the metrics config. It is probably not the time to refactor and pass the table, but we can pass the sort order like we do the other table config.

@aokolnychyi
Contributor

For example, what do we do when the sort field is removed and there is a metrics config property?

Yeah, that's the point: we won't be able to differentiate if we modified the props. I am convinced now.
If we have enough confidence that this feature will be useful, I am OK with modifying all the places where we write.

I took a look at how hard it will be to move to Table, as that would simplify our life in the future. Since BaseTable uses a proxy for serializing, we only serialize FileIO and the metadata location. We will need to read the metadata file back on each executor. In addition, FileIO may contain a full Hadoop conf; that's why we broadcast FileIO and EncryptionManager in Spark. We could consider broadcasting Table instead of FileIO and EncryptionManager. However, we will still have to read the metadata file from each executor. Is that a big deal?

I am fine either way.

@rdblue
Contributor

rdblue commented Feb 26, 2021

It sounds like it would take some work to serialize Table. I think we'd want to update it so that at least one version of TableMetadata is also carried through to avoid that read on every executor. In the long run, that's probably a good thing to do. But for now we can make easier progress serializing things separately.

Eventually, I think we should broadcast the table and FileIO together to take care of this issue for all of the things that might be expensive to serialize.

@szehon-ho
Member Author

szehon-ho commented Feb 27, 2021

Thanks guys for the time and discussion (learning a lot on the side).

If I understand correctly, it looks like metrics promotion on read is the way to go due to its simplicity, especially since alter table then does not have to alter auto-promoted metrics properties. But in this approach we have to manually pass sortOrder in to all the write paths, and I missed it in the new SparkTableUtil and in Flink, which I can add.

@szehon-ho
Member Author

Edit: and for now, as Table is hard to serialize, we can have MetricsConfig.forSortOrder(properties, sortOrder) to make progress, since SortOrder is already serializable.

@szehon-ho
Member Author

Hi @rdblue @aokolnychyi , so I propagated the SortOrder through the various writers as discussed, including the ones I missed before.

I spent some time verifying it by unit tests (TestMergingMetrics gives good coverage for Flink/Generic/Spark AppenderFactory, and TestSparkDataWrite is an end-to-end test on Spark side). When you get a chance, let me know what you think, thanks again.

@szehon-ho szehon-ho changed the title Auto promote sorted column metrics to truncate(16) Core: Auto promote sorted column metrics to truncate(16) Mar 2, 2021
@aokolnychyi
Contributor

Sorry, I was distracted by other things. Let me take a look at this PR tomorrow.

@aokolnychyi
Contributor

I think we have a similar effort in #2214.

@szehon-ho
Member Author

Thanks for looking, @aokolnychyi. Oh yes, what a coincidence. If that one could be merged, then I can rebase and just make the changes to the metrics part.

@szehon-ho
Member Author

Split the ORC nested metric mode fix out into a new PR: #2977, which should make this change even smaller.

@aokolnychyi
Contributor

Thanks, @szehon-ho! Let me take a look now.

Contributor

@aokolnychyi aokolnychyi left a comment

Looks like this one is converging. I did one more pass.

Let's get #2977 in and then rebase this one.

} else {
return sortOrder.fields().stream()
.map(SortField::sourceId)
.map(sid -> sortOrder.schema().findColumnName(sid))
Contributor

I am not sure it is always correct as not all transforms are order preserving.

For example, if we sort by bucket(id, 8), it does not mean the data is sorted by id.
That's why we added Transform#preservesOrder.

I think you have to filter this stream to include only sort fields whose transforms are order preserving.

Contributor

After another look with fresh eyes, I don't think my statement above is accurate. If we sort using an order preserving transform, it means the source columns are somehow sorted (but may not be perfectly sorted).

For example, sorting by month(date), means our dates are sorted across months but not necessarily within each month. Still, it probably makes sense to promote such columns. Identity and truncate transforms are also order preserving.

@RussellSpitzer @jackye1995, thoughts?
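The filtering discussed in this thread could be sketched as follows. This is an illustrative, self-contained stand-in: the SortField record and the string-based preservesOrder check are simplifications of Iceberg's SortField/Transform#preservesOrder API, not the actual implementation:

```java
import java.util.List;
import java.util.stream.Collectors;

public class OrderPreservingSketch {
  // Simplified stand-in for Iceberg's SortField: a source column plus the
  // name of the transform applied to it in the sort order.
  record SortField(String column, String transform) {}

  // Stand-in for Transform#preservesOrder: bucket hashes values, so source
  // column order is lost; identity, truncate, and date/time transforms keep
  // at least a coarse ordering of the source values.
  static boolean preservesOrder(String transform) {
    return !transform.startsWith("bucket");
  }

  // Only sort fields with order-preserving transforms are candidates for
  // metrics promotion.
  static List<String> promotableColumns(List<SortField> fields) {
    return fields.stream()
        .filter(f -> preservesOrder(f.transform()))
        .map(SortField::column)
        .collect(Collectors.toList());
  }
}
```

For example, a table sorted by bucket(id, 8), month(ts) would promote metrics only for ts, not for id.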

Member Author

Yes, that makes sense. I wonder a bit about the use case for sorting by bucket(id, 8), but it does not make much sense to auto-promote such columns.

Contributor

The implementation here looks correct to me now.

@aokolnychyi
Contributor

@szehon-ho, the other PR is in. Could you rebase this one?

@szehon-ho
Member Author

Rebased, and still working through some of the review comments. I need to take a closer look at the order-preserving transforms.

@szehon-ho
Member Author

@aokolnychyi, the comments should all be addressed now, whenever you have time for another look. Thanks again.

.asc("stringCol")
.asc("dateCol").build();
PartitionSpec spec = PartitionSpec.unpartitioned();
Table table = TestTables.create(tableDir, "test", SIMPLE_SCHEMA, spec, sortOrder, FORMAT_V2);
Contributor

Why does this test create a v2 table? Won't this work with v1 as well?

Member Author

Originally I wanted to avoid increasing the test runtime too much, as metrics seemed pretty independent of formatVersion.

I changed the tests to be parameterized to run with both v1 and v2; let me know if you prefer the original.

Contributor

@rdblue rdblue left a comment

I left a couple comments, but overall I think this looks ready. @aokolnychyi can you take another look?

Contributor

@karuppayya karuppayya left a comment

Left a few minor comments. Thanks, @szehon-ho, for working on this.

public static Object[][] parameters() {
return new Object[][] {
{FileFormat.ORC},
{FileFormat.PARQUET}
Contributor

Should we include avro too?

Member Author

Avro returns null metrics, so it is usually skipped in metrics tests.
https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/avro/AvroMetrics.java


@Before
public void setupTable() throws Exception {
this.tableDir = temp.newFolder();
Contributor

nit: tableDir can be local to this method

Member Author

Done

@szehon-ho
Member Author

@rdblue and @karuppayya, thanks for taking a look! Addressed the comments; I hope I understood them correctly. For the original format V2 hardcoding in the test, I parameterized the new tests with v1 and v2; let me know if that's not what we want to do.

@aokolnychyi
Contributor

Sorry, I was off last week. I'll take a look now.

Contributor

@aokolnychyi aokolnychyi left a comment

Looks solid to me too. I think we need to use the new method in WriteBuilder in Avro.
I'd also slightly tweak the javadoc.

withSpec(table.spec());
setAll(table.properties());
metricsConfig(MetricsConfig.fromProperties(table.properties()));
metricsConfig(MetricsConfig.forTable(table));
Contributor

Do we need to do the same in Avro WriteBuilder too?
I don't think we use that method right now but should make sense for consistency.
We already handle that for Parquet.

Member Author

Done

}
}

/**
Contributor

Can we add a sentence to each method to describe what they do?

Creates a metrics config from table properties.
Creates a metrics config for a table.

Contributor

Super nit: Also, case-sensitivity in param docs seems inconsistent. Can we fix that? I think we usually use lower case letters unless it is a name.

Member Author

Done, and fixed the case sensitivity. I also removed the return type documentation, as it's now a bit redundant with this javadoc; let me know if you prefer to keep it.

}

// First set sorted column with sorted column default (can be overridden by user)
MetricsMode sortedColDefaultMode = sortedColumnDefaultMode(spec.defaultMode);
Contributor

Looks clean!

} else {
return sortOrder.fields().stream()
.map(SortField::sourceId)
.map(sid -> sortOrder.schema().findColumnName(sid))
Contributor

The implementation here looks correct to me now.

Szehon Ho and others added 11 commits September 14, 2021 14:20
-- Fix unrelated bug in configuring nested Orc Metrics
-- Welcome feedback how to avoid propagation of sortOrder through AppenderFactories
-- Promote sort columns if default is counts || none to truncate(16) if columns are not explicitly configured.
-- Extend to rest of writers
-- Add more unit tests for these writers
@aokolnychyi
Contributor

Looks great to me. Thanks for the work, @szehon-ho! I'll merge when tests pass.
