@ConeyLiu
Contributor

@ConeyLiu ConeyLiu commented Oct 8, 2021

Arrow uses 16 bytes per value for every decimal vector. However, depending on the precision, decimal data may be stored as int or long in the Parquet file. For int/long-backed decimal data we only need an int/long Arrow vector. This could improve performance a lot when we do a full table scan on the store_sales table (1 TB data scale).

[Screenshot, 2021-10-08 7:55:15 PM]

The existing unit tests should cover the vectorized read case. I can add extra unit tests if needed.
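
To make the size difference concrete, here is a minimal Java sketch of the per-value widths involved. It assumes the writer picks the narrowest Parquet physical type the precision allows (INT32 up to precision 9, INT64 up to 18, a fixed-length byte array beyond that). `DecimalWidthSketch` and `parquetWidthBytes` are hypothetical names, not Iceberg or Arrow API.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of Iceberg or Arrow.
class DecimalWidthSketch {
  // Width of one value in Arrow's 128-bit decimal vector.
  static final int ARROW_DECIMAL128_BYTES = 16;

  // Bytes per value in the Parquet file, ASSUMING the writer uses the
  // narrowest physical type permitted for the given precision.
  static int parquetWidthBytes(int precision) {
    if (precision < 1 || precision > 38) {
      throw new IllegalArgumentException("precision must be in [1, 38]: " + precision);
    }
    if (precision <= 9) {
      return Integer.BYTES; // INT32-backed decimal
    }
    if (precision <= 18) {
      return Long.BYTES; // INT64-backed decimal
    }
    // Fixed-length byte array: bits of the largest unscaled value plus a sign bit,
    // rounded up to whole bytes.
    BigInteger maxUnscaled = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
    return (maxUnscaled.bitLength() + 1 + 7) / 8;
  }

  public static void main(String[] args) {
    for (int precision : new int[] {7, 15, 38}) {
      System.out.printf(
          "precision %d: %d bytes in Parquet, %d bytes in a decimal vector%n",
          precision, parquetWidthBytes(precision), ARROW_DECIMAL128_BYTES);
    }
  }
}
```

Under these assumptions an int-backed decimal occupies a quarter, and a long-backed decimal half, of the 16 bytes a decimal vector reserves per value.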

@ConeyLiu
Contributor Author

ConeyLiu commented Oct 8, 2021

Hi @rdblue @jackye1995 @nastra, could you help review this? Thanks a lot.

@nastra
Contributor

nastra commented Oct 8, 2021

@ConeyLiu I'll try to review this next week. In the meantime could you please run some benchmarks for this PR and attach the results here?

So in your case it would be: ./gradlew :iceberg-spark3:jmh -PjmhIncludeRegex=VectorizedReadFlatParquetDataBenchmark -PjmhOutputPath=benchmark/vectorized-read-flat-parquet-data-result.txt

@ConeyLiu
Contributor Author

ConeyLiu commented Oct 9, 2021

before

Benchmark                                                                 Mode  Cnt   Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   1.362 ± 0.092   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   1.451 ± 0.222   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  10.508 ± 0.893   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5   8.165 ± 1.205   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   2.880 ± 0.241   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   2.610 ± 0.152   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   2.801 ± 0.376   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   2.469 ± 0.155   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   2.286 ± 0.136   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   2.580 ± 0.229   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   2.684 ± 0.186   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   2.526 ± 0.767   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   4.367 ± 0.207   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5   3.964 ± 0.334   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   1.434 ± 0.120   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   2.015 ± 0.351   s/op

after

Benchmark                                                                 Mode  Cnt   Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   1.578 ± 0.045   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   1.757 ± 0.338   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  10.965 ± 1.610   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5   8.143 ± 0.300   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   2.791 ± 0.078   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   2.623 ± 0.171   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   2.570 ± 0.065   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   2.657 ± 0.692   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   2.651 ± 0.115   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   2.547 ± 0.290   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   2.305 ± 0.124   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   2.540 ± 0.076   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   4.176 ± 0.222   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5   4.231 ± 1.574   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   1.597 ± 0.415   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   1.720 ± 0.052   s/op

I also added a decimal benchmark:

before

Benchmark                                                                        Mode  Cnt   Score   Error  Units
VectorizedReadParquetDecimalBenchmark.readDecimalsIcebergVectorized5k              ss    5  10.623 ± 1.204   s/op
VectorizedReadParquetDecimalBenchmark.readDecimalsSparkVectorized5k                ss    5   8.322 ± 1.146   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsIcebergVectorized5k     ss    5   1.593 ± 0.199   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsSparkVectorized5k       ss    5   1.597 ± 0.064   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsIcebergVectorized5k    ss    5   5.836 ± 0.406   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsSparkVectorized5k      ss    5   3.285 ± 0.230   s/op

after

Benchmark                                                                        Mode  Cnt   Score   Error  Units
VectorizedReadParquetDecimalBenchmark.readDecimalsIcebergVectorized5k              ss    5  10.988 ± 2.355   s/op
VectorizedReadParquetDecimalBenchmark.readDecimalsSparkVectorized5k                ss    5   7.511 ± 0.484   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsIcebergVectorized5k     ss    5   1.748 ± 0.174   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsSparkVectorized5k       ss    5   1.631 ± 0.325   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsIcebergVectorized5k    ss    5   3.623 ± 0.557   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsSparkVectorized5k      ss    5   3.203 ± 0.209   s/op
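
One way to read the decimal numbers above is a small sketch that computes the before/after ratio and checks whether the score ± error ranges overlap. Treating the Error column as a symmetric range is a simplification, and the class and method names are made up for illustration.

```java
// Hypothetical helper for comparing two JMH rows; not part of the PR.
class BenchmarkDeltaSketch {
  // Ratio of the old score to the new score (s/op, so larger means faster after).
  static double speedup(double before, double after) {
    return before / after;
  }

  // True if [scoreA - errA, scoreA + errA] and [scoreB - errB, scoreB + errB] intersect.
  static boolean rangesOverlap(double scoreA, double errA, double scoreB, double errB) {
    return scoreA - errA <= scoreB + errB && scoreB - errB <= scoreA + errA;
  }

  public static void main(String[] args) {
    // readLongBackedDecimalsIcebergVectorized5k: 5.836 ± 0.406 before, 3.623 ± 0.557 after.
    System.out.printf("long-backed speedup: %.2fx, ranges overlap: %b%n",
        speedup(5.836, 3.623), rangesOverlap(5.836, 0.406, 3.623, 0.557));
    // readDecimalsIcebergVectorized5k: 10.623 ± 1.204 before, 10.988 ± 2.355 after.
    System.out.printf("fixed-length speedup: %.2fx, ranges overlap: %b%n",
        speedup(10.623, 10.988), rangesOverlap(10.623, 1.204, 10.988, 2.355));
  }
}
```

With the figures posted here, only the long-backed Iceberg case shows a change whose ranges do not overlap; the other Iceberg rows sit within each other's error ranges.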

@rdblue
Contributor

rdblue commented Oct 10, 2021

Running CI

@rdblue
Contributor

rdblue commented Oct 10, 2021

@ConeyLiu, this looks like an improvement, but I think that it relies too heavily on config properties that must be set the same way everywhere rather than detecting what should be done from inputs.
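
A minimal sketch of what "detecting from inputs" could look like, assuming the decision is driven by the physical type of the Parquet column being read. The enums and `storageFor` are hypothetical stand-ins, not the classes used in this PR.

```java
// Hypothetical sketch: choose decimal storage from the file's column type,
// not from a configuration property that every caller must set consistently.
class ReaderSelectionSketch {
  // Stand-in for the physical type recorded in the Parquet column metadata.
  enum PhysicalType { INT32, INT64, FIXED_LEN_BYTE_ARRAY, BINARY }

  // Stand-in for the kind of vector a reader would allocate.
  enum DecimalStorage { INT_BACKED, LONG_BACKED, BYTES_BACKED }

  static DecimalStorage storageFor(PhysicalType type) {
    switch (type) {
      case INT32:
        return DecimalStorage.INT_BACKED;
      case INT64:
        return DecimalStorage.LONG_BACKED;
      case FIXED_LEN_BYTE_ARRAY:
      case BINARY:
        return DecimalStorage.BYTES_BACKED;
      default:
        throw new IllegalArgumentException("Unsupported physical type: " + type);
    }
  }

  public static void main(String[] args) {
    for (PhysicalType type : PhysicalType.values()) {
      System.out.println(type + " -> " + storageFor(type));
    }
  }
}
```

Because the choice depends only on what the file declares, two readers of the same file cannot disagree the way two differently configured readers could.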

@ConeyLiu
Contributor Author

Thanks @rdblue for the review. I will do some refactoring.

@ConeyLiu
Contributor Author

ConeyLiu commented Oct 13, 2021

Hi @rdblue, please take another look, thanks a lot. I have also updated the benchmark results:

Benchmark                                                                 Mode  Cnt  Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5  1.320 ± 0.045   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5  1.348 ± 0.044   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  8.095 ± 0.308   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5  8.013 ± 0.412   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5  2.697 ± 0.113   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5  2.224 ± 0.071   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5  2.135 ± 0.030   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5  2.233 ± 0.765   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5  2.223 ± 0.043   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5  2.479 ± 0.096   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5  2.640 ± 0.041   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5  2.498 ± 0.116   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5  4.043 ± 0.486   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5  3.809 ± 0.163   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5  1.504 ± 0.298   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5  1.856 ± 0.295   s/op

Benchmark                                                                        Mode  Cnt  Score   Error  Units
VectorizedReadParquetDecimalBenchmark.readDecimalsIcebergVectorized5k              ss    5  7.822 ± 0.441   s/op
VectorizedReadParquetDecimalBenchmark.readDecimalsSparkVectorized5k                ss    5  7.916 ± 0.908   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsIcebergVectorized5k     ss    5  1.655 ± 0.130   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsSparkVectorized5k       ss    5  1.539 ± 0.104   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsIcebergVectorized5k    ss    5  3.468 ± 0.471   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsSparkVectorized5k      ss    5  3.256 ± 0.587   s/op

break;
case UUID:
case FIXED_WIDTH_BINARY:
case FIXED_LENGTH_DECIMAL:
Contributor Author

@ConeyLiu ConeyLiu Oct 13, 2021

Here I combined FIXED_WIDTH_BINARY into this case so that it uses fixedSizeBinaryBatchReader instead of fixedWidthTypeBinaryBatchReader. fixedWidthTypeBinaryBatchReader uses VarBinaryVector, but it should use FixedSizeBinaryVector. Please correct me if I am wrong, @nastra.

Contributor

yep those changes for FIXED_WIDTH_BINARY are correct. Note that support for FIXED is being added as part of #3029

Contributor

However, I'm not sure we can change the handling of FIXED_LENGTH_DECIMAL. See also this Javadoc:

/**
* Method for reading a batch of decimals backed by fixed length byte array parquet data type. Arrow stores all
* decimals in 16 bytes. This method provides the necessary padding to the decimals read. Moreover, Arrow interprets
* the decimals in Arrow buffer as little endian. Parquet stores fixed length decimals as big endian. So, this method
* uses {@link DecimalVector#setBigEndian(int, byte[])} method so that the data in Arrow vector is indeed little
* endian.
*/
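
As an illustration of the padding and byte-order conversion this Javadoc describes, here is a standalone sketch in plain Java. It is not Arrow's implementation of `setBigEndian`; `toLittleEndian16` is a hypothetical helper that sign-extends a big-endian two's-complement value to 16 bytes and reverses the byte order.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical illustration; Arrow's DecimalVector does this work internally.
class PadDecimalSketch {
  static final int DECIMAL128_BYTES = 16;

  // Converts a big-endian two's-complement value (1 to 16 bytes) into a
  // 16-byte little-endian value, filling the unused high bytes with the sign.
  static byte[] toLittleEndian16(byte[] bigEndian) {
    if (bigEndian.length == 0 || bigEndian.length > DECIMAL128_BYTES) {
      throw new IllegalArgumentException("expected 1 to 16 bytes, got " + bigEndian.length);
    }
    byte[] out = new byte[DECIMAL128_BYTES];
    // Sign extension: 0xFF for negative values, 0x00 otherwise.
    byte pad = (byte) (bigEndian[0] < 0 ? 0xFF : 0x00);
    Arrays.fill(out, pad);
    // The least significant byte comes last in big endian and first in little endian.
    for (int i = 0; i < bigEndian.length; i++) {
      out[i] = bigEndian[bigEndian.length - 1 - i];
    }
    return out;
  }

  // Decodes a little-endian two's-complement value back into a BigInteger.
  static BigInteger fromLittleEndian(byte[] littleEndian) {
    byte[] reversed = new byte[littleEndian.length];
    for (int i = 0; i < littleEndian.length; i++) {
      reversed[i] = littleEndian[littleEndian.length - 1 - i];
    }
    return new BigInteger(reversed);
  }

  public static void main(String[] args) {
    BigInteger unscaled = BigInteger.valueOf(-12345);
    byte[] padded = toLittleEndian16(unscaled.toByteArray());
    System.out.println(fromLittleEndian(padded).equals(unscaled));
  }
}
```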

Contributor Author

I think valuesReader.getBuffer(typeWidth) should already return a little-endian buffer?

public ByteBuffer getBuffer(int length) {
  try {
    return valuesInputStream.slice(length).order(ByteOrder.LITTLE_ENDIAN);
  } catch (IOException e) {
    throw new ParquetDecodingException("Failed to read " + length + " bytes", e);
  }
}

It seems like we don't need to call DecimalVector#setBigEndian(int, byte[]).

Also, UUID (https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#uuid) is stored as big endian as well.
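
For reference while weighing this question, a small sketch of what `ByteBuffer.order` does in the JDK: it changes how multi-byte accessors such as `getInt` interpret the bytes, while the bytes themselves stay where they are. Whether that makes `setBigEndian` unnecessary in this code path is not something the sketch settles; `ByteOrderSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo of JDK ByteBuffer semantics; unrelated to Parquet's own classes.
class ByteOrderSketch {
  // Reads the first four bytes as an int using the given byte order.
  static int readInt(byte[] bytes, ByteOrder order) {
    return ByteBuffer.wrap(bytes).order(order).getInt(0);
  }

  // Reads the first raw byte; the configured order has no effect on single-byte reads.
  static byte firstByte(byte[] bytes, ByteOrder order) {
    return ByteBuffer.wrap(bytes).order(order).get(0);
  }

  public static void main(String[] args) {
    byte[] bytes = {0x01, 0x02, 0x03, 0x04};
    System.out.printf("big endian int:    0x%08X%n", readInt(bytes, ByteOrder.BIG_ENDIAN));
    System.out.printf("little endian int: 0x%08X%n", readInt(bytes, ByteOrder.LITTLE_ENDIAN));
    System.out.printf("first byte (either order): 0x%02X%n",
        firstByte(bytes, ByteOrder.LITTLE_ENDIAN));
  }
}
```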

public VectorizedArrowReader(
ColumnDescriptor desc,
Types.NestedField icebergField,
Types.NestedField originalIcebergField,
Contributor

@rdblue do we need to keep the original constructor without the originalIcebergField or could it be removed?

@ConeyLiu
Contributor Author

ConeyLiu commented Nov 4, 2021

@rdblue @nastra @pvary @kbendick Could you help to review when you have time? Thanks a lot.

@rdblue
Contributor

rdblue commented Mar 15, 2022

@ConeyLiu, could you rebase this? I'll take another look so we can get it in. Thanks!

@ConeyLiu
Contributor Author

Thanks, @rdblue @kbendick, code has been rebased. Please take a look when you are free.

Contributor

@singhpk234 singhpk234 left a comment

Thanks @ConeyLiu, for this change, I also stumbled on something similar recently.

@jackye1995
Contributor

Sorry for the delayed review, was focusing on getting 1.2 release out. I think it's mostly there, added a few comments.

@jackye1995
Contributor

@nastra could you also take another look?

@jackye1995 jackye1995 requested a review from nastra March 23, 2023 21:39
@ConeyLiu
Contributor Author

Thanks, @jackye1995 @zhongyujiang for the review. Please take another look when you are free.

Contributor

@nastra nastra left a comment

overall this LGTM, can we remove IntBackedDecimalBatchReader / LongBackedDecimalBatchReader / FixedLengthDecimalBatchReader / IntBackedDecimalPageReader / LongBackedDecimalPageReader?

@ConeyLiu could you please run the benchmark on your GH fork and post the link to it? I usually run benchmarks on my fork here and you should have the same action in your fork.

@ConeyLiu
Contributor Author

@nastra the benchmark results are here:

Benchmark                                                                  Mode  Cnt  Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readBigDecimalsIcebergVectorized5k    ss    5  9.512 ± 0.092   s/op
VectorizedReadFlatParquetDataBenchmark.readBigDecimalsSparkVectorized5k      ss    5  9.206 ± 0.145   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k          ss    5  2.482 ± 0.187   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k            ss    5  2.492 ± 0.139   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k       ss    5  5.154 ± 0.169   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k         ss    5  4.057 ± 0.079   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k        ss    5  4.546 ± 0.342   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k          ss    5  3.897 ± 0.113   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k         ss    5  3.803 ± 0.096   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k           ss    5  3.735 ± 0.185   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k       ss    5  4.021 ± 0.144   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k         ss    5  3.493 ± 0.095   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k          ss    5  4.056 ± 0.321   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k            ss    5  3.177 ± 0.214   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k        ss    5  5.387 ± 0.089   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k          ss    5  6.683 ± 0.130   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k     ss    5  2.283 ± 0.136   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k       ss    5  2.208 ± 0.091   s/op

Benchmark                                                                        Mode  Cnt   Score   Error  Units
VectorizedReadParquetDecimalBenchmark.readDecimalsIcebergVectorized5k              ss    5  12.878 ± 0.512   s/op
VectorizedReadParquetDecimalBenchmark.readDecimalsSparkVectorized5k                ss    5  11.764 ± 0.654   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsIcebergVectorized5k     ss    5   2.711 ± 0.214   s/op
VectorizedReadParquetDecimalBenchmark.readIntBackedDecimalsSparkVectorized5k       ss    5   2.698 ± 0.165   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsIcebergVectorized5k    ss    5   4.728 ± 0.310   s/op
VectorizedReadParquetDecimalBenchmark.readLongBackedDecimalsSparkVectorized5k      ss    5   5.356 ± 0.320   s/op

can we remove IntBackedDecimalBatchReader / LongBackedDecimalBatchReader / FixedLengthDecimalBatchReader / IntBackedDecimalPageReader / LongBackedDecimalPageReader?

Those methods/classes are public. Is it safe to delete them, or should we mark them as deprecated?

@nastra
Contributor

nastra commented Mar 29, 2023

Those classes should be safe to delete. Also there are no API guarantees on iceberg-arrow, but just in case we could deprecate them and remove them in the next minor release, wdyt @jackye1995?

@jackye1995
Contributor

Yes I agree, we can just mark it as deprecated and remove it after the next release just to follow the process
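
A minimal sketch of the deprecate-then-remove approach agreed on here, using a hypothetical `LegacyDecimalReader` in place of the real reader classes. The Javadoc wording is an assumption, not the text used in the PR.

```java
// Hypothetical classes for illustration; the real readers live in iceberg-arrow.
class DeprecationSketch {
  /**
   * @deprecated kept for one more release so existing callers still compile;
   *     planned for removal afterwards.
   */
  @Deprecated
  static class LegacyDecimalReader {
    int nextValue() {
      return 0;
    }
  }

  // Stand-in for whichever reader replaces the deprecated one.
  static class ReplacementReader {
    int nextValue() {
      return 0;
    }
  }

  public static void main(String[] args) {
    // @Deprecated is retained at runtime, so tooling can detect it reflectively.
    System.out.println(LegacyDecimalReader.class.isAnnotationPresent(Deprecated.class));
  }
}
```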

@ConeyLiu
Contributor Author

Thanks, @nastra @jackye1995 for your time. Those public methods/classes are marked as deprecated.

@nastra
Contributor

nastra commented Mar 30, 2023

@ConeyLiu I think we should also deprecate IntBackedDecimalPageReader / LongBackedDecimalPageReader

@ConeyLiu
Contributor Author

@nastra updated. I have also deprecated FixedLengthDecimalPageReader.

Contributor

@jackye1995 jackye1995 left a comment

looks good to me, @nastra do you have any further comments?

@jackye1995 jackye1995 merged commit 7c818c0 into apache:master Mar 31, 2023
@jackye1995
Contributor

Thanks for the contribution! Thanks for the review @nastra !

@ConeyLiu
Contributor Author

Thanks all.
