[GLUTEN-10214][VL] Merge inputstream for shuffle reader #10499

Merged: marin-ma merged 1 commit into apache:main from marin-ma:shuffle-reader-merge-inpustream on Sep 7, 2025

@marin-ma marin-ma commented Aug 20, 2025

The current shuffle read path reuses Spark's BlockStoreShuffleReader. The shuffle reader processes each input stream individually, with a dedicated SerializerInstance to deserialize the input data. This works fine when the output is rows. However, in Gluten we convert the deserialized row data into columnar batches as the output during sort-based shuffle read.

In real use cases, each input stream may contain only a small number of rows. Deserialization in the sort-based shuffle reader can then be very slow, because each row->column (r2c) conversion operates on a small batch. In this case, it's hard to tune the performance of the r2c process.

This patch adds a new ColumnarShuffleReader that creates only one SerializerInstance per reducer task, and that instance deserializes all input streams. This allows the native reader to load all input streams, so it can accumulate a larger number of rows before doing the r2c conversion. This change can also eliminate the VeloxResizeBatches operation for sort-based shuffle read.
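
The difference between per-stream and merged conversion can be illustrated with a minimal sketch. This is not the code in this patch: `Row`, `Stream`, `Batch`, `convertPerStream`, and `convertMerged` are hypothetical names, rows are modeled as plain integers, and the real reader works on serialized shuffle data rather than in-memory vectors.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a "stream" is a list of rows, and a "batch" is the
// group of rows that would go through one row->column conversion together.
using Row = int;
using Stream = std::vector<Row>;
using Batch = std::vector<Row>;

// Per-stream conversion: every non-empty input stream becomes its own batch,
// so many small streams produce many small batches.
std::vector<Batch> convertPerStream(const std::vector<Stream>& streams) {
  std::vector<Batch> batches;
  for (const auto& stream : streams) {
    if (!stream.empty()) {
      batches.push_back(stream);
    }
  }
  return batches;
}

// Merged conversion: rows are accumulated across stream boundaries, and a
// batch is emitted only when batchSize rows are buffered or input ends.
std::vector<Batch> convertMerged(const std::vector<Stream>& streams, std::size_t batchSize) {
  std::vector<Batch> batches;
  Batch pending;
  for (const auto& stream : streams) {
    for (Row row : stream) {
      pending.push_back(row);
      if (pending.size() == batchSize) {
        batches.push_back(std::move(pending));
        pending.clear(); // reset the moved-from buffer to a known empty state
      }
    }
  }
  if (!pending.empty()) {
    batches.push_back(std::move(pending));
  }
  return batches;
}
```

For example, with five streams holding 2, 1, 3, 0, and 1 rows and a `batchSize` of 4, `convertPerStream` yields four small batches, while `convertMerged` yields two batches of 4 and 3 rows.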

There are no benefits for the hash-based shuffle reader or the rss-sort shuffle reader.

The charts below show the shuffle read process before and after this change.

Before:
image

After:
image

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Aug 20, 2025
@github-actions

#10214

@github-actions

Run Gluten Clickhouse CI on x86

@marin-ma marin-ma force-pushed the shuffle-reader-merge-inpustream branch from 77780bd to b16faf5 Compare August 21, 2025 14:48
@marin-ma marin-ma changed the title [GLUTEN-10214][VL] WIP: Merge inputstream for shuffle reader [GLUTEN-10214][VL] Merge inputstream for shuffle reader Aug 21, 2025
@marin-ma marin-ma marked this pull request as ready for review August 21, 2025 14:48

@marin-ma
Contributor Author

Perf results show an overall 2% improvement for TPCH and TPCDS at SF500.

TPCH SF500

| query | before | after | perf = before/after |
| --- | --- | --- | --- |
| q01 | 32.33 | 33.20 | 0.97 |
| q02 | 29.48 | 28.80 | 1.02 |
| q03 | 55.61 | 54.63 | 1.02 |
| q04 | 46.13 | 44.94 | 1.03 |
| q05 | 96.68 | 97.12 | 1.00 |
| q06 | 8.52 | 8.17 | 1.04 |
| q07 | 54.69 | 54.44 | 1.00 |
| q08 | 103.61 | 95.66 | 1.08 |
| q09 | 155.53 | 145.61 | 1.07 |
| q10 | 40.00 | 39.96 | 1.00 |
| q11 | 22.34 | 21.62 | 1.03 |
| q12 | 26.62 | 25.73 | 1.03 |
| q13 | 34.58 | 34.71 | 1.00 |
| q14 | 15.49 | 15.35 | 1.01 |
| q15 | 37.55 | 37.43 | 1.00 |
| q16 | 15.18 | 14.88 | 1.02 |
| q17 | 146.48 | 145.88 | 1.00 |
| q18 | 192.73 | 191.96 | 1.00 |
| q19 | 15.50 | 15.01 | 1.03 |
| q20 | 31.15 | 30.98 | 1.01 |
| q21 | 215.52 | 217.53 | 0.99 |
| q22 | 18.15 | 18.15 | 1.00 |
| total | 1393.88 | 1371.76 | 1.02 |

TPCDS SF500

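| query | before | after | perf = before/after |
| --- | --- | --- | --- |
| q01 | 18.44 | 18.32 | 1.01 |
| q02 | 21.88 | 19.25 | 1.14 |
| q03 | 5.01 | 4.62 | 1.08 |
| q04 | 73.11 | 71.71 | 1.02 |
| q05 | 16.95 | 16.84 | 1.01 |
| q06 | 4.09 | 3.85 | 1.06 |
| q07 | 5.47 | 4.33 | 1.26 |
| q08 | 3.35 | 3.11 | 1.08 |
| q09 | 25.06 | 22.70 | 1.10 |
| q10 | 6.83 | 6.79 | 1.01 |
| q11 | 40.29 | 40.19 | 1.00 |
| q12 | 1.84 | 1.95 | 0.94 |
| q13 | 4.41 | 4.50 | 0.98 |
| q14a | 53.97 | 53.68 | 1.01 |
| q14b | 47.19 | 48.56 | 0.97 |
| q15 | 4.40 | 4.67 | 0.94 |
| q16 | 28.44 | 27.04 | 1.05 |
| q17 | 7.22 | 7.09 | 1.02 |
| q18 | 6.42 | 5.88 | 1.09 |
| q19 | 3.52 | 3.49 | 1.01 |
| q20 | 1.67 | 1.81 | 0.92 |
| q21 | 1.49 | 1.44 | 1.03 |
| q22 | 10.32 | 10.26 | 1.01 |
| q23a | 102.70 | 105.00 | 0.98 |
| q23b | 141.54 | 139.91 | 1.01 |
| q24a | 45.08 | 42.19 | 1.07 |
| q24b | 43.00 | 42.46 | 1.01 |
| q25 | 6.04 | 6.21 | 0.97 |
| q26 | 4.33 | 3.67 | 1.18 |
| q27 | 3.74 | 3.73 | 1.00 |
| q28 | 27.87 | 28.78 | 0.97 |
| q29 | 13.63 | 13.62 | 1.00 |
| q30 | 6.69 | 7.07 | 0.95 |
| q31 | 20.98 | 20.52 | 1.02 |
| q32 | 2.09 | 2.60 | 0.80 |
| q33 | 4.40 | 4.36 | 1.01 |
| q34 | 6.24 | 6.40 | 0.97 |
| q35 | 11.84 | 11.71 | 1.01 |
| q36 | 3.88 | 3.60 | 1.08 |
| q37 | 4.40 | 5.00 | 0.88 |
| q38 | 17.16 | 17.10 | 1.00 |
| q39a | 3.58 | 3.72 | 0.96 |
| q39b | 3.01 | 3.20 | 0.94 |
| q40 | 8.95 | 6.46 | 1.39 |
| q41 | 0.54 | 0.58 | 0.93 |
| q42 | 1.24 | 1.01 | 1.23 |
| q43 | 3.55 | 3.15 | 1.13 |
| q44 | 11.23 | 12.88 | 0.87 |
| q45 | 4.59 | 4.63 | 0.99 |
| q46 | 6.85 | 7.22 | 0.95 |
| q47 | 10.68 | 10.64 | 1.00 |
| q48 | 4.66 | 4.90 | 0.95 |
| q49 | 16.83 | 15.74 | 1.07 |
| q50 | 31.93 | 33.16 | 0.96 |
| q51 | 14.95 | 14.66 | 1.02 |
| q52 | 1.35 | 1.29 | 1.04 |
| q53 | 2.24 | 2.55 | 0.88 |
| q54 | 6.83 | 6.69 | 1.02 |
| q55 | 1.24 | 1.07 | 1.16 |
| q56 | 3.56 | 3.54 | 1.01 |
| q57 | 8.02 | 7.61 | 1.05 |
| q58 | 2.68 | 2.84 | 0.94 |
| q59 | 6.01 | 6.56 | 0.92 |
| q60 | 4.44 | 4.16 | 1.07 |
| q61 | 3.51 | 3.56 | 0.99 |
| q62 | 4.61 | 4.14 | 1.11 |
| q63 | 2.73 | 2.38 | 1.15 |
| q64 | 43.39 | 42.78 | 1.01 |
| q65 | 21.19 | 21.38 | 0.99 |
| q66 | 6.33 | 6.34 | 1.00 |
| q67 | 121.88 | 120.73 | 1.01 |
| q68 | 6.10 | 5.84 | 1.04 |
| q69 | 6.31 | 6.00 | 1.05 |
| q70 | 7.66 | 7.76 | 0.99 |
| q71 | 3.82 | 4.25 | 0.90 |
| q72 | 37.63 | 37.26 | 1.01 |
| q73 | 4.00 | 4.33 | 0.92 |
| q74 | 30.56 | 28.75 | 1.06 |
| q75 | 41.14 | 38.18 | 1.08 |
| q76 | 21.14 | 18.39 | 1.15 |
| q77 | 4.10 | 4.10 | 1.00 |
| q78 | 52.81 | 53.43 | 0.99 |
| q79 | 3.72 | 3.74 | 0.99 |
| q80 | 18.43 | 17.38 | 1.06 |
| q81 | 7.68 | 7.95 | 0.97 |
| q82 | 5.68 | 4.69 | 1.21 |
| q83 | 2.53 | 2.31 | 1.10 |
| q84 | 4.91 | 5.23 | 0.94 |
| q85 | 9.31 | 8.46 | 1.10 |
| q86 | 2.91 | 2.58 | 1.13 |
| q87 | 17.26 | 17.08 | 1.01 |
| q88 | 27.29 | 26.71 | 1.02 |
| q89 | 2.86 | 3.03 | 0.95 |
| q90 | 7.79 | 6.99 | 1.11 |
| q91 | 2.28 | 1.99 | 1.14 |
| q92 | 2.39 | 1.77 | 1.35 |
| q93 | 40.47 | 40.33 | 1.00 |
| q94 | 17.52 | 16.89 | 1.04 |
| q95 | 74.78 | 73.72 | 1.01 |
| q96 | 5.57 | 5.22 | 1.07 |
| q97 | 15.42 | 15.28 | 1.01 |
| q98 | 2.60 | 2.36 | 1.10 |
| q99 | 5.28 | 5.03 | 1.05 |
| total | 1711.53 | 1682.65 | 1.02 |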
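
For reference, the perf column is the ratio before/after rounded to two decimals, so values above 1.00 mean the run after the change was faster. A small sketch of that arithmetic follows. The helper names `speedup` and `roundTo2` are made up for illustration, and the only inputs assumed are the totals from the two tables.

```cpp
#include <cmath>

// Ratio as defined in the tables: before / after.
// A value above 1.0 means the "after" run took less time.
double speedup(double before, double after) {
  return before / after;
}

// Round to two decimal places, matching the precision of the perf column.
double roundTo2(double value) {
  return std::round(value * 100.0) / 100.0;
}
```

Applied to the totals, `speedup(1393.88, 1371.76)` is about 1.016 and `speedup(1711.53, 1682.65)` is about 1.017. Both round to 1.02, which is where the overall 2% figure comes from.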

@jinchengchenghh
Contributor

Why can't the hash-based shuffle reader benefit from this by eliminating VeloxResizeBatches?

extends ColumnarBatchSerializerInstance
with Logging {

private val runtime =
Contributor

If the runtime is only used to construct ShuffleReaderJniWrapper, don't create it here; wrap it in jniWrapper.

Contributor Author

It's also used when creating the output ColumnarBatchOutIterator

public:
ShuffleStreamReader(JNIEnv* env, jobject reader) {
if (env->GetJavaVM(&vm_) != JNI_OK) {
std::string errorMessage = "Unable to get JavaVM instance";
Contributor

throw GlutenException("Unable to get JavaVM instance")


namespace gluten {

class ShuffleReader;
Contributor

Why is this line needed?


namespace gluten {

class TestStreamReader : public StreamReader {
Contributor

Maybe add a library to move code only used for tests and benchmarks out of the production source code.

Contributor

We could add BUILD_TEST_UTILS, like VELOX_BUILD_TEST_UTILS, and set it to ON if BUILD_TESTS=ON or BUILD_BENCHMARK=ON.

Contributor Author

Good idea. I will open a separate PR for this refactor.

@marin-ma marin-ma force-pushed the shuffle-reader-merge-inpustream branch from b16faf5 to 8f4c974 Compare August 29, 2025 14:54

@marin-ma marin-ma force-pushed the shuffle-reader-merge-inpustream branch from 8f4c974 to 95e3b9c Compare September 2, 2025 09:03

@marin-ma marin-ma force-pushed the shuffle-reader-merge-inpustream branch from 95e3b9c to 575086b Compare September 2, 2025 14:31

@marin-ma marin-ma merged commit b7e44c2 into apache:main Sep 7, 2025
56 checks passed
warrenzhu25 pushed a commit to warrenzhu25/gluten that referenced this pull request Jan 10, 2026
Change-Id: Id2dd154e13529b1d295f04ada41f76d3e37feb8f
Labels

CLICKHOUSE CORE works for Gluten Core VELOX
