Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
== Physical Plan ==
TakeOrderedAndProject (40)
+- * Project (39)
+- * BroadcastHashJoin Inner BuildRight (38)
:- * Project (33)
: +- * BroadcastHashJoin Inner BuildRight (32)
: :- * Project (26)
: : +- * BroadcastHashJoin Inner BuildRight (25)
: : :- * Filter (10)
: : : +- * HashAggregate (9)
: : : +- Exchange (8)
: : : +- * HashAggregate (7)
: : : +- * Project (6)
: : : +- * BroadcastHashJoin Inner BuildRight (5)
: : : :- * ColumnarToRow (3)
: : : : +- CometFilter (2)
: : : : +- CometScan parquet spark_catalog.default.store_returns (1)
: : : +- ReusedExchange (4)
: : +- BroadcastExchange (24)
: : +- * Filter (23)
: : +- * HashAggregate (22)
: : +- Exchange (21)
: : +- * HashAggregate (20)
: : +- * HashAggregate (19)
: : +- Exchange (18)
: : +- * HashAggregate (17)
: : +- * Project (16)
: : +- * BroadcastHashJoin Inner BuildRight (15)
: : :- * ColumnarToRow (13)
: : : +- CometFilter (12)
: : : +- CometScan parquet spark_catalog.default.store_returns (11)
: : +- ReusedExchange (14)
: +- BroadcastExchange (31)
: +- * ColumnarToRow (30)
: +- CometProject (29)
: +- CometFilter (28)
: +- CometScan parquet spark_catalog.default.store (27)
+- BroadcastExchange (37)
+- * ColumnarToRow (36)
+- CometFilter (35)
+- CometScan parquet spark_catalog.default.customer (34)


(unknown) Scan parquet spark_catalog.default.store_returns
Output [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(sr_returned_date_sk#4), dynamicpruningexpression(sr_returned_date_sk#4 IN dynamicpruning#5)]
PushedFilters: [IsNotNull(sr_store_sk), IsNotNull(sr_customer_sk)]
ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>

(2) CometFilter
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Condition : (isnotnull(sr_store_sk#2) AND isnotnull(sr_customer_sk#1))

(3) ColumnarToRow [codegen id : 2]
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]

(4) ReusedExchange [Reuses operator id: 45]
Output [1]: [d_date_sk#6]

(5) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sr_returned_date_sk#4]
Right keys [1]: [d_date_sk#6]
Join type: Inner
Join condition: None

(6) Project [codegen id : 2]
Output [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Input [5]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4, d_date_sk#6]

(7) HashAggregate [codegen id : 2]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum#7]
Results [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]

(8) Exchange
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Arguments: hashpartitioning(sr_customer_sk#1, sr_store_sk#2, 5), ENSURE_REQUIREMENTS, [plan_id=1]

(9) HashAggregate [codegen id : 9]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#3))#9]
Results [3]: [sr_customer_sk#1 AS ctr_customer_sk#10, sr_store_sk#2 AS ctr_store_sk#11, MakeDecimal(sum(UnscaledValue(sr_return_amt#3))#9,17,2) AS ctr_total_return#12]

(10) Filter [codegen id : 9]
Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12]
Condition : isnotnull(ctr_total_return#12)

(unknown) Scan parquet spark_catalog.default.store_returns
Output [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(sr_returned_date_sk#4), dynamicpruningexpression(sr_returned_date_sk#4 IN dynamicpruning#13)]
PushedFilters: [IsNotNull(sr_store_sk)]
ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>

(12) CometFilter
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Condition : isnotnull(sr_store_sk#2)

(13) ColumnarToRow [codegen id : 4]
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]

(14) ReusedExchange [Reuses operator id: 45]
Output [1]: [d_date_sk#6]

(15) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [sr_returned_date_sk#4]
Right keys [1]: [d_date_sk#6]
Join type: Inner
Join condition: None

(16) Project [codegen id : 4]
Output [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Input [5]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4, d_date_sk#6]

(17) HashAggregate [codegen id : 4]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum#14]
Results [3]: [sr_customer_sk#1, sr_store_sk#2, sum#15]

(18) Exchange
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#15]
Arguments: hashpartitioning(sr_customer_sk#1, sr_store_sk#2, 5), ENSURE_REQUIREMENTS, [plan_id=2]

(19) HashAggregate [codegen id : 5]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#15]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#3))#9]
Results [2]: [sr_store_sk#2 AS ctr_store_sk#11, MakeDecimal(sum(UnscaledValue(sr_return_amt#3))#9,17,2) AS ctr_total_return#12]

(20) HashAggregate [codegen id : 5]
Input [2]: [ctr_store_sk#11, ctr_total_return#12]
Keys [1]: [ctr_store_sk#11]
Functions [1]: [partial_avg(ctr_total_return#12)]
Aggregate Attributes [2]: [sum#16, count#17]
Results [3]: [ctr_store_sk#11, sum#18, count#19]

(21) Exchange
Input [3]: [ctr_store_sk#11, sum#18, count#19]
Arguments: hashpartitioning(ctr_store_sk#11, 5), ENSURE_REQUIREMENTS, [plan_id=3]

(22) HashAggregate [codegen id : 6]
Input [3]: [ctr_store_sk#11, sum#18, count#19]
Keys [1]: [ctr_store_sk#11]
Functions [1]: [avg(ctr_total_return#12)]
Aggregate Attributes [1]: [avg(ctr_total_return#12)#20]
Results [2]: [(avg(ctr_total_return#12)#20 * 1.2) AS (avg(ctr_total_return) * 1.2)#21, ctr_store_sk#11 AS ctr_store_sk#11#22]

(23) Filter [codegen id : 6]
Input [2]: [(avg(ctr_total_return) * 1.2)#21, ctr_store_sk#11#22]
Condition : isnotnull((avg(ctr_total_return) * 1.2)#21)

(24) BroadcastExchange
Input [2]: [(avg(ctr_total_return) * 1.2)#21, ctr_store_sk#11#22]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [plan_id=4]

(25) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ctr_store_sk#11]
Right keys [1]: [ctr_store_sk#11#22]
Join type: Inner
Join condition: (cast(ctr_total_return#12 as decimal(24,7)) > (avg(ctr_total_return) * 1.2)#21)

(26) Project [codegen id : 9]
Output [2]: [ctr_customer_sk#10, ctr_store_sk#11]
Input [5]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12, (avg(ctr_total_return) * 1.2)#21, ctr_store_sk#11#22]

(unknown) Scan parquet spark_catalog.default.store
Output [2]: [s_store_sk#23, s_state#24]
Batched: true
Location [not included in comparison]/{warehouse_dir}/store]
PushedFilters: [IsNotNull(s_state), EqualTo(s_state,TN), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>

(28) CometFilter
Input [2]: [s_store_sk#23, s_state#24]
Condition : ((isnotnull(s_state#24) AND (s_state#24 = TN)) AND isnotnull(s_store_sk#23))

(29) CometProject
Input [2]: [s_store_sk#23, s_state#24]
Arguments: [s_store_sk#23], [s_store_sk#23]

(30) ColumnarToRow [codegen id : 7]
Input [1]: [s_store_sk#23]

(31) BroadcastExchange
Input [1]: [s_store_sk#23]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=5]

(32) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ctr_store_sk#11]
Right keys [1]: [s_store_sk#23]
Join type: Inner
Join condition: None

(33) Project [codegen id : 9]
Output [1]: [ctr_customer_sk#10]
Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, s_store_sk#23]

(unknown) Scan parquet spark_catalog.default.customer
Output [2]: [c_customer_sk#25, c_customer_id#26]
Batched: true
Location [not included in comparison]/{warehouse_dir}/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string>

(35) CometFilter
Input [2]: [c_customer_sk#25, c_customer_id#26]
Condition : isnotnull(c_customer_sk#25)

(36) ColumnarToRow [codegen id : 8]
Input [2]: [c_customer_sk#25, c_customer_id#26]

(37) BroadcastExchange
Input [2]: [c_customer_sk#25, c_customer_id#26]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [plan_id=6]

(38) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ctr_customer_sk#10]
Right keys [1]: [c_customer_sk#25]
Join type: Inner
Join condition: None

(39) Project [codegen id : 9]
Output [1]: [c_customer_id#26]
Input [3]: [ctr_customer_sk#10, c_customer_sk#25, c_customer_id#26]

(40) TakeOrderedAndProject
Input [1]: [c_customer_id#26]
Arguments: 100, [c_customer_id#26 ASC NULLS FIRST], [c_customer_id#26]

===== Subqueries =====

Subquery:1 Hosting operator id = 1 Hosting Expression = sr_returned_date_sk#4 IN dynamicpruning#5
BroadcastExchange (45)
+- * ColumnarToRow (44)
+- CometProject (43)
+- CometFilter (42)
+- CometScan parquet spark_catalog.default.date_dim (41)


(unknown) Scan parquet spark_catalog.default.date_dim
Output [2]: [d_date_sk#6, d_year#27]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>

(42) CometFilter
Input [2]: [d_date_sk#6, d_year#27]
Condition : ((isnotnull(d_year#27) AND (d_year#27 = 2000)) AND isnotnull(d_date_sk#6))

(43) CometProject
Input [2]: [d_date_sk#6, d_year#27]
Arguments: [d_date_sk#6], [d_date_sk#6]

(44) ColumnarToRow [codegen id : 1]
Input [1]: [d_date_sk#6]

(45) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=7]

Subquery:2 Hosting operator id = 11 Hosting Expression = sr_returned_date_sk#4 IN dynamicpruning#5


Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
TakeOrderedAndProject [c_customer_id]
WholeStageCodegen (9)
Project [c_customer_id]
BroadcastHashJoin [ctr_customer_sk,c_customer_sk]
Project [ctr_customer_sk]
BroadcastHashJoin [ctr_store_sk,s_store_sk]
Project [ctr_customer_sk,ctr_store_sk]
BroadcastHashJoin [ctr_store_sk,ctr_store_sk,ctr_total_return,(avg(ctr_total_return) * 1.2)]
Filter [ctr_total_return]
HashAggregate [sr_customer_sk,sr_store_sk,sum] [sum(UnscaledValue(sr_return_amt)),ctr_customer_sk,ctr_store_sk,ctr_total_return,sum]
InputAdapter
Exchange [sr_customer_sk,sr_store_sk] #1
WholeStageCodegen (2)
HashAggregate [sr_customer_sk,sr_store_sk,sr_return_amt] [sum,sum]
Project [sr_customer_sk,sr_store_sk,sr_return_amt]
BroadcastHashJoin [sr_returned_date_sk,d_date_sk]
ColumnarToRow
InputAdapter
CometFilter [sr_store_sk,sr_customer_sk]
CometScan parquet spark_catalog.default.store_returns [sr_customer_sk,sr_store_sk,sr_return_amt,sr_returned_date_sk]
SubqueryBroadcast [d_date_sk] #1
BroadcastExchange #2
WholeStageCodegen (1)
ColumnarToRow
InputAdapter
CometProject [d_date_sk]
CometFilter [d_year,d_date_sk]
CometScan parquet spark_catalog.default.date_dim [d_date_sk,d_year]
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #3
WholeStageCodegen (6)
Filter [(avg(ctr_total_return) * 1.2)]
HashAggregate [ctr_store_sk,sum,count] [avg(ctr_total_return),(avg(ctr_total_return) * 1.2),ctr_store_sk,sum,count]
InputAdapter
Exchange [ctr_store_sk] #4
WholeStageCodegen (5)
HashAggregate [ctr_store_sk,ctr_total_return] [sum,count,sum,count]
HashAggregate [sr_customer_sk,sr_store_sk,sum] [sum(UnscaledValue(sr_return_amt)),ctr_store_sk,ctr_total_return,sum]
InputAdapter
Exchange [sr_customer_sk,sr_store_sk] #5
WholeStageCodegen (4)
HashAggregate [sr_customer_sk,sr_store_sk,sr_return_amt] [sum,sum]
Project [sr_customer_sk,sr_store_sk,sr_return_amt]
BroadcastHashJoin [sr_returned_date_sk,d_date_sk]
ColumnarToRow
InputAdapter
CometFilter [sr_store_sk]
CometScan parquet spark_catalog.default.store_returns [sr_customer_sk,sr_store_sk,sr_return_amt,sr_returned_date_sk]
ReusedSubquery [d_date_sk] #1
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #6
WholeStageCodegen (7)
ColumnarToRow
InputAdapter
CometProject [s_store_sk]
CometFilter [s_state,s_store_sk]
CometScan parquet spark_catalog.default.store [s_store_sk,s_state]
InputAdapter
BroadcastExchange #7
WholeStageCodegen (8)
ColumnarToRow
InputAdapter
CometFilter [c_customer_sk]
CometScan parquet spark_catalog.default.customer [c_customer_sk,c_customer_id]
Loading