Memory leak reported by Java Arrow on q12 of CometTPCHQuerySuite #336

@viirya

Describe the bug

#324 fixed a bug in CometShuffleExchangeExec's logical link, which changed the resulting query plans.

As a result of that change, q12 in CometTPCHQuerySuite now fails with a memory leak reported by Java Arrow:

- q12 *** FAILED *** (2 seconds, 244 milliseconds)
  java.lang.Exception: Expected "struct<[l_shipmode:string,high_line_count:bigint,low_line_count:bigint]>", but got "struct<[]>" Schema did not match
-- using default substitutions

select
	l_shipmode,
	sum(case
		when o_orderpriority = '1-URGENT'
			or o_orderpriority = '2-HIGH'
			then 1
		else 0
	end) as high_line_count,
	sum(case
		when o_orderpriority <> '1-URGENT'
			and o_orderpriority <> '2-HIGH'
			then 1
		else 0
	end) as low_line_count
from
	orders,
	lineitem
where
	o_orderkey = l_orderkey
	and l_shipmode in ('MAIL', 'SHIP')
	and l_commitdate < l_receiptdate
	and l_shipdate < l_commitdate
	and l_receiptdate >= date '1994-01-01'
	and l_receiptdate < date '1994-01-01' + interval '1' year
group by
	l_shipmode
order by
	l_shipmode
Output/Exception: java.lang.IllegalStateException
Memory was leaked by query. Memory leaked: (49152)
Allocator(ROOT) 0/49152/180352/9223372036854775807 (res/actual/peak/limit)
Error using configs:
spark.sql.autoBroadcastJoinThreshold=10485760

I spent some time debugging this, and it appears to be caused by native shuffle (CometTPCHQuerySuite currently uses native shuffle). While debugging, I found that the leak occurs in the allocation made in StreamReader. The batch that is read is correctly closed after use, but 49152 bytes still cannot be released on the allocator.

I'm not sure whether this is a bug in Java Arrow.
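
For anyone reading the failure output, here is a minimal sketch of how the allocator summary line can be pulled apart, going only by the `(res/actual/peak/limit)` label printed in the log itself. The `AllocatorStats` class is a hypothetical helper written for illustration; it is not part of Arrow or Comet, and reading `actual` as "bytes still outstanding" is an assumption based on it matching the `Memory leaked: (49152)` figure.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an Arrow or Comet API): parses the allocator
// summary line that appears in the leak report above.
final class AllocatorStats {
    final String name;
    final long reservation; // "res" in the log label
    final long actual;      // assumed: bytes still allocated when the report was made
    final long peak;        // highest allocation seen
    final long limit;       // allocator limit

    // Matches e.g. "Allocator(ROOT) 0/49152/180352/9223372036854775807"
    private static final Pattern LINE =
        Pattern.compile("Allocator\\((\\w+)\\)\\s+(\\d+)/(\\d+)/(\\d+)/(\\d+)");

    private AllocatorStats(String name, long reservation, long actual, long peak, long limit) {
        this.name = name;
        this.reservation = reservation;
        this.actual = actual;
        this.peak = peak;
        this.limit = limit;
    }

    static AllocatorStats parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not an allocator summary line: " + line);
        }
        return new AllocatorStats(
            m.group(1),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)),
            Long.parseLong(m.group(5)));
    }

    public static void main(String[] args) {
        AllocatorStats s = parse(
            "Allocator(ROOT) 0/49152/180352/9223372036854775807 (res/actual/peak/limit)");
        System.out.println(s.name + ": outstanding=" + s.actual + " bytes, peak=" + s.peak
            + " bytes, unlimited=" + (s.limit == Long.MAX_VALUE));
    }
}
```

Read this way, the line says the root allocator peaked at 180352 bytes, has no effective limit (the limit equals `Long.MAX_VALUE`), and still holds 49152 bytes, which is the same amount the exception reports as leaked.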

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels: bug (Something isn't working)
