The TPC-H benchmark runtime seems to be dominated by CSV parsing code, which makes it really difficult to see any performance hotspots related to actual query execution in a flamegraph.
With the data loaded into memory and more iterations, it should be easier to profile and find bottlenecks.
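The idea can be illustrated with a minimal, language-agnostic sketch: parse the CSV once, keep the rows in memory, and time only the repeated query runs. This is not the benchmark's actual code; the names `CSV_TEXT`, `load_table`, `run_query`, and `benchmark` are hypothetical, and the toy filter-and-sum stands in for a real TPC-H query.

```python
import csv
import io
import time

# Hypothetical stand-in for a TPC-H table file; real inputs are far larger.
CSV_TEXT = "l_quantity,l_extendedprice\n" + "\n".join(
    f"{i % 50},{i * 1.5}" for i in range(10_000)
)


def load_table(text):
    """Parse the CSV once and keep the rows in memory as typed tuples."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [(int(quantity), float(price)) for quantity, price in reader]


def run_query(rows):
    """Toy stand-in for a query: sum of price where quantity < 24."""
    return sum(price for quantity, price in rows if quantity < 24)


def benchmark(text, iterations):
    """Time the one-off load separately from each query iteration."""
    start = time.perf_counter()
    rows = load_table(text)
    load_seconds = time.perf_counter() - start

    timings = []
    result = None
    for _ in range(iterations):
        start = time.perf_counter()
        result = run_query(rows)
        timings.append(time.perf_counter() - start)
    return load_seconds, timings, result
```

Because parsing happens outside the timed loop, a profiler attached during the iterations would mostly sample `run_query` rather than the CSV reader, which is the effect the change aims for.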
Reporter: Jörn Horstmann / @jhorstmann
Assignee: Jörn Horstmann / @jhorstmann
PRs and other links:
Note: This issue was originally created as ARROW-10240. Please see the migration documentation for further details.