Open
Description
For internal development, bug reports, and highly technical users, it would be extremely helpful to be able to turn a profiling log on and off.
I think initially:
- We may want the ability to disable logging entirely with a compile-time variable (we may already have this, since OpenTelemetry (OT) can be entirely disabled)
- Assuming we built with logging, enabling or disabling the log is controlled by an environment variable
- The log will initially just dump to stdout or a flat file (we can also, or will someday, export via OT)
- Some initial metrics include "run time of node", "run time of kernel function", "run time of query"
- It would be nice if we could also get some periodic statistics, such as the allocated bytes of the memory pools, the RSS consumed by the process, etc. Perhaps that would require two files.
Reporter: Weston Pace / @westonpace
Subtasks:
- [C++] Add logging for kernel functions and exec plan nodes
- [C++] Add regular logging of exec plan performance metrics
- [C++][Tools] Create visualization tool for exec plan tracing logs
- [C++] Add OT spans for the scanner
- [C++][R][Python] Update ExecPlan bindings
- [C++] Add rows scanned to open telemetry / profiling
- [C++] Dump OpenTelemetry profiling summary to stdout
- [Tools][Docs] Add instructions on how to collect the produced telemetry data
Related issues:
- [C++] Add profiling / tracing for exec plan (relates to)
- [C++] Add simple stdout/JSON exporter for OpenTelemetry (depends upon)
Note: This issue was originally created as ARROW-15059. Please see the migration documentation for further details.