
Conversation

@mbrobbel (Member) commented Jan 7, 2022

Adds spans and events for exec plan and exec nodes.

I use the following setup to debug traces:

docker-compose.yml:

version: "2"
services:

  jaeger-all-in-one:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"
      - "14268"
      - "14250"

  otel-collector:
    image: otel/opentelemetry-collector-contrib-dev:latest
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
      - ./output:/var/output
    ports:
      - "8888:8888"   # Prometheus metrics exposed by the collector
      - "8889:8889"   # Prometheus exporter metrics
      - "4317"        # OTLP gRPC receiver
      - "4318:4318"   # OTLP HTTP receiver
    depends_on:      
      - jaeger-all-in-one

  prometheus:
    container_name: prometheus
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

otel-collector-config.yaml:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: "0.0.0.0:4318"
        cors:
          allowed_origins:
            - "*"

processors:
  batch:

exporters:
  logging:
  jaeger:
    endpoint: jaeger-all-in-one:14250
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889
  parquet:
    path: /var/output/log.parquet # written to ./output on the host (see the volume mount above)

service:
  pipelines:
    traces:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - logging
        - jaeger
        - parquet
    metrics:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - logging
        - prometheus
        - parquet

prometheus.yaml:

scrape_configs:
  - job_name: "otel-collector"
    scrape_interval: 10s
    static_configs:
      - targets: ["otel-collector:8889"]
      - targets: ["otel-collector:8888"]

Start the services with docker-compose up, then follow the instructions from #11906.
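
As a side note (not part of this PR): with the collector above listening on 4318, a process can be pointed at it through the stock OTLP HTTP exporter of opentelemetry-cpp. The sketch below is only an illustration against a recent opentelemetry-cpp release, not the Arrow-specific setup from #11906; the "localhost" URL is an assumption for this example.

#include <memory>
#include <utility>

#include <opentelemetry/exporters/otlp/otlp_http_exporter_factory.h>
#include <opentelemetry/sdk/trace/simple_processor_factory.h>
#include <opentelemetry/sdk/trace/tracer_provider_factory.h>
#include <opentelemetry/trace/provider.h>

namespace otlp = opentelemetry::exporter::otlp;
namespace trace_sdk = opentelemetry::sdk::trace;
namespace trace_api = opentelemetry::trace;

int main() {
  // Send spans to the collector's OTLP HTTP receiver (port 4318 in docker-compose.yml).
  otlp::OtlpHttpExporterOptions options;
  options.url = "http://localhost:4318/v1/traces";

  auto exporter = otlp::OtlpHttpExporterFactory::Create(options);
  auto processor = trace_sdk::SimpleSpanProcessorFactory::Create(std::move(exporter));
  std::shared_ptr<trace_api::TracerProvider> provider =
      trace_sdk::TracerProviderFactory::Create(std::move(processor));

  // Register globally so instrumented code picks up this provider.
  trace_api::Provider::SetTracerProvider(provider);

  // ... run an instrumented workload; finished spans end up in Jaeger ...
  return 0;
}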


@lidavidm (Member) commented Jan 7, 2022

Cool! The macro definitions look fairly useful.
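
(These are not the definitions from the PR; just a rough sketch of the shape such span helpers can take on top of the public opentelemetry-cpp API, with made-up DEMO_* names and a made-up "demo" tracer name:)

#include <opentelemetry/trace/provider.h>

// Illustrative only: hypothetical helpers in the spirit of the PR's span macros,
// built directly on opentelemetry-cpp. Not Arrow's actual definitions.
#define DEMO_GET_TRACER() \
  opentelemetry::trace::Provider::GetTracerProvider()->GetTracer("demo")

#define DEMO_START_SPAN(var, name) \
  auto var = DEMO_GET_TRACER()->StartSpan(name)

#define DEMO_END_SPAN(var) (var)->End()

void RunNode() {
  DEMO_START_SPAN(span, "ExecNode::Process");
  // ... do the work that should be attributed to this span ...
  DEMO_END_SPAN(span);
}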

@lidavidm (Member) commented Jan 7, 2022

If you have a screenshot or any quick example of the output to share here, it would also be useful, I think.

@mbrobbel (Member, Author) commented:

Some example output in Jaeger:

TPC-H query 1: [screenshot of the trace in the Jaeger UI, 2022-01-10]

TPC-H query 2: [screenshot of the trace in the Jaeger UI, 2022-01-10]

mbrobbel marked this pull request as ready for review on January 12, 2022 at 08:44.
@lidavidm (Member) left a comment:

Thanks, this looks good overall. I left some small comments.

@lidavidm (Member) left a comment:

Thanks for doing this. Once we get this in, I'm pretty excited to see what sorts of analysis/visualization we can do.

Commenting on the macro helpers in the diff:

MARK_SPAN(target_span, st); \
END_SPAN(target_span); \
return st; \
})
This macro really makes me think we should consider just adding Span as a part of Future (and see how it impacts performance). That can be done later, though; I think we can get this in first and continue refining how we use OpenTelemetry.
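
(Purely to illustrate the idea, not a proposal for the actual Future API: a made-up TracedOperation type that owns a span for the lifetime of an async operation, using only public opentelemetry-cpp span calls.)

#include <functional>
#include <utility>

#include <opentelemetry/nostd/shared_ptr.h>
#include <opentelemetry/trace/provider.h>
#include <opentelemetry/trace/span.h>

// Hypothetical illustration of "Span as a part of Future": the span starts when
// the async operation is created and ends when its completion handler runs.
// TracedOperation, Start, and Finish are made-up names, not Arrow APIs.
template <typename Result>
struct TracedOperation {
  opentelemetry::nostd::shared_ptr<opentelemetry::trace::Span> span;
  std::function<void(Result)> on_done;

  static TracedOperation Start(const char* name) {
    auto tracer =
        opentelemetry::trace::Provider::GetTracerProvider()->GetTracer("demo");
    TracedOperation op;
    op.span = tracer->StartSpan(name);
    return op;
  }

  void Finish(Result result) {
    if (on_done) on_done(std::move(result));
    span->End();  // the span covers the whole lifetime of the operation
  }
};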

@westonpace (Member) commented:

@lidavidm @mbrobbel Is there any reason we can't merge this? If you want I can take a look through tomorrow and merge. Otherwise I took a quick scan and I think this is helpful information to get us started with instrumentation.

@mbrobbel (Member, Author) commented Feb 1, 2022

> @lidavidm @mbrobbel Is there any reason we can't merge this? If you want I can take a look through tomorrow and merge. Otherwise I took a quick scan and I think this is helpful information to get us started with instrumentation.

There are still some comments from @lidavidm that I need to address, but I can do that tomorrow.

@lidavidm (Member) commented Feb 1, 2022

No further comments from me - let's merge once OrderBySinkNode is fully instrumented.

@lidavidm (Member) commented Feb 1, 2022

I want to base #11964 on top of this since the helpers here will simplify it.

@mbrobbel (Member, Author) commented Feb 2, 2022

@westonpace @lidavidm this is now ready for a final review.

@lidavidm (Member) left a comment:

Thanks for working through this!

@lidavidm (Member) commented Feb 2, 2022

(For the record, I edited the description to remove username pings; those get put into the commit message and then generate a lot of notification spam.)

lidavidm closed this pull request in commit 9b53235 on Feb 2, 2022.
@ursabot commented Feb 2, 2022

Benchmark runs are scheduled for baseline = 74deb45 and contender = 9b53235. 9b53235 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.36%] ursa-i9-9960x
[Finished ⬇️0.69% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
