
[Bug] High CPU usage with OTel instrumentation when updating to v1.13.x #1859

@daniellockyer

What are you really trying to do?

Hi! 👋🏻

We have a project using the Temporal TS SDK, and previously the code was running v1.11.8 of all Temporal packages. We had OTel instrumentation enabled (and working!) via the @temporalio/interceptors-opentelemetry package.

Upon updating from v1.11.8 to v1.13.2, we started to see excessive CPU usage, sometimes up to 100%.

We have a different project that we bumped to v1.13.2 and didn't see excess CPU usage. The only key difference between these projects is that one has OTel enabled, and the other doesn't.

Removing all uses of @temporalio/interceptors-opentelemetry resolves the CPU issue, and usage drops back to its previous level. This implies to me that it's either a bug in @temporalio/interceptors-opentelemetry, or a bug in our implementation/use of it.

I did a CPU profile via node inspector on one of the production instances, and couldn't see anything in the trace. I don't know if this means it's in one of the native dependencies, or V8 contexts or what.
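A note on reading such a profile: a `.cpuprofile` saved from the inspector is plain JSON, so a self-time summary can show whether the time is attributed to pseudo-frames such as `(program)`, `(garbage collector)` or `(idle)` rather than to JS functions. Below is a minimal sketch, assuming the standard V8 profile fields (`nodes`, `samples`, `timeDeltas` in microseconds). `summarizeSelfTime` and the interfaces are hypothetical names for illustration, not part of any Temporal or Node API. Attributing each delta to the sample at the same index is an approximation. Whether the extra CPU would even appear in a main-thread profile is a separate, unverified question: if the work happens in another thread or in native code, it may not.

```typescript
// Subset of the V8 CPU profile format (as saved in a .cpuprofile file).
interface CallFrame { functionName: string; url: string; lineNumber: number; }
interface ProfileNode { id: number; callFrame: CallFrame; }
interface CpuProfile { nodes: ProfileNode[]; samples: number[]; timeDeltas: number[]; }

interface SelfTimeEntry { label: string; selfMs: number; share: number; }

// Hypothetical helper: aggregate sampled self time per call frame, largest first.
function summarizeSelfTime(profile: CpuProfile, top = 10): SelfTimeEntry[] {
  const byId = new Map<number, ProfileNode>();
  for (const node of profile.nodes) byId.set(node.id, node);

  const totals = new Map<string, number>(); // label -> microseconds
  let totalUs = 0;
  for (let i = 0; i < profile.samples.length; i++) {
    const node = byId.get(profile.samples[i]);
    const delta = profile.timeDeltas[i] ?? 0;
    if (!node || delta <= 0) continue; // skip unknown nodes and non-positive deltas
    const frame = node.callFrame;
    const name = frame.functionName || '(anonymous)';
    // V8 line numbers are 0-based; pseudo-frames like (program) have no URL.
    const label = frame.url ? `${name} ${frame.url}:${frame.lineNumber + 1}` : name;
    totals.set(label, (totals.get(label) ?? 0) + delta);
    totalUs += delta;
  }

  return [...totals.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, top)
    .map(([label, us]) => ({
      label,
      selfMs: us / 1000,
      share: totalUs > 0 ? us / totalUs : 0,
    }));
}
```

To use it, parse the saved file with `JSON.parse` and pass the result in. A summary dominated by `(program)` or `(garbage collector)` would be consistent with time spent outside ordinary JS frames, though it would not prove where.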

I looked at most commits between the two releases. There are some referencing OTel, but none that I'd expect to bump CPU usage up to 100%. I do not see any Temporal SDK errors in our logs.

Obviously, we'd like to keep instrumentation, so it'd be great to find the cause of this.

Describe the bug

[Image: CPU usage graph]

We updated the packages on the 26th (the rise) and disabled OTel on the 2nd (the drop).
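The same before/after comparison can be made from inside the worker process, without relying on host-level graphs. The sketch below assumes two readings in the shape returned by Node's `process.cpuUsage()` (cumulative user and system CPU time in microseconds). `diffUsage` and `cpuPercent` are hypothetical helper names, not part of any SDK.

```typescript
// Shape of the value returned by Node's process.cpuUsage(): CPU time in microseconds.
interface CpuUsage { user: number; system: number; }

// CPU time consumed between two cumulative readings.
function diffUsage(previous: CpuUsage, current: CpuUsage): CpuUsage {
  return {
    user: current.user - previous.user,
    system: current.system - previous.system,
  };
}

// Utilisation over a wall-clock window, as a percentage of one core.
// The result can exceed 100 when the process keeps several cores busy.
function cpuPercent(delta: CpuUsage, elapsedMs: number): number {
  if (elapsedMs <= 0) return 0;
  const cpuMs = (delta.user + delta.system) / 1000;
  return (cpuMs / elapsedMs) * 100;
}
```

Taking a `process.cpuUsage()` reading at a fixed interval and logging `cpuPercent(diffUsage(previous, current), elapsedMs)` gives a number that can be compared between a worker with the interceptors enabled and one without.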

Minimal Reproduction

I'm currently unable to find a minimal reproduction, but here is the diff that would re-enable OTel instrumentation on our worker, and thus re-introduce the CPU increase. (I've had to trim out our code around it.)

diff --git a/xxx/index.ts b/xxx/index.ts
index 326f62c921..bdd1a99893 100644
--- a/xxx/index.ts
+++ b/xxx/index.ts
@@ -1,4 +1,9 @@
 import path from 'path';
+import {
+  makeWorkflowExporter,
+  OpenTelemetryActivityInboundInterceptor,
+  OpenTelemetryActivityOutboundInterceptor,
+} from '@temporalio/interceptors-opentelemetry/lib/worker';
 import {
   NativeConnection,
   NativeConnectionOptions,
@@ -12,7 +17,7 @@ import { env } from '@/lib/env';
...
-import { otelSdk } from './lib/instrumentation';
+import { otelSdk, resource, traceExporter } from './lib/instrumentation';
 
 const getConnection = async (): Promise<NativeConnection> => {
   const connectionOptions: NativeConnectionOptions = {
@@ -76,6 +81,9 @@ const getWorker = async (connection: NativeConnection): Promise<Worker> => {
...
+    sinks: {
+      exporter: makeWorkflowExporter(traceExporter, resource),
+    },
   };
   if (env.NODE_ENV !== 'development') {
     options.workflowBundle = {
@@ -93,6 +101,15 @@ const getWorker = async (connection: NativeConnection): Promise<Worker> => {
         return config;
       },
     };
+    options.interceptors = {
+      workflowModules: [require.resolve('./workflows')],
+      activity: [
+        (ctx) => ({
+          inbound: new OpenTelemetryActivityInboundInterceptor(ctx),
+          outbound: new OpenTelemetryActivityOutboundInterceptor(ctx),
+        }),
+      ],
+    };
   }
   return Worker.create(options);
 };
diff --git a/xxx/build-workflow-bundle.ts b/xxx/build-workflow-bundle.ts
index 212e8113b2..2a53bf8c1a 100644
--- a/xxx/build-workflow-bundle.ts
+++ b/xxx/build-workflow-bundle.ts
@@ -6,6 +6,7 @@ import { bundleWorkflowCode } from '@temporalio/worker';
 async function bundle() {
   const { code } = await bundleWorkflowCode({
     workflowsPath: require.resolve('../workflows'),
+    workflowInterceptorModules: [require.resolve('../workflows/flow.ts')],
     webpackConfigHook: (config) => {
       config.resolve = config.resolve || {};
       config.resolve.alias = {

diff --git a/xxx/flow.ts b/xxx/flow.ts
index bb17ceb9db..fd366e2d88 100644
--- a/xxx/flow.ts
+++ b/xxx/flow.ts
@@ -1,4 +1,9 @@
...
+import {
+  OpenTelemetryInboundInterceptor,
+  OpenTelemetryInternalsInterceptor,
+  OpenTelemetryOutboundInterceptor,
+} from '@temporalio/interceptors-opentelemetry/lib/workflow';
 import * as workflow from '@temporalio/workflow';
 
@@ -40,6 +45,12 @@ ...
...
 
+export const interceptors: workflow.WorkflowInterceptorsFactory = () => ({
+  inbound: [new OpenTelemetryInboundInterceptor()],
+  outbound: [new OpenTelemetryOutboundInterceptor()],
+  internals: [new OpenTelemetryInternalsInterceptor()],
+});
+
...

Environment/Versions

  • OS and processor: Linux
  • Temporal Version: v1.13.2
  • We're using the Temporal TS/JS packages for these projects, together with Temporal Cloud


Labels: bug
