
LLM Streaming Response Becomes so Slow in Vercel World #764

@sijawara

Description


Hello, I am using DurableAgent both locally and on Vercel. Locally there is almost no problem, but when I deploy to Vercel the streaming response (tokens per second) becomes very slow:

"outputTokens": {
  "total": 1964,
  "text": 1658,
  "reasoning": 306
},
"attempts": [
  {
    "provider": "xai",
    "internalModelId": "xai:grok-code-fast-1",
    "providerApiModelId": "grok-code-fast-1",
    "credentialType": "system",
    "success": true,
    "startTime": 360785.635555,
    "endTime": 361101.433652,  <-- more than 3 minutes
    "statusCode": 200
  }
]
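Assuming the `startTime`/`endTime` values above are in seconds, the elapsed time and effective throughput work out roughly like this (a back-of-the-envelope sketch, not part of the original code):

```typescript
// Back-of-the-envelope math from the attempt above.
// Assumption: startTime/endTime are seconds on a monotonic clock.
const startTime = 360785.635555;
const endTime = 361101.433652;
const totalTokens = 1964; // outputTokens.total from the usage block

const elapsedSeconds = endTime - startTime;
const tokensPerSecond = totalTokens / elapsedSeconds;

console.log(`elapsed: ${(elapsedSeconds / 60).toFixed(1)} min`);      // → "elapsed: 5.3 min"
console.log(`throughput: ${tokensPerSecond.toFixed(1)} tokens/s`);    // → "throughput: 6.2 tokens/s"
```

That is around 6 tokens/s for the whole request, which is far below what grok-code-fast-1 normally streams at.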

This is my code:

"use workflow";

// NOTE: import paths assumed from the dependencies listed below
import { DurableAgent } from "@workflow/ai";
import { getWritable } from "workflow";
import type { UIMessageChunk } from "ai";

export const createPrepareStepCallback = (context: any) => {
  return async (agentParams: any) => {
    // simulate your deep sanitize / serialization safety logic
    const sanitizeStep = (step: any) => {
      if (!step) return step;
      const { model, ...rest } = step;
      return {
        ...rest,
        modelId: typeof model === "string" ? model : model?.modelId
      };
    };

    const serializableAgentParams = {
      stepNumber: agentParams.stepNumber,
      steps: agentParams.steps?.map(sanitizeStep),
      messages: agentParams.messages,
      model:
        typeof agentParams.model === "string"
          ? agentParams.model
          : agentParams.model?.modelId
    };


    // brief artificial delay to simulate async work in prepareStep
    await new Promise((r) => setTimeout(r, 50));

    return {
      ...context,
      serializableAgentParams
    };
  };
};

// ------------------------------------------------------------------
// This is the "workflow" function
// ------------------------------------------------------------------
export async function chatWorkflow(params: any) {
  const {
    preFetchedData,
    streamId
  }: {
    preFetchedData: any;
    streamId: string;
  } = params;

  const systemInstruction = preFetchedData.systemInstruction;
  const processedMessages = preFetchedData.processedMessages;

  const modelName = preFetchedData.participantConfig?.model ?? "openai/gpt-4.1-mini";

  // writable used by agent.stream
  const writable = getWritable<UIMessageChunk>();

  const prepareStep = createPrepareStepCallback({
    streamId,
    modelName
  });

  const agent = new DurableAgent({
    model: modelName as any,
    system: systemInstruction,
    temperature: 0.2,
    toolChoice: "auto",
    tools: {}
  });

  const startAt = performance.now();
  console.log("[START] streamId=", streamId, "model=", modelName);

  await agent.stream({
    messages: processedMessages,
    writable,
    maxSteps: 3,
    prepareStep: prepareStep as any,

    onStepFinish: async (stepResult: any) => {
      console.log(
        "\n[onStepFinish] finishReason=",
        stepResult?.finishReason ?? "unknown"
      );
    },

    onFinish: async () => {
      console.log(
        "\n[FINISH] total stream duration:",
        ((performance.now() - startAt) / 1000).toFixed(2),
        "s"
      );
    },

    onError: async ({ error }: { error: any }) => {
      console.error("\n[ERROR]", error);
    }
  });
}
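To tell apart slow token generation at the provider from buffering on the platform, one option is to wrap the writable in a pass-through stream that logs time-to-first-chunk and inter-chunk gaps. This is a hypothetical diagnostic sketch using the standard Web Streams API; the `instrumentWritable` helper is mine, not part of `@workflow/ai`:

```typescript
// Hypothetical diagnostic: wraps any WritableStream and logs chunk timing,
// so you can see whether tokens trickle in steadily (provider-side slowness)
// or arrive in large bursts after long gaps (platform buffering).
function instrumentWritable<T>(inner: WritableStream<T>): WritableStream<T> {
  const writer = inner.getWriter();
  const startedAt = performance.now();
  let lastChunkAt: number | null = null;

  return new WritableStream<T>({
    async write(chunk) {
      const now = performance.now();
      if (lastChunkAt === null) {
        console.log(`[diag] first chunk after ${(now - startedAt).toFixed(0)} ms`);
      } else if (now - lastChunkAt > 1000) {
        console.log(`[diag] ${(now - lastChunkAt).toFixed(0)} ms gap between chunks`);
      }
      lastChunkAt = now;
      await writer.write(chunk);
    },
    close: () => writer.close(),
    abort: (reason) => writer.abort(reason)
  });
}

// Possible usage inside chatWorkflow (assumed):
// const writable = instrumentWritable(getWritable<UIMessageChunk>());
```

If the gaps show up here but not locally, the delay is happening between the provider and the consumer rather than in the model itself.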

dependencies:

    "@ai-sdk/react": "^3.0.18",
    "@ai-sdk/gateway": "^3.0.9",
    "ai": "^6.0.18",
    "@workflow/ai": "4.0.1-beta.43",

Is anyone facing the same issue, or is there something wrong with my code? Thank you.
