
LLM Streaming Response Becomes so Slow in Vercel World #764

@sijawara

Description


Hello, I am using DurableAgent both locally and on Vercel. Locally there is almost no problem, but when I deploy to Vercel the streaming response (tokens per second) becomes very slow:

"outputTokens": {
  "total": 1964,
  "text": 1658,
  "reasoning": 306
},
"attempts": [
  {
    "provider": "xai",
    "internalModelId": "xai:grok-code-fast-1",
    "providerApiModelId": "grok-code-fast-1",
    "credentialType": "system",
    "success": true,
    "startTime": 360785.635555,
    "endTime": 361101.433652,  <-- more than 3 minutes
    "statusCode": 200
  }
]
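Assuming the `startTime`/`endTime` values above are in seconds, the elapsed time and effective throughput work out roughly like this (a back-of-the-envelope sketch, not part of the original code):

```typescript
// Back-of-the-envelope math from the attempt above.
// Assumption: startTime/endTime are seconds on a monotonic clock.
const startTime = 360785.635555;
const endTime = 361101.433652;
const totalTokens = 1964; // outputTokens.total from the usage block

const elapsedSeconds = endTime - startTime;
const tokensPerSecond = totalTokens / elapsedSeconds;

console.log(`elapsed: ${(elapsedSeconds / 60).toFixed(1)} min`);      // → "elapsed: 5.3 min"
console.log(`throughput: ${tokensPerSecond.toFixed(1)} tokens/s`);    // → "throughput: 6.2 tokens/s"
```

That is around 6 tokens/s for the whole request, which is far below what grok-code-fast-1 normally streams at.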

This is my code:

"use workflow";

// NOTE: import paths assumed from the dependencies listed below
import { DurableAgent } from "@workflow/ai";
import { getWritable } from "workflow";
import type { UIMessageChunk } from "ai";

export const createPrepareStepCallback = (context: any) => {
  return async (agentParams: any) => {
    // simulate your deep sanitize / serialization safety logic
    const sanitizeStep = (step: any) => {
      if (!step) return step;
      const { model, ...rest } = step;
      return {
        ...rest,
        modelId: typeof model === "string" ? model : model?.modelId
      };
    };

    const serializableAgentParams = {
      stepNumber: agentParams.stepNumber,
      steps: agentParams.steps?.map(sanitizeStep),
      messages: agentParams.messages,
      model:
        typeof agentParams.model === "string"
          ? agentParams.model
          : agentParams.model?.modelId
    };


    // brief artificial delay to simulate async work in prepareStep
    await new Promise((r) => setTimeout(r, 50));

    return {
      ...context,
      serializableAgentParams
    };
  };
};

// ------------------------------------------------------------------
// This is the "workflow" function
// ------------------------------------------------------------------
export async function chatWorkflow(params: any) {
  const {
    preFetchedData,
    streamId
  }: {
    preFetchedData: any;
    streamId: string;
  } = params;

  const systemInstruction = preFetchedData.systemInstruction;
  const processedMessages = preFetchedData.processedMessages;

  const modelName = preFetchedData.participantConfig?.model ?? "openai/gpt-4.1-mini";

  // writable used by agent.stream
  const writable = getWritable<UIMessageChunk>();

  const prepareStep = createPrepareStepCallback({
    streamId,
    modelName
  });

  const agent = new DurableAgent({
    model: modelName as any,
    system: systemInstruction,
    temperature: 0.2,
    toolChoice: "auto",
    tools: {}
  });

  const startAt = performance.now();
  console.log("[START] streamId=", streamId, "model=", modelName);

  await agent.stream({
    messages: processedMessages,
    writable,
    maxSteps: 3,
    prepareStep: prepareStep as any,

    onStepFinish: async (stepResult: any) => {
      console.log(
        "\n[onStepFinish] finishReason=",
        stepResult?.finishReason ?? "unknown"
      );
    },

    onFinish: async () => {
      console.log(
        "\n[FINISH] total stream duration:",
        ((performance.now() - startAt) / 1000).toFixed(2),
        "s"
      );
    },

    onError: async ({ error }: { error: any }) => {
      console.error("\n[ERROR]", error);
    }
  });
}
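To tell apart slow token generation at the provider from buffering on the platform, one option is to wrap the writable in a pass-through stream that logs time-to-first-chunk and inter-chunk gaps. This is a hypothetical diagnostic sketch using the standard Web Streams API; the `instrumentWritable` helper is mine, not part of `@workflow/ai`:

```typescript
// Hypothetical diagnostic: wraps any WritableStream and logs chunk timing,
// so you can see whether tokens trickle in steadily (provider-side slowness)
// or arrive in large bursts after long gaps (platform buffering).
function instrumentWritable<T>(inner: WritableStream<T>): WritableStream<T> {
  const writer = inner.getWriter();
  const startedAt = performance.now();
  let lastChunkAt: number | null = null;

  return new WritableStream<T>({
    async write(chunk) {
      const now = performance.now();
      if (lastChunkAt === null) {
        console.log(`[diag] first chunk after ${(now - startedAt).toFixed(0)} ms`);
      } else if (now - lastChunkAt > 1000) {
        console.log(`[diag] ${(now - lastChunkAt).toFixed(0)} ms gap between chunks`);
      }
      lastChunkAt = now;
      await writer.write(chunk);
    },
    close: () => writer.close(),
    abort: (reason) => writer.abort(reason)
  });
}

// Possible usage inside chatWorkflow (assumed):
// const writable = instrumentWritable(getWritable<UIMessageChunk>());
```

If the gaps show up here but not locally, the delay is happening between the provider and the consumer rather than in the model itself.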

dependencies:

    "@ai-sdk/react": "^3.0.18",
    "@ai-sdk/gateway": "^3.0.9",
    "ai": "^6.0.18",
    "@workflow/ai": "4.0.1-beta.43",

Is anyone facing the same issue, or is there something wrong with my code? Thank you.
