fix: eliminate 10s container shutdown delay (api-proxy shell-form CMD + squid shutdown_lifetime) #1371

@Mossaka

Problem

Container shutdown takes ~10 seconds because of two independent issues, one in the awf-api-proxy container and one in the awf-squid container. Neither container exits within Docker's default 10s SIGTERM grace period, so both end up being killed with SIGKILL.

This was identified via performance analysis in github/gh-aw#21703. The 10s delay occurs twice per workflow run (once for the main agent execution, once for threat detection), costing ~20s total.

Root Cause Analysis

1. awf-api-proxy: Shell-form CMD (PR #1150 fix was accidentally reverted)

PR #1150 was supposed to fix this by switching the Dockerfile CMD from shell form to exec form. However, the second commit in that PR (829aef9) silently reverted the Dockerfile fix while adding tests. The merge commit 9b55519 contains only test changes, so the actual fix was never shipped.

Current state (containers/api-proxy/Dockerfile:35):

CMD node server.js 2>&1 | tee -a /var/log/api-proxy/api-proxy.log

In shell form, Docker runs the command via /bin/sh -c, so /bin/sh becomes PID 1 and does not forward SIGTERM to node. Docker waits out the 10s grace period, then sends SIGKILL to everything.

Fix: Change to exec form: CMD ["node", "server.js"]
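
Because this fix was already reverted once without anyone noticing, a regression check on the Dockerfile may be worth adding. The following is a minimal sketch, not the project's actual test code: the helper name isExecForm is hypothetical, and it relies on the fact that Docker parses exec form as a JSON array and falls back to shell form when the argument is not valid JSON.

```typescript
// Hypothetical helper (not part of the repository): returns true when a
// Dockerfile CMD or ENTRYPOINT line uses exec form, i.e. a JSON array of strings.
function isExecForm(instruction: string): boolean {
  const match = instruction.trim().match(/^(CMD|ENTRYPOINT)\s+(.*)$/i);
  if (!match) return false; // not a CMD/ENTRYPOINT instruction at all

  try {
    const parsed: unknown = JSON.parse(match[2] ?? "");
    return (
      Array.isArray(parsed) &&
      parsed.length > 0 &&
      parsed.every((arg) => typeof arg === "string")
    );
  } catch {
    // Not valid JSON, so Docker would treat the line as shell form.
    return false;
  }
}

// The current shell-form line and the proposed exec-form line from this issue.
const shellForm =
  "CMD node server.js 2>&1 | tee -a /var/log/api-proxy/api-proxy.log";
const execForm = 'CMD ["node", "server.js"]';

console.log(isExecForm(shellForm)); // false
console.log(isExecForm(execForm)); // true
```

A test could read containers/api-proxy/Dockerfile, pick out the CMD line, and assert that this check passes.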

2. awf-squid: Missing shutdown_lifetime configuration

Squid's entrypoint correctly uses exec squid -N -d 1 (squid IS PID 1 and receives SIGTERM). However, Squid's default shutdown_lifetime is 30 seconds: after SIGTERM it waits that long for active connections to drain before exiting. Docker's 10s grace period expires first, so squid is killed with SIGKILL before its graceful shutdown completes.

shutdown_lifetime is not set anywhere in the generated squid config.

Fix: Add shutdown_lifetime 0 to generated squid.conf in src/squid-config.ts (ephemeral proxy, no need for connection draining).
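
As a rough illustration of the change, the sketch below appends the directive to an already generated config string. The function name withFastShutdown and the sample config are assumptions for illustration only; the real generator in src/squid-config.ts may build its output quite differently.

```typescript
// Hypothetical helper (assumed, not the actual code in src/squid-config.ts):
// appends "shutdown_lifetime 0" unless the directive is already present.
function withFastShutdown(squidConf: string): string {
  const alreadySet = squidConf
    .split("\n")
    .some((line) => /^\s*shutdown_lifetime\s/.test(line));
  if (alreadySet) return squidConf;

  // Make sure the existing config ends with a newline before appending.
  const base =
    squidConf === "" || squidConf.endsWith("\n") ? squidConf : squidConf + "\n";
  return base + "shutdown_lifetime 0\n";
}

// Sample input is illustrative only.
const generated = "http_port 3128\n";
console.log(withFastShutdown(generated));
```

The helper is idempotent, so calling it on a config that already sets shutdown_lifetime leaves that config unchanged.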

3. No stop_grace_period in docker-compose config

stopContainers() in docker-manager.ts calls docker compose down -v with no --timeout flag, and no service has stop_grace_period set.

Fix: Add stop_grace_period: 2s to squid and api-proxy service definitions as a safety net.
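
One way this could look, assuming the compose config is built as a plain object before being serialized to YAML. The ComposeService type, the withStopGracePeriod helper, the service keys "squid" and "api-proxy", and the image names are all assumptions for illustration; only the stop_grace_period attribute itself is standard Compose.

```typescript
// Assumed minimal shape of a compose service entry (hypothetical type).
interface ComposeService {
  image?: string;
  stop_grace_period?: string;
  [key: string]: unknown;
}

type ComposeServices = Record<string, ComposeService>;

// Hypothetical helper: returns a copy of the services map with
// stop_grace_period set on the named services. Unknown names are ignored.
function withStopGracePeriod(
  services: ComposeServices,
  names: string[],
  period = "2s",
): ComposeServices {
  const result: ComposeServices = { ...services };
  for (const name of names) {
    const service = result[name];
    if (service) {
      result[name] = { ...service, stop_grace_period: period };
    }
  }
  return result;
}

// Service keys and images below are illustrative, not taken from the repository.
const services: ComposeServices = {
  squid: { image: "example/squid" },
  "api-proxy": { image: "example/api-proxy" },
  agent: { image: "example/agent" },
};

const patched = withStopGracePeriod(services, ["squid", "api-proxy"]);
console.log(JSON.stringify(patched, null, 2));
```

With the first two fixes in place the containers should exit well inside the 2s window, so this setting only bounds the delay if one of them regresses.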

Expected Improvement

Saves ~10s per AWF invocation, or ~20s per workflow run (main agent + threat detection).

Files to Change

  • containers/api-proxy/Dockerfile — exec form CMD
  • src/squid-config.ts — add shutdown_lifetime 0
  • src/docker-manager.ts — add stop_grace_period to service definitions
  • Tests as needed
