Improve managed-service liveness handling for OpenClaw startup stalls and freezes #160

@mostlydev

Problem

Live desks can leave an OpenClaw container stuck in a bad state where the process remains up, Docker does not restart it, and scheduled invocations begin accumulating failures.

Current failure chain:

  • OpenClaw startup can be slow enough that the scheduler reaches it before the runner is actually healthy.
  • claw-api currently treats wake execution as an immediate blocking action and records wake failures when the target is not yet healthy.
  • The OpenClaw driver's compose healthcheck and claw health probe shell into openclaw health --json, but a persistently unhealthy gateway does not automatically exit the container, so Docker restart policy never fires.
  • PostApply only checks that the container is running, not that the gateway has actually become healthy.

Scope

  1. Make claw-api health-aware before waking a managed service so startup lag does not count as a schedule failure.
  2. Improve OpenClaw startup verification so claw up -d waits for real liveness, not just a running PID.
  3. Add a liveness mechanism that turns persistent OpenClaw health failure into container exit so Docker restart policy can recover the service.
  4. Clarify ownership of runner-native cron state so agents do not treat OpenClaw-native cron as durable infrastructure state.
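
One possible shape for item 3 is a small watchdog that runs next to the gateway, counts consecutive failed health probes, and exits once a threshold is reached so that Docker's restart policy takes over. The sketch below is an assumption about how that could work, not a description of existing code: failureTracker, watch, probeCLI, and the threshold are hypothetical, and treating a non-zero exit status or timeout from openclaw health --json as "unhealthy" is also an assumption.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// failureTracker counts consecutive failed probes; any success resets it.
type failureTracker struct {
	threshold int
	failures  int
}

// observe records one probe result and reports whether the number of
// consecutive failures has reached the threshold.
func (t *failureTracker) observe(healthy bool) bool {
	if healthy {
		t.failures = 0
		return false
	}
	t.failures++
	return t.failures >= t.threshold
}

// probeCLI runs the health command named in this issue. Interpreting any
// error (non-zero exit, timeout, missing binary) as unhealthy is an assumption.
func probeCLI(timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return exec.CommandContext(ctx, "openclaw", "health", "--json").Run() == nil
}

// watch probes once per interval and calls exit(1) after threshold
// consecutive failures. In a container, exit would be os.Exit.
func watch(probe func() bool, interval time.Duration, threshold int, exit func(int)) {
	t := &failureTracker{threshold: threshold}
	for {
		if t.observe(probe()) {
			exit(1)
			return
		}
		time.Sleep(interval)
	}
}

func main() {
	// Demo with a scripted probe instead of probeCLI: one success, then failures.
	results := []bool{true, false, false, false}
	i := 0
	probe := func() bool { r := results[i%len(results)]; i++; return r }

	watch(probe, time.Millisecond, 3, func(code int) {
		fmt.Println("exit", code, "after", i, "probes") // prints: exit 1 after 4 probes
	})
}
```

Requiring consecutive failures rather than a single one keeps a slow but recovering startup from triggering a restart loop. The threshold and interval would need tuning against real OpenClaw startup times.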

Likely files

  • cmd/claw-api/scheduler.go
  • cmd/claw-api/main.go or adjacent runtime loop code
  • internal/driver/openclaw/driver.go
  • internal/driver/openclaw/baseimage.go
  • internal/driver/shared/clawdapus_md.go

Verification

  • targeted Go tests for scheduler health-gating and OpenClaw post-apply/health behavior
  • integration/spike coverage showing startup lag is skipped rather than counted as failure
  • verification that a persistently unhealthy OpenClaw gateway causes container restart under the existing restart policy
