Spike should verify compiled timezone reaches model-visible context #146

@mostlydev

Summary

Add an end-to-end spike that proves a compiled service timezone reaches model-visible context, alongside existing feed injection, rather than relying only on unit coverage.

Why this is separate from #135

Issue #135 fixes schedule/runtime compilation so pod-origin invokes and driver-native config stop hardcoding UTC.

This issue is about spike coverage across multiple already-existing features:

  • pod/service TZ resolution during claw up
  • cllama agent metadata emission (metadata.json)
  • cllama current-time injection (Current time: ...)
  • feed injection
  • session-history / Discord-visible model output

The risk here is integration drift even if the individual unit tests still pass.

Existing behavior worth leveraging

Current cllama code already injects a Current time: ... line into both OpenAI and Anthropic request context, using metadata.json.timezone first and then falling back to env / UTC.

Relevant code:

  • cmd/claw/compose_up.go compiles per-agent metadata including timezone
  • cllama/internal/proxy/time_context.go
  • cllama/internal/proxy/handler.go
  • cllama/internal/proxy/handler_test.go already covers feed + time injection at unit level
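
The fallback order described above (metadata.json.timezone first, then env, then UTC) can be sketched roughly as follows. This is an illustration only, not the code in time_context.go: resolveLocation and currentTimeLine are hypothetical names, the env source is assumed to be the TZ variable, and the timestamp layout is made up for the example.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works without system zoneinfo
)

// resolveLocation is a hypothetical helper. It prefers the timezone from
// agent metadata, then an env-provided value (assumed here to come from
// os.Getenv("TZ") at the call site), and finally falls back to UTC.
// Empty or unknown zone names are skipped rather than treated as errors.
func resolveLocation(metadataTZ, envTZ string) *time.Location {
	for _, name := range []string{metadataTZ, envTZ} {
		if name == "" {
			continue
		}
		if loc, err := time.LoadLocation(name); err == nil {
			return loc
		}
	}
	return time.UTC
}

// currentTimeLine renders a "Current time: ..." line in the given location.
// The layout is an assumption; the real injected format may differ.
func currentTimeLine(now time.Time, loc *time.Location) string {
	return fmt.Sprintf("Current time: %s", now.In(loc).Format("2006-01-02 15:04 MST"))
}
```

The point of the spike is that a silent fall-through to the last branch (UTC) is invisible at this level unless something downstream checks which zone actually reached the prompt.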

What is missing is an end-to-end spike proving the compiled timezone survives through the actual runtime path and is reflected in model-visible output.

Goal

Extend spike coverage so at least one live runtime verifies all of the following in a single session:

  1. compiled TZ is not lost between claw up and cllama context generation
  2. cllama injects current time using that timezone
  3. feed injection is still present in the same effective prompt
  4. the model's reply reflects the expected timezone rather than silently drifting back to UTC
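
For point 1, one cheap check is to read the compiled metadata.json back before any model call and compare its timezone field with the value set on the service. A minimal sketch, assuming the file has a top-level "timezone" key (as metadata.json.timezone suggests); agentMetadata and checkCompiledTimezone are hypothetical names, and locating the file on disk is left to the existing spike helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// agentMetadata models only the field this check cares about.
// Any other keys in metadata.json are ignored by the decoder.
type agentMetadata struct {
	Timezone string `json:"timezone"`
}

// checkCompiledTimezone parses the raw metadata.json bytes and returns an
// error if the timezone is missing or differs from the expected zone name.
func checkCompiledTimezone(raw []byte, want string) error {
	var md agentMetadata
	if err := json.Unmarshal(raw, &md); err != nil {
		return fmt.Errorf("parse metadata.json: %w", err)
	}
	if md.Timezone != want {
		return fmt.Errorf("metadata.json timezone = %q, want %q", md.Timezone, want)
	}
	return nil
}
```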

Recommended shape

Do not overload the existing rollcall runtime-conformance assertions with brittle exact-clock matching.

Instead, add a focused spike phase or sibling spike that:

  • uses a single-service pod or a very small matrix (prefer one real runtime first)
  • sets an explicit non-UTC TZ such as America/New_York
  • enables a deterministic feed with known content
  • asks the agent to include both:
    • a feed-derived marker
    • its current local timezone / time context
  • asserts on stable signals:
    • feed marker is present
    • timezone is local (America/New_York, EDT, or EST) rather than UTC
    • optional: a second turn in the same session preserves the same timezone basis
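
The stable-signal assertions listed above could look something like this. It is a sketch under assumptions: checkReply is a hypothetical helper, the feed marker is whatever known string the deterministic feed emits, and the accepted zone tokens are the three named in this issue. It deliberately does not reject replies that merely mention UTC (for example "EDT (UTC-4)"), since that would be brittle across providers.

```go
package main

import (
	"errors"
	"regexp"
	"strings"
)

// localZone matches any of the accepted local timezone spellings as whole tokens.
var localZone = regexp.MustCompile(`\b(America/New_York|EDT|EST)\b`)

// checkReply verifies the two stable signals in a model reply: the feed-derived
// marker is present, and at least one local timezone token appears.
// It returns nil on success, or one error listing every failed signal.
func checkReply(reply, feedMarker string) error {
	var problems []string
	if !strings.Contains(reply, feedMarker) {
		problems = append(problems, "feed marker "+feedMarker+" not found")
	}
	if !localZone.MatchString(reply) {
		problems = append(problems, "no local timezone token (America/New_York, EDT, EST) found")
	}
	if len(problems) > 0 {
		return errors.New(strings.Join(problems, "; "))
	}
	return nil
}
```

For the optional second turn, the same helper can be applied to the follow-up reply in the same session.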

Non-goals

  • Do not assert exact minute equality against wall-clock time unless the prompt is specifically engineered for exact echoing. That is likely to be flaky across providers.
  • Do not expand this into a full multi-runtime matrix immediately unless the single-runtime spike is stable.
  • Do not make #135 ("Scheduler defaults pod-origin invokes to UTC instead of service TZ") depend on this spike to merge.

Acceptance criteria

  • New spike coverage exists for model-visible timezone context
  • The spike also exercises feed injection in the same request path
  • The test fails if timezone compilation regresses back to UTC
  • The test produces actionable failure output (response text and/or session-history excerpt)
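
One possible shape for the actionable failure output in the last criterion: a formatter that bundles the reason, a bounded copy of the reply, and the tail of the session history. failureReport and the 400-rune cap are assumptions for illustration; how session-history lines are collected depends on the existing spike helper stack.

```go
package main

import (
	"fmt"
	"strings"
)

// maxReplyRunes bounds how much of the model reply is echoed into test output.
const maxReplyRunes = 400

// failureReport builds a multi-line message containing the failure reason,
// the (possibly truncated) reply, and the last `tail` session-history lines.
func failureReport(reason, reply string, history []string, tail int) string {
	if r := []rune(reply); len(r) > maxReplyRunes {
		reply = string(r[:maxReplyRunes]) + "..."
	}
	if tail < 0 {
		tail = 0
	}
	if len(history) > tail {
		history = history[len(history)-tail:]
	}
	var b strings.Builder
	fmt.Fprintf(&b, "timezone/feed spike failed: %s\n", reason)
	fmt.Fprintf(&b, "reply: %q\n", reply)
	fmt.Fprintf(&b, "session history (last %d lines):\n", len(history))
	for _, line := range history {
		fmt.Fprintf(&b, "  %s\n", line)
	}
	return b.String()
}
```

Inside a Go test this would typically be passed to t.Fatal, so a regression to UTC shows the offending reply directly in CI logs.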

Candidate implementation points

  • cmd/claw/spike_rollcall_test.go
  • examples/rollcall/
  • or a new dedicated spike fixture if keeping rollcall narrowly runtime-focused is cleaner

Open question

Whether to extend TestSpikeRollCall with a dedicated timezone/feed subtest, or introduce a sibling spike that reuses the same helper stack but has its own fixture and assertions.
