RSO
approved these changes
Feb 4, 2026
iscekic
added a commit
that referenced
this pull request
Feb 6, 2026
## Summary

Adds a new `cloudflare-o11y` Cloudflare Worker for LLM API observability, and wires the Kilo Gateway to emit metrics to it on every request.

## What it does

### New `cloudflare-o11y` worker

- **Metrics ingestion** (`POST /ingest/api-metrics`): Accepts per-request metrics (provider, model, timing, status code, tokens, tools used, etc.) authenticated via Cloudflare Secrets Store.
- **PostHog forwarding**: Captures every ingested metric as a PostHog event for analytics, forwarding the user's real IP for GeoIP resolution.
- **Analytics Engine storage**: Writes each data point to Cloudflare Analytics Engine (provider, model, client, error flag, TTFB, complete request duration, status code) for time-series alerting queries.
- **SLO-based alerting**: Cron-triggered (every minute) multi-window burn-rate alerting following [Google SRE Workbook approach #6](https://sre.google/workbook/alerting-on-slos/):
  - Three windows: 5m/1m page (14.4x), 30m/3m page (6x), 360m/30m ticket (1x)
  - Evaluates error rate (99.9% SLO) and latency (p50 at 5s, p90 at 15s)
  - Pages only fire for recommended models on `kilo-gateway`; everything else is downgraded to ticket
  - KV-based dedup with severity-aware suppression (pages suppress tickets for the same dimension)
  - Slack notifications via separate page/ticket webhooks

### Gateway integration (Next.js app)

- The OpenRouter proxy route now emits metrics after each upstream response, including TTFB, tools available/used, token counts, user/org context, and IP address.
- The response is cloned and drained asynchronously (via `after()`) to measure full request duration without blocking the client.
- New `GET /api/recommended-models` endpoint (ISR, 1h revalidate) exposes the preferred models list for the o11y worker to determine page eligibility.

### CI/CD

- New `deploy-o11y.yml` reusable workflow for manual and automated deploys.
- Production deploy workflow auto-deploys the o11y worker when `cloudflare-o11y/**` changes.
- Adds `fetch-depth: 2` to existing checkout steps so `dorny/paths-filter` works correctly.

## Changed files outside `cloudflare-o11y/`

| File | Change |
|------|--------|
| `.github/workflows/deploy-o11y.yml` | New deploy workflow |
| `.github/workflows/deploy-production.yml` | Auto-deploy o11y on prod, fix fetch-depth |
| `src/app/api/openrouter/[...path]/route.ts` | Emit metrics on every proxied request |
| `src/app/api/recommended-models/route.ts` | New endpoint for recommended models list |
| `pnpm-workspace.yaml` | Register `cloudflare-o11y` as workspace package |
| `pnpm-lock.yaml` | Lockfile update |
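
For reviewers unfamiliar with multi-window burn-rate alerting, the error-rate check described under "SLO-based alerting" can be sketched roughly as below. This is an illustration only, not the worker's actual code: the names `BurnRateWindow`, `WindowCounts`, `burnRate`, and `evaluate` are hypothetical, and the sketch assumes the two numbers in each window pair are a long and a short lookback in minutes, both of which must exceed the burn-rate threshold before an alert fires. Only the window sizes, thresholds, the 99.9% SLO, and the page-to-ticket downgrade come from the description above.

```typescript
// Hypothetical sketch; names and structure are assumptions, not the worker's real code.
type Severity = "page" | "ticket";

interface BurnRateWindow {
  longMinutes: number;  // long lookback window
  shortMinutes: number; // short lookback window (confirms the problem is still ongoing)
  threshold: number;    // burn-rate multiple that must be exceeded in both windows
  severity: Severity;
}

// Windows and thresholds as listed in the PR description.
const WINDOWS: BurnRateWindow[] = [
  { longMinutes: 5, shortMinutes: 1, threshold: 14.4, severity: "page" },
  { longMinutes: 30, shortMinutes: 3, threshold: 6, severity: "page" },
  { longMinutes: 360, shortMinutes: 30, threshold: 1, severity: "ticket" },
];

const SLO = 0.999;            // 99.9% success target from the description
const ERROR_BUDGET = 1 - SLO; // allowed error fraction

interface WindowCounts {
  total: number;
  errors: number;
}

// Burn rate = observed error rate divided by the error budget.
function burnRate(counts: WindowCounts): number {
  if (counts.total === 0) return 0; // no traffic, nothing burned
  return counts.errors / counts.total / ERROR_BUDGET;
}

// Returns the most severe alert that should fire, or null if none.
// `getCounts` stands in for a query over the last N minutes of stored data points.
// `pageEligible` models "recommended model on kilo-gateway"; anything else is downgraded.
function evaluate(
  getCounts: (minutes: number) => WindowCounts,
  pageEligible: boolean,
): Severity | null {
  let result: Severity | null = null;
  for (const w of WINDOWS) {
    const longBurn = burnRate(getCounts(w.longMinutes));
    const shortBurn = burnRate(getCounts(w.shortMinutes));
    if (longBurn >= w.threshold && shortBurn >= w.threshold) {
      const severity: Severity =
        w.severity === "page" && !pageEligible ? "ticket" : w.severity;
      if (severity === "page") return "page"; // a page outranks any ticket
      result = "ticket";
    }
  }
  return result;
}
```

Requiring both windows to exceed the threshold is what lets an alert clear quickly once errors stop: the short window drops below the threshold even while the long window still reflects the earlier spike.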
kilo-code-bot Bot
added a commit
that referenced
this pull request
Feb 22, 2026
…ation

The old cloud-agent flow's processAnalysisStream() writes the raw markdown analysis content to the analysis field in security_findings on stream completion. This field powers the user-facing summary shown when clicking on an auto-dismissed finding. The callback-based cloud-agent-next flow must explicitly replicate this.

Updates the plan to:
- Add a 'Critical requirement' section under Background explaining the gap
- Add step 4a in Phase 2.2 (handleAnalysisCompleted) to write raw markdown to the analysis field before running Tier 3
- Add risk item #6 about the analysis field migration gap
- Add verification step for the analysis field in PR 2
jeanduplessis
pushed a commit
that referenced
this pull request
Feb 23, 2026
…ation

The old cloud-agent flow's processAnalysisStream() writes the raw markdown analysis content to the analysis field in security_findings on stream completion. This field powers the user-facing summary shown when clicking on an auto-dismissed finding. The callback-based cloud-agent-next flow must explicitly replicate this.

Updates the plan to:
- Add a 'Critical requirement' section under Background explaining the gap
- Add step 4a in Phase 2.2 (handleAnalysisCompleted) to write raw markdown to the analysis field before running Tier 3
- Add risk item #6 about the analysis field migration gap
- Add verification step for the analysis field in PR 2
Summary
/admin/app-builder/[id])sessionIdURL parameter in Session Traces page for shareable links