[refactor(infra)] Restructure Grafana dashboards and harden production compose#15
Conversation
…alidation logging
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughAdds Prometheus instrumentation and HTTP metrics middleware; instruments API key, rate-limiter, and contribution service; replaces two Grafana dashboards with three new ones; updates Makefile/cloud-init/Terraform to presign and provision new dashboards; and reworks docker-compose for production (adds Prometheus/Grafana/Caddy, WireGuard device caps). ChangesApplication Observability and Infrastructure
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/middleware/metrics.rs (1)
11-15: ⚖️ Poor tradeoffReduce allocations by extracting as borrowed strings.
The current approach allocates
methodandpathas ownedStrings early, which are then cloned for the histogram and moved into the counter—resulting in 3 unnecessary allocations per label. Since this middleware executes on every HTTP request, the overhead compounds.Consider extracting as
&strand letting the metric macros handle conversion:let method = req.method().as_str();This avoids the upfront allocation and clone.
⚡ Proposed optimization to reduce allocations
- let method = req.method().to_string(); - let path = req.extensions().get::<MatchedPath>().map_or_else( - || req.uri().path().to_string(), - |matched_path| matched_path.as_str().to_string(), - ); + let method = req.method().as_str(); + let path = req.extensions() + .get::<MatchedPath>() + .map(|m| m.as_str()) + .unwrap_or_else(|| req.uri().path()); let start = Instant::now(); let response = next.run(req).await; let latency = start.elapsed().as_secs_f64(); - let status = response.status().as_u16().to_string(); + let status = response.status().as_u16(); histogram!( "http_request_duration_seconds", - "method" => method.clone(), - "path" => path.clone(), - "status" => status.clone() + "method" => method, + "path" => path, + "status" => status.to_string() ) .record(latency); counter!( "http_requests_total", "method" => method, "path" => path, - "status" => status + "status" => status.to_string() ) .increment(1);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/middleware/metrics.rs` around lines 11 - 15, The code currently eagerly allocates Strings for method and path causing extra clones when used as labels; change the extraction to borrow &str from the request: use req.method().as_str() for method and use req.extensions().get::<MatchedPath>().map_or_else(|| req.uri().path(), |m| m.as_str()) for path so you pass &str into the histogram/counter macros (allowing the metric macros to convert to owned strings only when needed) while leaving identifiers method, path, MatchedPath, and req in place.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docker-compose.yml`:
- Line 37: The volume entry "Caddyfile:/etc/caddy/Caddyfile:ro" is using a named
volume instead of a bind mount; update the docker-compose service's volumes to
bind the local Caddyfile path (e.g., ./Caddyfile or the repo-relative path) to
/etc/caddy/Caddyfile:ro so Docker Compose mounts the actual file into the Caddy
container; locate the volumes list for the Caddy service (the line containing
"Caddyfile:/etc/caddy/Caddyfile:ro") and replace the left-hand side with the
correct host path (for example ./Caddyfile) to fix the mount.
- Line 48: Prometheus mount path is incorrect and PRODUCTION_DOMAIN is missing:
update the Prometheus volume mapping in the docker-compose service to point to
the actual file by removing the extra "prometheus/" segment in the source path
(the volume line referencing the Prometheus config under the Prometheus
service), and add the environment variable
PRODUCTION_DOMAIN=REPLACE_WITH_PRODUCTION_DOMAIN to .env.example (and ensure it
is set in .env) so the Grafana service can expand ${PRODUCTION_DOMAIN}; adjust
the docker-compose Grafana service environment if needed to reference
PRODUCTION_DOMAIN.
In `@Makefile`:
- Line 191: The Makefile contains inconsistent escaping in the tr -d command:
change the double-escaped backslash sequences ("tr -d '\\r'") to the
single-escaped form ("tr -d '\r'") so they match the earlier usages; update each
occurrence that constructs TF_VAR_API_HEALTH_DASHBOARD_URL (and the other two
identical presign/tr pipelines) to use tr -d '\r' to ensure consistent shell
behavior when stripping CR characters from the aws s3 presign output.
---
Nitpick comments:
In `@src/middleware/metrics.rs`:
- Around line 11-15: The code currently eagerly allocates Strings for method and
path causing extra clones when used as labels; change the extraction to borrow
&str from the request: use req.method().as_str() for method and use
req.extensions().get::<MatchedPath>().map_or_else(|| req.uri().path(), |m|
m.as_str()) for path so you pass &str into the histogram/counter macros
(allowing the metric macros to convert to owned strings only when needed) while
leaving identifiers method, path, MatchedPath, and req in place.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e4233bec-6076-4c40-8289-2fcdbe9814fa
📒 Files selected for processing (18)
Makefiledocker-compose.ymlinfra/common/cloud-init.templateinfra/common/grafana/provisioning/dashboards/api-health.jsoninfra/common/grafana/provisioning/dashboards/captcha-security.jsoninfra/common/grafana/provisioning/dashboards/domain-services.jsoninfra/common/grafana/provisioning/dashboards/security-overview.jsoninfra/common/grafana/provisioning/dashboards/ytdlp-health.jsoninfra/digitalocean/accounts/naduns-team/main.tfinfra/digitalocean/accounts/naduns-team/variables.tfinfra/digitalocean/components/locals.tfinfra/digitalocean/components/variables.tfsrc/app.rssrc/middleware/api_key.rssrc/middleware/metrics.rssrc/middleware/mod.rssrc/middleware/rate_limit.rssrc/services/contributions.rs
💤 Files with no reviewable changes (2)
- infra/common/grafana/provisioning/dashboards/ytdlp-health.json
- infra/common/grafana/provisioning/dashboards/captcha-security.json
There was a problem hiding this comment.
🧹 Nitpick comments (1)
Caddyfile (1)
10-12: ⚡ Quick winConsider adding
header -Serverfor consistency.Caddyfile.local includes
header -Serverbefore its redirect directive (line 15), but the production:80redirect block omits it. For consistency and defense-in-depth, consider adding it here as well.♻️ Suggested addition
:80 { + header -Server redir https://${PRODUCTION_DOMAIN}{uri} }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Caddyfile` around lines 10 - 12, The :80 redirect block in the Caddyfile currently only performs redirection (the block containing ":80 { redir https://${PRODUCTION_DOMAIN}{uri} }"); add the same header removal used in Caddyfile.local by inserting "header -Server" into this :80 block before the redir directive so the Server header is stripped for consistency and defense-in-depth.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@Caddyfile`:
- Around line 10-12: The :80 redirect block in the Caddyfile currently only
performs redirection (the block containing ":80 { redir
https://${PRODUCTION_DOMAIN}{uri} }"); add the same header removal used in
Caddyfile.local by inserting "header -Server" into this :80 block before the
redir directive so the Server header is stripped for consistency and
defense-in-depth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 909aae08-a97e-48be-ab49-301ab53010a2
📒 Files selected for processing (3)
CaddyfileCaddyfile.localinfra/common/cloud-init.template
🚧 Files skipped from review as they are similar to previous changes (1)
- infra/common/cloud-init.template
Description
Replaces the
ytdlp-healthandcaptcha-securityGrafana dashboards with three purpose-built dashboards:api-health,security-overview, anddomain-services. Updates the productiondocker-compose.ymlto enable Caddy, Prometheus, and Grafana as first-class services, and wires corresponding Terraform variables and cloud-init provisioning steps throughout the infra stack.Key additions:
api-health.json,security-overview.json, anddomain-services.jsonGrafana dashboards; removeytdlp-health.jsonandcaptcha-security.jsonmetricsmiddleware (track_http_metrics) recordinghttp_requests_totalandhttp_request_duration_secondsper method/path/statusapi_key.rs(auth_api_key_check_total) andrate_limit.rs(rate_limit_checks_totalwith tier/status labels)github_contributions_fetch_total,github_cache_size,github_api_duration_seconds) incontributions.rsdocker-compose.ymlwith proper env/volume bindingsYTDLP_DASHBOARD_URLandCAPTCHA_SECURITY_DASHBOARD_URLTerraform variables withAPI_HEALTH_DASHBOARD_URL,SECURITY_OVERVIEW_DASHBOARD_URL, andDOMAIN_SERVICES_DASHBOARD_URLacross all relevant.tffiles andMakefileTypes of changes
Checklist
.envupdated with new dashboard URL variable namesSummary by CodeRabbit
New Features
Updates
Removed