
[refactor(config)] Harden config loading, enforce anti-corruption layer, and optimize service internals #12

Merged
nxdun merged 18 commits into main from nadun/contribution-caching on Apr 27, 2026

Conversation

@nxdun
Owner

@nxdun nxdun commented Apr 26, 2026

Description

This PR refactors the backend to enforce architectural integrity across config, models, middleware, and services. AppConfig::from_env() now returns Result<Self, ConfigError> instead of panicking, and the master API key is made private with constant-time validation via check_api_key. Internal domain models are separated from external DTOs, and zero-allocation patterns using Cow<'static, str> are applied throughout.

Key additions:

  • Refactor AppConfig::from_env() to return Result<Self, ConfigError> and handle failure gracefully in app.rs
  • Add ConfigError typed enum for missing/invalid env vars, removing all process::exit calls from config logic
  • Make master_api_key private; centralize key validation in check_api_key using constant_time_eq
  • Separate ytdlp and contributions domain models from DTOs (ytdlp.rs/ytdlp_dto.rs, contributions.rs, github_dto.rs)
  • Add YtdlpJobResponse to filter internal fields (output_dir, format_flag) from API responses
  • Add AppError::UpstreamError variant mapping to 502 Bad Gateway; reclassify captcha and GitHub errors
  • Replace static strings and heap-allocated labels with Cow<'static, str> across models and services
  • Centralize X_API_KEY and X_CAPTCHA_TOKEN as typed HeaderName constants in middleware/mod.rs
  • Optimize transform_calendar to single-pass legend min/max calculation with pre-allocated Vec
  • Add GEMINI.md engineering standards document covering architecture, security, performance, and workflow
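A minimal sketch of the fallible loading and centralized key check described above. The names `AppConfig`, `ConfigError`, `from_env`, `new`, and `check_api_key` come from this PR; the single-field layout is illustrative, and the hand-rolled comparison stands in for the `constant_time_eq` crate the PR actually uses:

```rust
use std::env;

/// Typed configuration errors, replacing process::exit calls.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingVar(&'static str),
}

pub struct AppConfig {
    master_api_key: String, // private: only reachable through check_api_key
}

impl AppConfig {
    /// Direct constructor; the PR's test fixtures use AppConfig::new(...).
    pub fn new(master_api_key: String) -> Self {
        Self { master_api_key }
    }

    /// Fallible loading: the caller decides how to handle the failure
    /// instead of the config module panicking or exiting.
    pub fn from_env() -> Result<Self, ConfigError> {
        let master_api_key = env::var("MASTER_API_KEY")
            .map_err(|_| ConfigError::MissingVar("MASTER_API_KEY"))?;
        Ok(Self::new(master_api_key))
    }

    /// Constant-time comparison: the result depends on every byte, so
    /// response timing does not reveal how many prefix bytes matched.
    pub fn check_api_key(&self, candidate: &str) -> bool {
        let (a, b) = (self.master_api_key.as_bytes(), candidate.as_bytes());
        a.len() == b.len()
            && a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
    }
}
```

With this shape, `app.rs` can match on `from_env()` and log a structured error before shutting down, rather than panicking mid-startup.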

Types of changes

  • Refactor

Checklist

  • Updated ChangeLog
  • All existing tests updated to use new AppConfig::new() constructor
  • Unit test updated to assert from_env() returns Err when MASTER_API_KEY is missing

Summary by CodeRabbit

  • New Features

    • Standardized env formatting; new runtime config vars (download, GitHub GraphQL, licensing, S3); typed DTOs for downloads and GitHub responses.
  • Bug Fixes

    • Startup fails on invalid config; structured validation errors; upstream errors surface with appropriate HTTP codes; improved contributions caching/TTL.
  • Security

    • Secrets masked in diagnostics; constant-time API key checks and safer header handling.
  • Documentation

    • Added engineering standards and API/docs clarifications.
  • Tests

    • Updated tests to reflect new config, DTOs, and response shapes.

@coderabbitai

coderabbitai Bot commented Apr 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Config loading becomes fallible with masked sensitive fields and accessors; introduces DTO/domain separation for GitHub and yt-dlp, moves public yt-dlp API to DTOs, maps upstream failures to a 502 AppError variant, centralizes header constants and constant-time checks, converts many Strings to Cow, and extends environment / Terraform variables.
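The "single-pass calendar transform" can be pictured as follows. The struct names `Day` and `Legend` are hypothetical stand-ins, not the PR's actual types; the point is computing both legend bounds in one traversal while keeping the label as a borrowed `Cow`:

```rust
use std::borrow::Cow;

struct Day {
    count: u32,
}

struct Legend {
    min: u32,
    max: u32,
    label: Cow<'static, str>, // borrowed static label: no heap allocation
}

/// One walk over the calendar yields both legend bounds, replacing
/// separate iterator passes for min and max.
fn transform_calendar(days: &[Day]) -> Legend {
    let (mut min, mut max) = (u32::MAX, 0u32);
    for d in days {
        min = min.min(d.count);
        max = max.max(d.count);
    }
    if days.is_empty() {
        min = 0;
    }
    Legend {
        min,
        max,
        label: Cow::Borrowed("contributions"),
    }
}
```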

Changes

  • Environment & Infra (.env.example): Reformats variables to KEY=value, replaces ytdlp-specific vars with DOWNLOAD_DIR / TF_VAR_DOWNLOAD_DIR, adds MASTER_API_KEY, GITHUB_PAT, GITHUB_USERNAME, GITHUB_GRAPHQL_URL, WARP_LICENSE_KEY, TF_VAR_MASTER_API_KEY, TF_VAR_WARP_LICENSE_KEY, and AWS_S3_BUCKET_NAME.
  • Engineering Standards (GEMINI.md): Adds engineering standards: DTO/domain separation, controller/service responsibilities, performance/security rules, typed config loading, AppError patterns, and tooling/doc expectations.
  • Config & App Init (src/config.rs, src/app.rs): AppConfig::from_env() now returns Result<..., ConfigError>; new ConfigError enum; AppConfig::new(...); sensitive fields made private with accessors (github_pat(), captcha_secret_key()); check_api_key() added; app exits on config load failure.
  • Models Index & DTOs (src/models/mod.rs, src/models/github_dto.rs, src/models/ytdlp_dto.rs): Renames exports to drop _model, adds github_dto and ytdlp_dto, and introduces typed GitHub and yt-dlp DTO structs plus a From<YtdlpJob> conversion.
  • Domain Models (src/models/contributions.rs, src/models/validation.rs, src/models/ytdlp.rs): Converts several String fields to Cow<'static, str>; removes public yt-dlp request/response types from the domain model; expands YtdlpJobStatus derives to include Deserialize/Copy/Eq.
  • Controllers & Routes (src/controllers/..., src/routes/api/v1/...): Controllers and imports updated to the new model/DTO modules; ytdlp endpoints return YtdlpJobResponse DTOs (SSE and serialization adjusted); contributions router renamed to router; handler docs added.
  • Services (src/services/contributions.rs, src/services/ytdlp/mod.rs): ContributionsService gains typed GitHub GraphQL request/response usage, cache TTL recalculation, a single-pass calendar transform using Cow, and upstream failures mapped to UpstreamError. Ytdlp manager: resolve_output_dir now returns AppError, reduced allocations in progress/redaction, refined error storage.
  • Middleware & Shared Helpers (src/middleware/mod.rs, src/middleware/api_key.rs, src/middleware/auth.rs, src/middleware/cors.rs, src/middleware/captcha.rs, src/middleware/rate_limit.rs): Adds exported header constants X_API_KEY/X_CAPTCHA_TOKEN and shared constant_time_eq; middleware delegates key checks to config.check_api_key(); captcha upstream failures map to UpstreamError/ServiceUnavailable; CORS uses header constants; rate limiter uses NonZeroU32::new(...).expect(...).
  • Error Handling & Extractors (src/error.rs, src/extractors/validated_json.rs): Adds AppError::UpstreamError(String) mapped to HTTP 502 with logging; ErrorResponse uses Cow<'static, str>; validation extractor serializes the structured error map to JSON with a safe fallback.
  • Tests & Test Helpers (tests/..., tests/api/common.rs, tests/layer_unit_tests.rs): Tests updated for module/DTO renames, use AppConfig::new(...) in fixtures, use the github_pat() accessor, assert AppConfig::from_env() returns Err when MASTER_API_KEY is missing, and adjust ytdlp tests to expect 503 and the new DTO shapes.
  • Small Controllers/Helpers (src/controllers/health_controller.rs, src/controllers/validation_controller.rs, src/controllers/api/v1/contributions_controller.rs, src/controllers/api/v1/ytdlp_controller.rs): Minor import and DTO usage changes, replaces some owned Strings with Cow::Borrowed, refines SSE/file-serving serialization and error construction, adds docs to handlers.
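The DTO/domain separation works by projection: the internal job struct keeps everything, and the response DTO simply omits what clients must not see. `output_dir` and `format_flag` are the internal fields named in this PR; the other fields and types here are illustrative:

```rust
/// Internal domain model: carries fields the API must never expose.
struct YtdlpJob {
    id: u64,
    status: String,
    output_dir: String,  // internal only
    format_flag: String, // internal only
}

/// Public DTO: the client-safe projection of a job.
#[derive(Debug, PartialEq)]
struct YtdlpJobResponse {
    id: u64,
    status: String,
}

impl From<YtdlpJob> for YtdlpJobResponse {
    fn from(job: YtdlpJob) -> Self {
        // output_dir and format_flag are simply not carried over,
        // so they cannot leak through serialization.
        Self {
            id: job.id,
            status: job.status,
        }
    }
}
```

Because the DTO has no internal fields at all, there is nothing for `#[serde(skip)]` annotations to forget.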

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Controller as ContributionsController
  participant Service as ContributionsService
  participant Cache as Cache
  participant GitHub as GitHub GraphQL API

  Client->>Controller: GET /api/v1/contributions?username?
  Controller->>Service: get_contributions(username?)
  Service->>Cache: lookup(username)
  alt cache hit
    Cache-->>Service: cached ContributionsResponse
    Service-->>Controller: ContributionsResponse
    Controller-->>Client: 200 JSON
  else cache miss
    Service->>GitHub: POST GithubGqlRequest (typed DTO)
    GitHub-->>Service: 200 + data / errors
    alt success with data
      Service->>Service: transform calendar -> ContributionsResponse (single-pass, Cow)
      Service->>Cache: insert(response, ttl = min(3h, until_midnight))
      Service-->>Controller: ContributionsResponse
      Controller-->>Client: 200 JSON
    else upstream failure
      Service-->>Controller: AppError::UpstreamError
      Controller-->>Client: 502 JSON
    end
  end
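The `ttl = min(3h, until_midnight)` step in the diagram can be sketched as below. The real service presumably derives the time until midnight from the wall clock; the helper here takes it as a parameter so the capping logic stands alone:

```rust
use std::time::Duration;

/// Upper bound from the diagram: entries live at most three hours.
const MAX_TTL: Duration = Duration::from_secs(3 * 60 * 60);

/// ttl = min(3h, time until midnight), so a cached calendar never
/// survives the day boundary where contribution data rolls over.
fn cache_ttl(secs_since_midnight: u64) -> Duration {
    let until_midnight = Duration::from_secs(86_400 - secs_since_midnight);
    MAX_TTL.min(until_midnight)
}
```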

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐇
I nibble envs and guard keys tight,
DTOs hop in with tidy sight,
Cache at dusk, no needless clone,
Constant-time checks keep secrets known,
A rabbit’s patch — clean, quick, and bright.

🚥 Pre-merge checks: ✅ 5 passed

  • Description Check: ✅ Passed (skipped; CodeRabbit's high-level summary is enabled).
  • Title Check: ✅ Passed. The title directly describes the main refactoring changes: hardening config loading, enforcing anti-corruption layers (DTO separation), and optimizing service internals, all core objectives of the PR.
  • Docstring Coverage: ✅ Passed. Coverage is 100.00%, above the required 80.00% threshold.
  • Linked Issues Check: ✅ Passed (skipped; no linked issues were found for this pull request).
  • Out of Scope Changes Check: ✅ Passed (skipped; no linked issues were found for this pull request).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@nxdun
Owner Author

nxdun commented Apr 26, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 26, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (9)
src/routes/api/v1/mod.rs (1)

9-12: Add Rustdoc to the public router constructor.

pub fn router(...) is public-facing and should have a /// doc comment describing intent/contract.

Suggested patch
 #[allow(unreachable_pub)]
+/// Builds API v1 routes and mounts feature sub-routers.
 pub fn router(state: AppState) -> Router<AppState> {
     Router::new()
         .merge(ytdlp_routes::router(state.clone()))
         .nest("/api/v1/contributions", contributions_routes::router(state))
 }

As per coding guidelines "All public-facing methods and services must have /// (Rustdoc) comments explaining intent and behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/api/v1/mod.rs` around lines 9 - 12, Add a Rustdoc comment for the
public constructor pub fn router(state: AppState) -> Router<AppState> that
explains its intent and contract: describe what routes are exposed (it merges
ytdlp_routes::router and nests contributions_routes::router under
"/api/v1/contributions"), document the state parameter (AppState) and
ownership/clone behavior, and state what the returned Router<AppState>
represents and any invariants (e.g., path prefixing or required middleware).
Ensure the comment starts with /// immediately above the function signature.
tests/api/validation_tests.rs (1)

63-63: Strengthen this assertion to validate structure, not a substring.

This check is a bit permissive and may pass for the wrong reason. Prefer asserting the validation payload shape/content instead of only checking that "messages" appears in text.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/validation_tests.rs` at line 63, The current assertion only checks
for the substring "messages"; instead assert the JSON structure explicitly by
verifying body["message"] is an object and contains a "messages" key, that
body["message"]["messages"] is an array (non-empty), and that elements have the
expected fields (e.g., check the first element has specific keys or types).
Update the test in validation_tests.rs to replace the contains(...) check with
assertions using body["message"].is_object(),
body["message"]["messages"].is_array(), an assert on messages array length > 0,
and one or two asserts on the first element's keys/types to validate payload
shape.
src/middleware/cors.rs (1)

53-54: Consider pre-allocating the matchers vector.

The capacity is known from raw_origins.len().

♻️ Suggested improvement
     // PRE-COMPUTE MATCHERS (Runs only once on startup)
-    let mut matchers = Vec::new();
+    let mut matchers = Vec::with_capacity(raw_origins.len());
     for allowed in raw_origins {

As per coding guidelines: "Pre-allocate vector capacities when the size is known or estimable."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/middleware/cors.rs` around lines 53 - 54, The matchers vector is grown
dynamically but its final size equals raw_origins.len(); change the
initialization of matchers from let mut matchers = Vec::new(); to pre-allocate
capacity using Vec::with_capacity(raw_origins.len()) so the loop that pushes
into matchers (referencing the matchers variable and raw_origins iterator)
avoids reallocations and follows the pre-allocation guideline.
tests/api/contributions_tests.rs (1)

82-87: Minor: Redundant fallbacks on values that are always Some.

Since github_pat (line 74) and github_username (line 75) are explicitly set to Some(...), the unwrap_or_default() and unwrap_or_else() fallbacks will never execute. Consider using .unwrap() or .expect() here to make the test's intent clearer—these are test-controlled values that should always be present.

♻️ Suggested simplification
     let contributions_service = Arc::new(ContributionsService::new(
         http_client.clone(),
-        config.github_pat.clone().unwrap_or_default(),
-        config
-            .github_username
-            .clone()
-            .unwrap_or_else(|| "nxdun".to_string()),
+        config.github_pat.clone().expect("github_pat set in test config"),
+        config.github_username.clone().expect("github_username set in test config"),
         config.github_graphql_url.clone(),
     ));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/contributions_tests.rs` around lines 82 - 87, The test is using
redundant fallbacks: replace config.github_pat.clone().unwrap_or_default() and
config.github_username.clone().unwrap_or_else(|| "nxdun".to_string()) with
direct unwrapping to reflect these are test-controlled Some values—use
config.github_pat.clone().unwrap() (or .expect("github_pat must be set")) and
config.github_username.clone().unwrap() (or .expect("github_username must be
set")) so intent is clear; leave config.github_graphql_url.clone() unchanged.
src/controllers/api/v1/ytdlp_controller.rs (1)

64-72: Consider pre-allocating the response vector.

Per coding guidelines, pre-allocate vector capacities when the size is known.

♻️ Suggested optimization
     let jobs = state.ytdlp_manager.list_jobs();
-    let response_jobs: Vec<YtdlpJobResponse> =
-        jobs.into_iter().map(YtdlpJobResponse::from).collect();
+    let mut response_jobs = Vec::with_capacity(jobs.len());
+    response_jobs.extend(jobs.into_iter().map(YtdlpJobResponse::from));
     Ok((
         StatusCode::OK,
         Json(YtdlpListResponse {
             jobs: response_jobs,
         }),
     ))

As per coding guidelines: "Pre-allocate vector capacities when the size is known or estimable."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/controllers/api/v1/ytdlp_controller.rs` around lines 64 - 72, The
response vector is built from a known-size source
(state.ytdlp_manager.list_jobs()), so pre-allocate capacity to avoid
reallocations: obtain the length of jobs, create response_jobs with
Vec::with_capacity(jobs.len()), then iterate over jobs and push
YtdlpJobResponse::from(job) into response_jobs, and return YtdlpListResponse {
jobs: response_jobs } as before.
.env.example (1)

46-48: Note: Static analysis flagged formatting inconsistencies.

The dotenv-linter reports SpaceCharacter warnings (spaces around =) and UnorderedKey warnings (alphabetical ordering). These are stylistic and don't affect functionality, but you may want to address them for consistency if your project enforces .env linting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.env.example around lines 46 - 48, The .env.example has spacing around
equals and key order issues flagged by dotenv-linter; remove spaces around the
assignment operator for the TF_VAR entries (TF_VAR_DOWNLOAD_DIR,
TF_VAR_MASTER_API_KEY, TF_VAR_WARP_LICENSE_KEY) so they follow the project's
.env formatting (KEY="value" with no surrounding spaces) and reorder those keys
alphabetically to satisfy UnorderedKey warnings.
tests/layer_unit_tests.rs (1)

73-81: Test correctly validates from_env() error handling, but consider test isolation.

The test verifies that AppConfig::from_env() returns Err when MASTER_API_KEY is missing. However, std::env::remove_var affects the process-wide environment, which can cause flakiness when tests run in parallel.

Consider using #[serial] from the serial_test crate or running this test with -- --test-threads=1 to avoid interference with other tests that may depend on environment variables.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/layer_unit_tests.rs` around lines 73 - 81, The test
app_config_from_env_fails_when_master_api_key_missing manipulates process-wide
env via std::env::remove_var which can flake in parallel; make the test isolated
by either annotating the test with #[serial] (from the serial_test crate) or by
saving the original MASTER_API_KEY, removing it, calling AppConfig::from_env(),
asserting Err, and then restoring the original env value in a finally/teardown
fashion so AppConfig::from_env() behavior is validated without affecting other
tests.
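The save-remove-restore teardown described above is commonly written as an RAII guard, so the original value comes back even if the test panics. `EnvVarGuard` is a hypothetical helper, not part of this PR; the `unsafe` blocks are needed because newer Rust editions mark process-wide env mutation as unsafe for thread-safety reasons (on older editions they are merely redundant):

```rust
use std::env;

/// Removes an env var for the duration of a scope and restores the
/// previous value (or absence) on drop, even on panic.
struct EnvVarGuard {
    key: &'static str,
    original: Option<String>,
}

impl EnvVarGuard {
    fn remove(key: &'static str) -> Self {
        let original = env::var(key).ok();
        unsafe { env::remove_var(key) };
        Self { key, original }
    }
}

impl Drop for EnvVarGuard {
    fn drop(&mut self) {
        match self.original.take() {
            Some(v) => unsafe { env::set_var(self.key, v) },
            None => unsafe { env::remove_var(self.key) },
        }
    }
}
```

Even with the guard, `#[serial]` (or `--test-threads=1`) is still the safer choice, since other tests running concurrently can observe the var's temporary absence.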
src/config.rs (1)

6-12: Consider removing or using the InvalidValue variant.

ConfigError::InvalidValue is defined but not currently used in from_env(). If it's intended for future use, consider adding a TODO comment. Otherwise, it could be removed to avoid dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 6 - 12, ConfigError defines an unused variant
InvalidValue; either remove that variant from the enum or mark its intended
future use: if you plan to validate env values in from_env(), update from_env()
to return ConfigError::InvalidValue (with key and details) where appropriate,
otherwise delete InvalidValue or add a TODO comment above the variant
referencing from_env() to justify keeping it. Reference: enum ConfigError and
function from_env().
src/models/github_dto.rs (1)

1-47: Add Rustdoc comments to public DTO structs.

Per coding guidelines, all public-facing methods and services should have /// (Rustdoc) comments. Consider adding brief documentation explaining the purpose of these DTOs and their relationship to the GitHub GraphQL API.

📝 Example documentation
 use serde::Deserialize;
 
+/// Root response wrapper for GitHub GraphQL API queries.
 #[derive(Debug, Deserialize)]
 pub struct GithubGqlResponse {
     pub data: Option<GithubGraphQLUser>,
     pub errors: Option<Vec<serde_json::Value>>,
 }
 
+/// Wrapper for the `user` field in GraphQL response data.
 #[derive(Debug, Deserialize)]
 #[serde(rename_all = "camelCase")]
 pub struct GithubGraphQLUser {

As per coding guidelines: "All public-facing methods and services must have /// (Rustdoc) comments explaining intent and behavior."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/github_dto.rs` around lines 1 - 47, Add Rustdoc comments (///) to
each public DTO struct and key fields to explain their purpose and relation to
the GitHub GraphQL API: document GithubGqlResponse as the top-level GraphQL
response (including data and errors), document GithubGraphQLUser as the wrapper
for the user node, GithubUserNode as containing the contributions collection,
GithubContributionsCollection as the contributions container,
GithubContributionCalendar as the calendar with total_contributions and weeks,
GithubWeek as a weekly bucket of contribution_days, and GithubContributionDay as
a single-day record (date, weekday, contribution_count, contribution_level);
keep each comment one or two sentences describing intent and any JSON mapping
(camelCase) so readers know how these DTOs map to the GitHub GraphQL schema.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/error.rs`:
- Around line 81-85: Self::UpstreamError currently returns the raw upstream
message to clients (StatusCode::BAD_GATEWAY, msg.clone(),
Some("UPSTREAM_ERROR")), which can leak internals; change the response payload
to a sanitized client-facing string (e.g. "Upstream service error" or similar)
while preserving the original msg for internal logs/telemetry; update the branch
handling Self::UpstreamError so the tuple uses the sanitized message for the
second element and emit the original msg to the logger/tracing (e.g., via
error/debug) before returning, keeping the error code/tag (UPSTREAM_ERROR)
unchanged.
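The sanitization asked for here amounts to splitting the logged message from the client-facing one. A minimal shape, with the caveat that the PR's real AppError has many more variants and uses tracing rather than eprintln!:

```rust
/// Hypothetical slice of the error type; the actual AppError is larger.
enum AppError {
    UpstreamError(String),
}

/// Maps an error to (status, client message, error code). The raw
/// upstream detail is logged server-side, never returned to the caller.
fn to_response_parts(err: &AppError) -> (u16, &'static str, &'static str) {
    match err {
        AppError::UpstreamError(msg) => {
            eprintln!("upstream failure: {msg}"); // internal detail stays server-side
            (502, "Upstream service error", "UPSTREAM_ERROR")
        }
    }
}
```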

In `@src/services/contributions.rs`:
- Around line 106-108: Replace the suboptimal duration literal in the background
task spawn: in the tokio::spawn closure where you create let mut interval =
tokio::time::interval(Duration::from_secs(600)); use Duration::from_mins(10)
instead (i.e., tokio::time::interval(Duration::from_mins(10))). Update the use
site of Duration in that block so Clippy's clippy::duration-suboptimal-units
lint is satisfied while preserving the same interval semantics.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8c314c02-086c-403f-b14b-c28a3a7e082b

📥 Commits

Reviewing files that changed from the base of the PR and between 1bd33a7 and 1b3234f.

📒 Files selected for processing (32)
  • .env.example
  • GEMINI.md
  • src/app.rs
  • src/config.rs
  • src/controllers/api/v1/contributions_controller.rs
  • src/controllers/api/v1/ytdlp_controller.rs
  • src/controllers/health_controller.rs
  • src/controllers/validation_controller.rs
  • src/error.rs
  • src/extractors/validated_json.rs
  • src/middleware/api_key.rs
  • src/middleware/auth.rs
  • src/middleware/captcha.rs
  • src/middleware/cors.rs
  • src/middleware/mod.rs
  • src/middleware/rate_limit.rs
  • src/models/contributions.rs
  • src/models/github_dto.rs
  • src/models/health.rs
  • src/models/mod.rs
  • src/models/validation.rs
  • src/models/ytdlp.rs
  • src/models/ytdlp_dto.rs
  • src/routes/api/v1/contributions_routes.rs
  • src/routes/api/v1/mod.rs
  • src/services/contributions.rs
  • src/services/ytdlp/mod.rs
  • tests/api/common.rs
  • tests/api/contributions_tests.rs
  • tests/api/validation_tests.rs
  • tests/api/ytdlp_tests.rs
  • tests/layer_unit_tests.rs

Comment thread src/error.rs Outdated
Comment thread src/services/contributions.rs

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/config.rs (2)

39-56: ⚠️ Potential issue | 🟠 Major

Redact secrets from AppConfig's debug output.

#[derive(Debug)] will print master_api_key, captcha_secret_key, and github_pat if AppConfig is ever logged. Making the field private doesn't close that leak path.

🔒 Suggested fix
-#[derive(Debug, Clone)]
+#[derive(Clone)]
 pub struct AppConfig {
     // ...fields unchanged...
 }

impl std::fmt::Debug for AppConfig {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("AppConfig")
            .field("name", &self.name)
            .field("env", &self.env)
            .field("host", &self.host)
            .field("port", &self.port)
            .field("allowed_origins", &self.allowed_origins)
            .field("download_dir", &self.download_dir)
            .field("ytdlp_path", &self.ytdlp_path)
            .field("ytdlp_external_downloader", &self.ytdlp_external_downloader)
            .field(
                "ytdlp_external_downloader_args",
                &self.ytdlp_external_downloader_args,
            )
            .field("max_concurrent_downloads", &self.max_concurrent_downloads)
            .field("captcha_secret_key", &self.captcha_secret_key.as_ref().map(|_| "***"))
            .field("master_api_key", &"***")
            .field("github_pat", &self.github_pat.as_ref().map(|_| "***"))
            .field("github_username", &self.github_username)
            .field("github_graphql_url", &self.github_graphql_url)
            .finish()
    }
}
As per coding guidelines `src/config.rs`: Config fields should be private where appropriate, using constructors and public methods (e.g., `check_api_key`) to enforce security policies.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 39 - 56, AppConfig currently derives Debug and
exposes secrets (captcha_secret_key, github_pat and master_api_key) in logs;
make secret fields private (change pub captcha_secret_key: Option<String> and
pub github_pat: Option<String> to non-pub), implement a custom std::fmt::Debug
for AppConfig that redacts those fields (e.g., show "***" or map to
Option::map(|_| "***") for optionals) and keep non-secret fields printed as
before, and update the constructor/new functions and any callers to use provided
public accessors (e.g., check_api_key) or explicit getters instead of direct
field access to preserve encapsulation.

14-28: ⚠️ Potential issue | 🟠 Major

Malformed env vars still fall back to defaults.

env_or() logs and substitutes the default value, so from_env() never returns ConfigError::InvalidValue for bad APP_PORT, MAX_CONCURRENT_DOWNLOADS, etc. That masks broken deploys and undercuts the hardening this refactor is aiming for.

🛠️ Suggested shape
-fn env_or<T: std::str::FromStr>(key: &str, default: &str) -> T {
-    env::var(key)
-        .unwrap_or_else(|_| default.to_string())
-        .parse::<T>()
-        .unwrap_or_else(|_| {
-            tracing::error!("{} must be a valid {}", key, std::any::type_name::<T>());
-            #[allow(clippy::expect_used)]
-            default
-                .to_string()
-                .parse::<T>()
-                .ok()
-                .expect("default must be valid")
-        })
+fn env_or<T: std::str::FromStr>(key: &str, default: &str) -> Result<T, ConfigError> {
+    let raw = env::var(key).unwrap_or_else(|_| default.to_string());
+    raw.parse::<T>().map_err(|_| ConfigError::InvalidValue {
+        key: key.to_string(),
+        details: format!("expected {}", std::any::type_name::<T>()),
+    })
 }
-            env_or("APP_PORT", "8080"),
+            env_or("APP_PORT", "8080")?,
As per coding guidelines `src/config.rs`: Validation logic should be centralized in `AppConfig`, and `AppConfig::from_env()` must return a `Result` rather than panicking.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` around lines 14 - 28, env_or currently swallows parse errors
and silently falls back to defaults, so AppConfig::from_env() never returns
ConfigError::InvalidValue; change env_or<T> to return Result<T, ConfigError> (or
Result<T, String>) and propagate parse errors up, then update
AppConfig::from_env() to return Result<Self, ConfigError> and call the new
env_or for each field, mapping parse/env errors to ConfigError::InvalidValue
(including the key and error) instead of substituting defaults; ensure any
places that constructed AppConfig now handle the Result and that defaults are
applied only where intended (e.g., use Option or explicit default-merge logic
inside AppConfig after validation).
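For reference, a std-only sketch of the shape this finding suggests; `ConfigError` and its variants mirror the names used in the comment above, but the repo's actual definitions may differ. Splitting the parsing step from the env lookup also makes the logic testable without mutating process-global state:

```rust
use std::env;
use std::fmt;

// Illustrative error type; variant names follow the review's ConfigError,
// the repo's real enum may carry different fields.
#[derive(Debug)]
enum ConfigError {
    MissingKey { key: String },
    InvalidValue { key: String, details: String },
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey { key } => write!(f, "missing env var {key}"),
            ConfigError::InvalidValue { key, details } => {
                write!(f, "invalid value for {key}: {details}")
            }
        }
    }
}

/// Parses a raw env value (`None` = unset). The default applies only when the
/// var is unset; a present-but-malformed value becomes a typed error instead
/// of a silent fallback.
fn parse_env_value<T: std::str::FromStr>(
    key: &str,
    raw: Option<String>,
    default: &str,
) -> Result<T, ConfigError> {
    let raw = raw.unwrap_or_else(|| default.to_string());
    raw.parse::<T>().map_err(|_| ConfigError::InvalidValue {
        key: key.to_string(),
        details: format!("expected {}", std::any::type_name::<T>()),
    })
}

/// Thin wrapper that reads from the real process environment.
fn env_or<T: std::str::FromStr>(key: &str, default: &str) -> Result<T, ConfigError> {
    parse_env_value(key, env::var(key).ok(), default)
}

fn main() {
    // Unset var: the default applies.
    let port: u16 = parse_env_value("APP_PORT", None, "8080").unwrap();
    assert_eq!(port, 8080);

    // Present but malformed: a typed error, never a default substitution.
    let bad: Result<u16, ConfigError> =
        parse_env_value("APP_PORT", Some("not-a-number".into()), "8080");
    assert!(matches!(bad, Err(ConfigError::InvalidValue { .. })));

    // The env-backed wrapper behaves the same way.
    let _ = env_or::<u16>("APP_PORT", "8080");
}
```

With this split, `from_env()` can propagate `?` on every field and the unit tests exercise `parse_env_value` directly, avoiding the env-mutation hazards flagged later in this review.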
src/services/contributions.rs (1)

161-193: ⚠️ Potential issue | 🟠 Major

Keep cache_ttl_seconds synchronized with the actual entry expiry.

The hot-cache path and stale-cache fallback clone stored responses without recalculating meta.cache_ttl_seconds, while the fresh path still seeds the struct with a hardcoded 10800. Near midnight—or when serving stale-on-error—the API can report a TTL that's already shorter, or already expired.

🛠️ Suggested fix
         if let Some(entry) = self.cache.get(&cache_key) {
             let (cached_resp, expires_at) = entry.value();
             if *expires_at > now {
                 let mut resp = cached_resp.clone();
                 resp.meta.cached = true;
+                resp.meta.cache_ttl_seconds =
+                    u32::try_from(expires_at.saturating_sub(now)).unwrap_or(u32::MAX);
                 return Ok(resp);
             }
         }
@@
-            Ok(new_resp) => {
+            Ok(mut new_resp) => {
                 // Determine TTL: 3 hours OR seconds until UTC midnight, whichever is sooner.
                 let seconds_since_midnight = now % 86400;
                 let seconds_until_midnight = 86400 - seconds_since_midnight;
                 let ttl = CACHE_TTL_SECONDS.min(seconds_until_midnight);
+                new_resp.meta.cache_ttl_seconds =
+                    u32::try_from(ttl).unwrap_or(u32::MAX);
 
                 let expires_at = now + ttl;
@@
             Err(e) => {
                 // Fallback to stale cache on upstream failure
                 if let Some(entry) = self.cache.get(&cache_key) {
                     let mut resp = entry.value().0.clone();
                     resp.meta.cached = true;
+                    resp.meta.cache_ttl_seconds = 0;
                     return Ok(resp);
                 }
                 Err(e)
             }

Also applies to: 367-370

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/contributions.rs` around lines 161 - 193, When returning cached
or stale responses in the hot-path and the stale-on-error fallback, update
resp.meta.cache_ttl_seconds to reflect the remaining seconds until the stored
entry actually expires rather than leaving the hardcoded 10800; compute
remaining_ttl = entry.expires_at.saturating_sub(current_now) (or clamp to 0) and
set resp.meta.cache_ttl_seconds = remaining_ttl as an i64 (mirroring how TTL is
computed in the fresh-path using CACHE_TTL_SECONDS and seconds_until_midnight)
before returning resp; apply the same change for the similar logic around lines
367-370 so any cloned response always reports the true remaining TTL stored with
cache_key and the value inserted by self.cache.insert after fetch_and_process.
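The TTL arithmetic in that finding can be sketched in isolation with plain integer seconds; `remaining_ttl` and `fresh_ttl` are illustrative helper names, not functions in the repo:

```rust
/// Remaining TTL for a cached entry, clamped to zero once expired
/// (saturating_sub avoids underflow when `now` has passed `expires_at`).
fn remaining_ttl(expires_at: u64, now: u64) -> u64 {
    expires_at.saturating_sub(now)
}

/// Fresh-path TTL: the configured cap (e.g. 3 h = 10_800 s) or the seconds
/// until UTC midnight, whichever comes first.
fn fresh_ttl(now: u64, cache_ttl_seconds: u64) -> u64 {
    let seconds_since_midnight = now % 86_400;
    cache_ttl_seconds.min(86_400 - seconds_since_midnight)
}

fn main() {
    // Entry stored with expiry t=700, read again at t=400: 300 s remain.
    assert_eq!(remaining_ttl(700, 400), 300);
    // Already expired: clamps to 0 instead of underflowing.
    assert_eq!(remaining_ttl(700, 900), 0);
    // One hour before UTC midnight, the 3-hour cap shrinks to 3600 s.
    assert_eq!(fresh_ttl(86_400 - 3_600, 10_800), 3_600);
}
```

Recomputing `remaining_ttl` on every cache hit (and pinning it to 0 on the stale-on-error path) keeps the advertised `cache_ttl_seconds` consistent with the entry's real expiry.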
src/services/ytdlp/mod.rs (1)

61-62: ⚠️ Potential issue | 🔴 Critical

Fix: Duration::from_mins() does not exist in stable Rust.

Line 62 calls tokio::time::Duration::from_mins(10), but from_mins() is gated behind the unstable duration_constructors_lite feature and does not compile on stable Rust. Replace it with Duration::from_secs(600):


Suggested fix
-            let mut interval = tokio::time::interval(tokio::time::Duration::from_mins(10));
+            let mut interval = tokio::time::interval(tokio::time::Duration::from_secs(600));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/ytdlp/mod.rs` around lines 61 - 62, The code uses the unstable
Duration::from_mins in the tokio::spawn block when creating the interval;
replace the call to tokio::time::Duration::from_mins(10) with
tokio::time::Duration::from_secs(600) (i.e., change the Duration constructor
used in the tokio::time::interval creation near tokio::spawn and the variable
interval) so the code compiles on stable Rust.
♻️ Duplicate comments (1)
src/error.rs (1)

81-85: ⚠️ Potential issue | 🟠 Major

Do not return raw upstream error text to clients.

This currently exposes internal upstream details in API responses. Keep the original error in logs, but return a sanitized client message.

Suggested patch
-            Self::UpstreamError(msg) => (
-                StatusCode::BAD_GATEWAY,
-                msg.clone(),
-                Some("UPSTREAM_ERROR".to_string()),
-            ),
+            Self::UpstreamError(msg) => {
+                error!("Upstream error occurred: {msg}");
+                (
+                    StatusCode::BAD_GATEWAY,
+                    "Upstream service temporarily unavailable".to_string(),
+                    Some("UPSTREAM_ERROR".to_string()),
+                )
+            },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/error.rs` around lines 81 - 85, The match arm for Self::UpstreamError in
src/error.rs currently returns the raw upstream message to clients
(StatusCode::BAD_GATEWAY, msg.clone(), Some("UPSTREAM_ERROR")), so change it to
return a sanitized client-facing message (e.g., "An upstream service error
occurred") while keeping the error kind "UPSTREAM_ERROR"; log the original msg
to your application logs instead of sending it to the client (use your existing
logging facility where this mapping occurs or just add a logging call before the
tuple is returned), so preserve the original msg for diagnostics but never
include it in the tuple returned to clients.
🧹 Nitpick comments (9)
src/routes/api/v1/contributions_routes.rs (1)

5-5: Add Rustdoc to the public router function.

Please document this public API entrypoint with /// so intent/usage stays explicit.

Suggested patch
+/// Builds API v1 contributions routes.
 pub fn router(_state: AppState) -> Router<AppState> {
     Router::new().route("/", get(get_contributions))
 }
As per coding guidelines "`**/*.rs`: All public-facing methods and services must have `///` (Rustdoc) comments explaining intent and behavior".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/api/v1/contributions_routes.rs` at line 5, The public function
router lacks Rustdoc; add a triple-slash doc comment above pub fn router(_state:
AppState) -> Router<AppState> that briefly describes the function's purpose
(registering contribution-related routes), explains the parameter (AppState -
shared application state/context) and what is returned (Router<AppState>
containing the configured routes), and any noteworthy behavior or expectations
(e.g., that it wires handlers/middleware but does not start the server).
src/extractors/validated_json.rs (1)

45-46: Preserve observability on serialization fallback.

When serialization fails, log the error before returning the generic fallback message so debugging signal is not lost.

Suggested patch
-            let msg = serde_json::to_string(&error_map)
-                .unwrap_or_else(|_| "Validation failed".to_string());
+            let msg = serde_json::to_string(&error_map).unwrap_or_else(|err| {
+                tracing::error!("Failed to serialize validation errors: {err}");
+                "Validation failed".to_string()
+            });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/extractors/validated_json.rs` around lines 45 - 46, The
serde_json::to_string(&error_map).unwrap_or_else(...) currently discards the
serialization error; change the unwrap_or_else closure to accept the error, log
it (e.g., via tracing::error! or the project's logger) with context like "Failed
to serialize error_map" including the error details, and then return the
existing "Validation failed" fallback string so observability is preserved for
msg creation in validated_json.rs.
tests/api/common.rs (2)

40-56: Same positional parameter concern as other test file.

This constructor call mirrors the one in layer_unit_tests.rs. The same suggestion applies: consider a shared test helper or builder to reduce duplication and parameter ordering mistakes across test files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/common.rs` around lines 40 - 56, The repeated long positional call
to AppConfig::new in tests (e.g., the call in tests/api/common.rs) is fragile
and duplicated; replace it with a shared test helper or builder (e.g., create a
helper function like test_app_config() or a TestAppConfigBuilder used by tests
in tests/api/common.rs and layer_unit_tests.rs) that constructs and returns
Arc<AppConfig> with sensible defaults and optional overrides, and update tests
to call that helper instead of duplicating the positional parameters to reduce
ordering errors and duplication.

23-23: Consider importing the constant from middleware for consistency.

The local API_KEY_HEADER constant duplicates the value of X_API_KEY from src/middleware/mod.rs. While correct, importing and converting the HeaderName constant could reduce drift risk if the header name ever changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/api/common.rs` at line 23, The tests define a duplicated API_KEY_HEADER
constant; instead import the canonical HeaderName constant X_API_KEY and use its
string form where tests need the header name. Remove or stop using the local
API_KEY_HEADER and replace usages with the imported X_API_KEY converted to a
string (e.g., via as_str()/to_string() as appropriate) so the tests always
reflect the single source of truth (reference symbols: API_KEY_HEADER and
X_API_KEY).
src/models/contributions.rs (1)

4-13: Missing Rustdoc comments on public structs.

Per coding guidelines, all public-facing methods and services must have /// (Rustdoc) comments explaining intent and behavior. These model structs are serialized directly to API responses and would benefit from documentation.

Also applies to: 15-20, 22-30, 32-39, 41-46, 48-63, 65-75

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/contributions.rs` around lines 4 - 13, Add Rustdoc comments (///)
to all public model structs that are serialized to API responses so their
purpose and serialized behavior are documented; for example add a brief doc
above ContributionsResponse explaining it represents the full response returned
for a user's contributions and describe any non-obvious fields, and do the same
for ContributionRange, ContributionSummary, ContributionLegend,
ContributionMonth, ContributionCell, ContributionMeta and any other public
structs in this file—ensure each public struct has a one- or two-sentence
summary and note any important field semantics (e.g., units, date formats,
optionality).
tests/layer_unit_tests.rs (1)

10-26: Constructor with many positional parameters is error-prone.

The AppConfig::new() call has 16 unlabeled positional arguments, making it easy to misorder them. Consider using a builder pattern or a test-specific factory function that sets sensible defaults, reducing the surface for mistakes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/layer_unit_tests.rs` around lines 10 - 26, The test calls
AppConfig::new with 16 unlabeled positional args which is fragile; replace this
with a clearer construction using either a builder (e.g., implement
AppConfig::builder() returning a Builder with chainable setters like env(),
host(), port(), downloads_dir(), tool_name(), retries(), master_key(),
github_graphql_url()) or add a test-only factory such as
TestAppConfig::default_with_env(env) that fills sensible defaults and exposes
setters for the few overrides; update the test in tests/layer_unit_tests.rs to
use the builder or factory instead of AppConfig::new so parameters are named and
order-sensitive mistakes are avoided (reference symbols: AppConfig::new,
AppConfig::builder, TestAppConfig::default_with_env).
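A minimal sketch of what such a test-side builder could look like, using illustrative field names rather than the repo's real `AppConfig` fields:

```rust
// Hypothetical test-only config; field names are placeholders, not the
// actual 16-parameter AppConfig from the repo.
#[derive(Debug, Clone)]
struct TestAppConfig {
    env: String,
    host: String,
    port: u16,
    max_concurrent_downloads: usize,
    master_api_key: String,
}

struct TestAppConfigBuilder {
    cfg: TestAppConfig,
}

impl TestAppConfig {
    /// Starts a builder pre-populated with sensible test defaults.
    fn builder() -> TestAppConfigBuilder {
        TestAppConfigBuilder {
            cfg: TestAppConfig {
                env: "test".into(),
                host: "127.0.0.1".into(),
                port: 8080,
                max_concurrent_downloads: 2,
                master_api_key: "test-key".into(),
            },
        }
    }
}

impl TestAppConfigBuilder {
    fn port(mut self, port: u16) -> Self {
        self.cfg.port = port;
        self
    }
    fn master_api_key(mut self, key: &str) -> Self {
        self.cfg.master_api_key = key.to_string();
        self
    }
    fn build(self) -> TestAppConfig {
        self.cfg
    }
}

fn main() {
    // Only the overrides are named; everything else keeps its default,
    // so there is no positional-argument ordering to get wrong.
    let cfg = TestAppConfig::builder().port(9090).build();
    assert_eq!(cfg.port, 9090);
    assert_eq!(cfg.env, "test");
}
```

Each test then states only the fields it cares about, which is the ordering-mistake surface the finding is trying to shrink.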
src/middleware/mod.rs (1)

14-22: Remove misleading recommendation about subtle crate; current implementation is appropriate for API key comparison.

The early return on length does leak whether lengths match, but this is an acceptable trade-off for API keys of known/fixed length. The subtle crate's ConstantTimeEq trait does not provide direct constant-time string comparison—it only works on primitives and byte slices, requiring .as_bytes() conversion anyway. The current implementation is well-documented and sufficient for this use case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/middleware/mod.rs` around lines 14 - 22, Remove the misleading suggestion
to use the `subtle` crate and keep the current implementation of
constant_time_eq as-is; update or remove any comments/docs that recommend
`subtle` and instead document on the constant_time_eq function that the early
length check is an intentional and acceptable trade-off for fixed-length API
keys and that bytes() zipper fold provides the needed constant-time comparison
for equal-length inputs.
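The comparison being defended here follows a standard shape; a self-contained sketch (not the repo's exact `check_api_key` implementation) of an early length check plus a no-early-exit byte fold:

```rust
/// Compares two byte strings without short-circuiting on the first mismatch.
/// The early length check is an intentional trade-off for fixed-length API
/// keys: it leaks only whether the lengths match, never any byte contents.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the loop always runs to the
    // end, so timing does not depend on where a mismatch occurs.
    a.iter().zip(b.iter()).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

fn main() {
    assert!(constant_time_eq(b"secret-key", b"secret-key"));
    assert!(!constant_time_eq(b"secret-key", b"secret-kez"));
    assert!(!constant_time_eq(b"short", b"longer-key"));
}
```

The accumulator stays zero only when every byte pair matches, which is the property the review is calling sufficient for this use case.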
src/models/github_dto.rs (1)

4-7: Type the GraphQL error and enum fields in the DTO layer.

errors: Option<Vec<serde_json::Value>> and contribution_level: String let upstream schema drift leak past the anti-corruption layer. A small error DTO plus a GithubContributionLevel enum would make changes fail at deserialization time instead of later in service logic.

As per coding guidelines **/*_dto.rs: External DTOs must use *_dto.rs naming and strictly mirror external API response schemas (e.g., camelCase for GitHub/YouTube).

Also applies to: 42-46

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/github_dto.rs` around lines 4 - 7, Replace the loose types in the
external DTOs: define a small error DTO (e.g., GithubErrorDto) and use
Option<Vec<GithubErrorDto>> for the errors field in GithubGqlResponse (replace
serde_json::Value), and introduce a GithubContributionLevel enum and use it for
any contribution_level fields (instead of String) in the DTOs (e.g., in
GithubGraphQLUser/GithubContribution DTOs). Ensure the new DTO and enum are
named in this *_dto.rs file, preserve the external schema's exact field
names/casing for serde deserialization, and update serde derives/attributes so
deserialization fails on unexpected enum variants or error shapes rather than
letting schema drift leak into service logic.
src/models/ytdlp_dto.rs (1)

17-20: Prefer a flat enqueue response shape.

Wrapping the job payload under job makes this endpoint inconsistent with the flat get_download_job response and adds a wrapper the repo usually avoids unless it's contractually required.

As per coding guidelines src/{controllers,models}/**/*.rs: Responses should be flat and idiomatic where possible, avoiding unnecessary wrapper fields like 'job' or 'data' unless required by specific API design.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/ytdlp_dto.rs` around lines 17 - 20, The response struct
YtdlpEnqueueResponse currently wraps the job under job: YtdlpJobResponse which
creates an unnecessary nested payload; make the response flat by either
replacing the job field with #[serde(flatten)] pub job: YtdlpJobResponse so its
fields serialize at the top level, or expand the fields of YtdlpJobResponse
directly into YtdlpEnqueueResponse and update any constructors/creators that
build YtdlpEnqueueResponse (wherever you construct it) to populate the flattened
fields instead of nesting under job.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app.rs`:
- Around line 50-56: Replace the hard process exit in the AppConfig load path by
propagating the error: change the containing function (e.g., run) to return a
Result (for example Result<(), anyhow::Error> or appropriate error type), remove
std::process::exit(1) in the Err branch of AppConfig::from_env(), and return
Err(err) (or map the error into the chosen Result type) instead; update the call
sites of run() to handle the Result. Ensure the code still constructs
Arc::new(cfg) on Ok(cfg) and only uses the Err branch to return the error rather
than exiting the process.

In `@src/controllers/api/v1/ytdlp_controller.rs`:
- Around line 117-123: The current code uses
serde_json::to_string(&job_resp).unwrap_or_default(), which hides serialization
errors by sending an empty progress payload; change this so you call
serde_json::to_string(&job_resp) and match the Result: on Ok(snapshot_str)
compare and update last_snapshot (clone_from) and send the progress event via
tx.send(Ok(Event::default().event("progress").data(snapshot_str))); on Err(err)
do not send an empty progress event—either skip this tick or send an explicit
error event (e.g.,
tx.send(Ok(Event::default().event("error").data(err.to_string())))) so consumers
see the failure; ensure last_snapshot is only updated on successful
serialization.

In `@src/services/contributions.rs`:
- Around line 105-107: The background task uses Duration::from_mins(10) (inside
the tokio::spawn block) which requires Rust 1.91; either add rust-version =
"1.91" to the [package] section of Cargo.toml to document this requirement, or
replace Duration::from_mins(10) with Duration::from_secs(600) to retain
compatibility with older compilers; update the code where
Duration::from_mins(10) is used (the tokio::spawn interval creation) or update
Cargo.toml accordingly and run a quick cargo build to confirm the chosen change
compiles.

In `@tests/api/validation_tests.rs`:
- Line 63: Replace the brittle substring assertion on
body["message"].as_str().unwrap() with parsing that string as JSON (use
serde_json::from_str on body["message"].as_str().unwrap()), then assert the
parsed value is an object and contains the expected key(s) (e.g.,
parsed.get("messages").is_some() and that parsed["messages"].is_array() with the
expected shape/length if applicable); update the assertion in
tests/api/validation_tests.rs to validate structure/keys rather than using
contains("messages").

---

Outside diff comments:
In `@src/config.rs`:
- Around line 39-56: AppConfig currently derives Debug and exposes secrets
(captcha_secret_key, github_pat and master_api_key) in logs; make secret fields
private (change pub captcha_secret_key: Option<String> and pub github_pat:
Option<String> to non-pub), implement a custom std::fmt::Debug for AppConfig
that redacts those fields (e.g., show "***" or map to Option::map(|_| "***") for
optionals) and keep non-secret fields printed as before, and update the
constructor/new functions and any callers to use provided public accessors
(e.g., check_api_key) or explicit getters instead of direct field access to
preserve encapsulation.
- Around line 14-28: env_or currently swallows parse errors and silently falls
back to defaults, so AppConfig::from_env() never returns
ConfigError::InvalidValue; change env_or<T> to return Result<T, ConfigError> (or
Result<T, String>) and propagate parse errors up, then update
AppConfig::from_env() to return Result<Self, ConfigError> and call the new
env_or for each field, mapping parse/env errors to ConfigError::InvalidValue
(including the key and error) instead of substituting defaults; ensure any
places that constructed AppConfig now handle the Result and that defaults are
applied only where intended (e.g., use Option or explicit default-merge logic
inside AppConfig after validation).

In `@src/services/contributions.rs`:
- Around line 161-193: When returning cached or stale responses in the hot-path
and the stale-on-error fallback, update resp.meta.cache_ttl_seconds to reflect
the remaining seconds until the stored entry actually expires rather than
leaving the hardcoded 10800; compute remaining_ttl =
entry.expires_at.saturating_sub(current_now) (or clamp to 0) and set
resp.meta.cache_ttl_seconds = remaining_ttl as an i64 (mirroring how TTL is
computed in the fresh-path using CACHE_TTL_SECONDS and seconds_until_midnight)
before returning resp; apply the same change for the similar logic around lines
367-370 so any cloned response always reports the true remaining TTL stored with
cache_key and the value inserted by self.cache.insert after fetch_and_process.

In `@src/services/ytdlp/mod.rs`:
- Around line 61-62: The code uses the unstable Duration::from_mins in the
tokio::spawn block when creating the interval; replace the call to
tokio::time::Duration::from_mins(10) with tokio::time::Duration::from_secs(600)
(i.e., change the Duration constructor used in the tokio::time::interval
creation near tokio::spawn and the variable interval) so the code compiles on
stable Rust.

---

Duplicate comments:
In `@src/error.rs`:
- Around line 81-85: The match arm for Self::UpstreamError in src/error.rs
currently returns the raw upstream message to clients (StatusCode::BAD_GATEWAY,
msg.clone(), Some("UPSTREAM_ERROR")), so change it to return a sanitized
client-facing message (e.g., "An upstream service error occurred") while keeping
the error kind "UPSTREAM_ERROR"; log the original msg to your application logs
instead of sending it to the client (use your existing logging facility where
this mapping occurs or just add a logging call before the tuple is returned), so
preserve the original msg for diagnostics but never include it in the tuple
returned to clients.

---

Nitpick comments:
In `@src/extractors/validated_json.rs`:
- Around line 45-46: The serde_json::to_string(&error_map).unwrap_or_else(...)
currently discards the serialization error; change the unwrap_or_else closure to
accept the error, log it (e.g., via tracing::error! or the project's logger)
with context like "Failed to serialize error_map" including the error details,
and then return the existing "Validation failed" fallback string so
observability is preserved for msg creation in validated_json.rs.

In `@src/middleware/mod.rs`:
- Around line 14-22: Remove the misleading suggestion to use the `subtle` crate
and keep the current implementation of constant_time_eq as-is; update or remove
any comments/docs that recommend `subtle` and instead document on the
constant_time_eq function that the early length check is an intentional and
acceptable trade-off for fixed-length API keys and that bytes() zipper fold
provides the needed constant-time comparison for equal-length inputs.

In `@src/models/contributions.rs`:
- Around line 4-13: Add Rustdoc comments (///) to all public model structs that
are serialized to API responses so their purpose and serialized behavior are
documented; for example add a brief doc above ContributionsResponse explaining
it represents the full response returned for a user's contributions and describe
any non-obvious fields, and do the same for ContributionRange,
ContributionSummary, ContributionLegend, ContributionMonth, ContributionCell,
ContributionMeta and any other public structs in this file—ensure each public
struct has a one- or two-sentence summary and note any important field semantics
(e.g., units, date formats, optionality).

In `@src/models/github_dto.rs`:
- Around line 4-7: Replace the loose types in the external DTOs: define a small
error DTO (e.g., GithubErrorDto) and use Option<Vec<GithubErrorDto>> for the
errors field in GithubGqlResponse (replace serde_json::Value), and introduce a
GithubContributionLevel enum and use it for any contribution_level fields
(instead of String) in the DTOs (e.g., in GithubGraphQLUser/GithubContribution
DTOs). Ensure the new DTO and enum are named in this *_dto.rs file, preserve the
external schema's exact field names/casing for serde deserialization, and update
serde derives/attributes so deserialization fails on unexpected enum variants or
error shapes rather than letting schema drift leak into service logic.

In `@src/models/ytdlp_dto.rs`:
- Around line 17-20: The response struct YtdlpEnqueueResponse currently wraps
the job under job: YtdlpJobResponse which creates an unnecessary nested payload;
make the response flat by either replacing the job field with #[serde(flatten)]
pub job: YtdlpJobResponse so its fields serialize at the top level, or expand
the fields of YtdlpJobResponse directly into YtdlpEnqueueResponse and update any
constructors/creators that build YtdlpEnqueueResponse (wherever you construct
it) to populate the flattened fields instead of nesting under job.

In `@src/routes/api/v1/contributions_routes.rs`:
- Line 5: The public function router lacks Rustdoc; add a triple-slash doc
comment above pub fn router(_state: AppState) -> Router<AppState> that briefly
describes the function's purpose (registering contribution-related routes),
explains the parameter (AppState - shared application state/context) and what is
returned (Router<AppState> containing the configured routes), and any noteworthy
behavior or expectations (e.g., that it wires handlers/middleware but does not
start the server).

In `@tests/api/common.rs`:
- Around line 40-56: The repeated long positional call to AppConfig::new in
tests (e.g., the call in tests/api/common.rs) is fragile and duplicated; replace
it with a shared test helper or builder (e.g., create a helper function like
test_app_config() or a TestAppConfigBuilder used by tests in tests/api/common.rs
and layer_unit_tests.rs) that constructs and returns Arc<AppConfig> with
sensible defaults and optional overrides, and update tests to call that helper
instead of duplicating the positional parameters to reduce ordering errors and
duplication.
- Line 23: The tests define a duplicated API_KEY_HEADER constant; instead import
the canonical HeaderName constant X_API_KEY and use its string form where tests
need the header name. Remove or stop using the local API_KEY_HEADER and replace
usages with the imported X_API_KEY converted to a string (e.g., via
as_str()/to_string() as appropriate) so the tests always reflect the single
source of truth (reference symbols: API_KEY_HEADER and X_API_KEY).

In `@tests/layer_unit_tests.rs`:
- Around line 10-26: The test calls AppConfig::new with 16 unlabeled positional
args which is fragile; replace this with a clearer construction using either a
builder (e.g., implement AppConfig::builder() returning a Builder with chainable
setters like env(), host(), port(), downloads_dir(), tool_name(), retries(),
master_key(), github_graphql_url()) or add a test-only factory such as
TestAppConfig::default_with_env(env) that fills sensible defaults and exposes
setters for the few overrides; update the test in tests/layer_unit_tests.rs to
use the builder or factory instead of AppConfig::new so parameters are named and
order-sensitive mistakes are avoided (reference symbols: AppConfig::new,
AppConfig::builder, TestAppConfig::default_with_env).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 24bc5e1a-2e2e-41f3-a9c6-c9d90d11b557

📥 Commits

Reviewing files that changed from the base of the PR and between 1bd33a7 and 7a75f1d.

📒 Files selected for processing (32)
  • .env.example
  • GEMINI.md
  • src/app.rs
  • src/config.rs
  • src/controllers/api/v1/contributions_controller.rs
  • src/controllers/api/v1/ytdlp_controller.rs
  • src/controllers/health_controller.rs
  • src/controllers/validation_controller.rs
  • src/error.rs
  • src/extractors/validated_json.rs
  • src/middleware/api_key.rs
  • src/middleware/auth.rs
  • src/middleware/captcha.rs
  • src/middleware/cors.rs
  • src/middleware/mod.rs
  • src/middleware/rate_limit.rs
  • src/models/contributions.rs
  • src/models/github_dto.rs
  • src/models/health.rs
  • src/models/mod.rs
  • src/models/validation.rs
  • src/models/ytdlp.rs
  • src/models/ytdlp_dto.rs
  • src/routes/api/v1/contributions_routes.rs
  • src/routes/api/v1/mod.rs
  • src/services/contributions.rs
  • src/services/ytdlp/mod.rs
  • tests/api/common.rs
  • tests/api/contributions_tests.rs
  • tests/api/validation_tests.rs
  • tests/api/ytdlp_tests.rs
  • tests/layer_unit_tests.rs

Comment thread src/app.rs
Comment thread src/controllers/api/v1/ytdlp_controller.rs
Comment thread src/services/contributions.rs
Comment thread tests/api/validation_tests.rs Outdated
@nxdun
Owner Author

nxdun commented Apr 26, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 26, 2026

@nxdun Sure! I'll kick off a full review of PR #12 right away.

✅ Actions performed

Full review triggered.

coderabbitai[bot]

This comment was marked as outdated.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
src/app.rs (1)

50-56: ⚠️ Potential issue | 🟠 Major

Return the startup failure instead of exiting here.

Line 54 still hard-terminates from inside run(), which prevents the caller from deciding how to report or test startup errors. Returning a Result from run() would also let the bind failure path surface a real failure instead of only logging and returning.

As per coding guidelines: src/app.rs: Handle startup failures gracefully in app.rs rather than deep in application logic

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app.rs` around lines 50 - 56, The code in run() currently exits the
process on AppConfig::from_env failure (using std::process::exit) which prevents
callers from handling startup errors; change run() to return a Result and
propagate the configuration error instead of terminating: replace the match that
calls std::process::exit with returning Err(err) (or mapping the error into your
crate's startup error type) and adjust the function signature of run() to return
Result<..., E>, likewise update any other startup fail paths (e.g., bind
failures) to return errors so callers can handle/log/test them rather than the
application calling std::process::exit inside run().
.env.example (1)

4-48: ⚠️ Potential issue | 🟡 Minor

Reorder the added keys to satisfy dotenv-linter.

The new blocks still trigger UnorderedKey warnings throughout the file, so this change keeps .env.example noisy in CI and local lint runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.env.example around lines 4 - 48, The .env.example file is triggering
dotenv-linter UnorderedKey warnings; reorder the environment variable lines so
keys are in the expected order (alphabetical within each section) — specifically
place APP_NAME, APP_ENV, APP_HOST, APP_PORT, ALLOWED_ORIGINS,
CAPTCHA_SECRET_KEY, DOWNLOAD_DIR, GITHUB_PAT, GITHUB_USERNAME, MASTER_API_KEY,
MAX_CONCURRENT_DOWNLOADS, RUST_LOG, WARP_LICENSE_KEY, YTDLP_PATH,
YTDLP_EXTERNAL_DOWNLOADER, YTDLP_EXTERNAL_DOWNLOADER_ARGS and the Terraform
TF_VAR_* variables into proper sorted order (and sort the AWS_ and TF_VAR_
blocks alphabetically) so dotenv-linter no longer reports UnorderedKey for keys
like APP_NAME, ALLOWED_ORIGINS, RUST_LOG, YTDLP_PATH and TF_VAR_APP_ENV.
tests/layer_unit_tests.rs (1)

72-112: ⚠️ Potential issue | 🟠 Major

This env-mutation test is still unsafe under parallel execution.

Lines 85-86 and 98-100 mutate the real process environment, and the local mutex only protects code that also opts into ENV_LOCK. Other tests can still call AppConfig::from_env() or std::env::var() concurrently, which is exactly why std::env::set_var and std::env::remove_var are marked unsafe in the Rust 2024 edition. Please move this check back into a subprocess-based test, or refactor config loading so the test can inject an env source without touching process-global state.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/layer_unit_tests.rs` around lines 72 - 112, The test
app_config_from_env_fails_when_master_api_key_missing mutates process-global env
via EnvGuard/ENV_LOCK and is unsafe under parallel tests; change it to run
AppConfig::from_env inside a subprocess instead of changing the parent process
env. Replace the EnvGuard usage in the test with a std::process::Command that
launches the test binary (or a small test helper binary) with MASTER_API_KEY
removed via Command.env_remove("MASTER_API_KEY") and asserts the child exits
with an error (or prints a failure marker) after calling AppConfig::from_env;
keep the test name app_config_from_env_fails_when_master_api_key_missing and
remove modifications to std::env::var/EnvGuard/ENV_LOCK so no global env
mutation happens.
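One subprocess-based shape, sketched with stdlib only: the binary re-execs itself with the key removed, so the parent environment is never mutated. `from_env` here is a hypothetical stand-in for `AppConfig::from_env()`, and the `RUN_AS_CONFIG_CHILD` marker is illustrative:

```rust
use std::process::Command;

// Stand-in for AppConfig::from_env(): fails when MASTER_API_KEY is absent.
fn from_env() -> Result<String, String> {
    std::env::var("MASTER_API_KEY").map_err(|_| "MASTER_API_KEY is not set".to_string())
}

fn main() {
    // Child branch: attempt the config load and report via exit code.
    if std::env::var("RUN_AS_CONFIG_CHILD").is_ok() {
        std::process::exit(if from_env().is_err() { 42 } else { 0 });
    }

    // Parent branch: re-exec ourselves with the key removed. No EnvGuard,
    // no ENV_LOCK, and no unsafe set_var/remove_var in this process.
    let exe = std::env::current_exe().expect("current_exe");
    let status = Command::new(exe)
        .env("RUN_AS_CONFIG_CHILD", "1")
        .env_remove("MASTER_API_KEY")
        .status()
        .expect("spawn child");

    assert_eq!(status.code(), Some(42), "from_env should fail without MASTER_API_KEY");
    println!("missing key detected in subprocess; parent env untouched");
}
```

In a real test the child branch would live behind a `#[test]` re-exec guard or a small helper binary, but the isolation property is the same: only the child ever sees the modified environment.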
🧹 Nitpick comments (1)
src/config.rs (1)

4-4: Move constant_time_eq out of middleware.

AppConfig is a low-level config type, so importing a helper from crate::middleware inverts the dependency direction this PR is trying to tighten. A small shared security/util module would keep config independent of the HTTP layer.

Also applies to: 155-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config.rs` at line 4, AppConfig currently imports the helper
constant_time_eq from crate::middleware, which inverts dependencies; move
constant_time_eq into a shared module (e.g., crate::security or crate::util) and
change the import in src/config.rs to use that new module instead of
crate::middleware; update any other places (including the code around the other
occurrences referenced near lines 155-159) to import constant_time_eq from the
new shared module so that AppConfig and other low-level types no longer depend
on the HTTP middleware layer.
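A dependency-free version of the helper could live in such a shared module; below is a sketch of the standard XOR-accumulate comparison, assuming the project does not simply re-export the `constant_time_eq` crate from there:

```rust
// A minimal shared security helper, free of any HTTP types, so both
// config and middleware can depend on it without a layering cycle.
pub mod security {
    /// Compare two byte slices in time independent of where they differ.
    pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
        if a.len() != b.len() {
            return false;
        }
        // Accumulate differences with XOR/OR so there is no early exit
        // that would leak the position of the first mismatch.
        let mut diff: u8 = 0;
        for (x, y) in a.iter().zip(b.iter()) {
            diff |= x ^ y;
        }
        diff == 0
    }
}

fn main() {
    use security::constant_time_eq;
    assert!(constant_time_eq(b"master_key", b"master_key"));
    assert!(!constant_time_eq(b"master_key", b"master_kex"));
    assert!(!constant_time_eq(b"short", b"longer_key"));
    println!("constant-time comparison checks passed");
}
```

With this layout, `src/config.rs` imports `crate::security::constant_time_eq` and never touches the middleware layer.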
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.env.example:
- Line 20: The .env.example currently lists WARP_LICENSE_KEY as an application
runtime variable but AppConfig / AppConfig::from_env() do not read it; move the
WARP_LICENSE_KEY entry out of the runtime section into the infra/deployment-only
section (or add a "deployment-only" label) so the example does not imply it is
required at runtime; update the variable comment to indicate it's only used by
deployment/infra (not read by AppConfig or AppConfig::from_env()) and keep the
exact variable name WARP_LICENSE_KEY unchanged.

In `@src/config.rs`:
- Around line 30-31: Update the doc comment for the helper function env_parse to
mark the enum variant as inline code by wrapping ConfigError::InvalidValue in
backticks (e.g., `ConfigError::InvalidValue`) so Clippy's doc_markdown warning
is resolved; locate the comment immediately above fn env_parse<T:
std::str::FromStr>(...) and modify only the text to use backticks around the
variant name.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d7a7aaee-d910-4429-86e1-607afaeba128

📥 Commits

Reviewing files that changed from the base of the PR and between 2014414 and 952b0c1.

📒 Files selected for processing (7)
  • .env.example
  • src/app.rs
  • src/config.rs
  • src/error.rs
  • src/middleware/captcha.rs
  • tests/api/contributions_tests.rs
  • tests/layer_unit_tests.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/error.rs

Comment thread .env.example
Comment thread src/config.rs Outdated
@nxdun
Owner Author

nxdun commented Apr 27, 2026

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

✅ Actions performed

Full review triggered.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tests/layer_unit_tests.rs (1)

34-34: Consider using the exported X_API_KEY constant.

The middleware exports X_API_KEY as a typed HeaderName constant. Using it here would ensure test consistency if the header name ever changes.

♻️ Suggested change
+use nadzu::middleware::X_API_KEY;
+
 // In test functions:
-headers.insert("x-api-key", "master_key".parse().unwrap());
+headers.insert(X_API_KEY.clone(), "master_key".parse().unwrap());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/layer_unit_tests.rs` at line 34, Replace the hard-coded header name
string in the test's headers.insert call with the exported X_API_KEY HeaderName
constant from the middleware (use X_API_KEY instead of "x-api-key"), and add the
necessary use/import for X_API_KEY so the test compiles; keep the header value
("master_key") as-is when calling headers.insert on the same headers variable.
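The real constants are typed `http::HeaderName`s; to keep this sketch free of external crates, plain `&'static str` constants stand in for them and illustrate the same single-source-of-truth idea:

```rust
use std::collections::HashMap;

// In the actual middleware these are `http::HeaderName` constants;
// the names below match the PR, the types are simplified for the sketch.
pub const X_API_KEY: &str = "x-api-key";
pub const X_CAPTCHA_TOKEN: &str = "x-captcha-token";

fn main() {
    let mut headers: HashMap<&'static str, String> = HashMap::new();
    // Tests and middleware both reference the constant, so renaming the
    // header becomes a one-line change instead of a grep across the repo.
    headers.insert(X_API_KEY, "master_key".to_string());

    assert_eq!(headers.get(X_API_KEY).map(String::as_str), Some("master_key"));
    println!("header constant used consistently");
}
```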

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7e191080-4a7a-4453-b277-617d7ecb054b

📥 Commits

Reviewing files that changed from the base of the PR and between 952b0c1 and 45e38e5.

📒 Files selected for processing (3)
  • src/config.rs
  • tests/api/contributions_tests.rs
  • tests/layer_unit_tests.rs

@nxdun nxdun merged commit 554569c into main Apr 27, 2026
2 checks passed