Skip to content

feat: API Health Dashboard with BullMQ monitoring + admin notifications#69

Merged
Systemsaholic merged 12 commits intomainfrom
feature/api-health-dashboard
Mar 21, 2026
Merged

feat: API Health Dashboard with BullMQ monitoring + admin notifications#69
Systemsaholic merged 12 commits intomainfrom
feature/api-health-dashboard

Conversation

@Systemsaholic
Copy link
Copy Markdown
Owner

@Systemsaholic Systemsaholic commented Mar 21, 2026

Summary

Replaces the static API Credentials page with an active health monitoring dashboard.

Backend

  • New BullMQ queue (api-health) — runs health checks every 15 min against all 15 external APIs
  • New table (api_health_checks) — stores check results with 7-day retention
  • 15 providers monitored: Amadeus, AeroDataBox, Traveltek, Globus, Google Places, Stripe, Resend, Unsplash, OpenAI, ExchangeRate-API, Cloudflare R2, Supabase Storage, Backblaze B2, Redis, Traveltek FTP
  • Admin notifications on 2 consecutive failures + recovery notifications
  • 3 new endpoints: GET status, GET history, POST trigger check (all @adminonly)

Frontend

  • Status grid grouped by category (Travel, Services, Storage, Infrastructure)
  • Status dots: gray (not configured), green (healthy), amber (1 failure), red (2+ failures)
  • "Test" button per provider for immediate checks
  • Expandable card with last 10 check history
  • Auto-refreshes every 60 seconds

Changes

  • Exported TraveltekAuthService from CruiseBookingModule
  • Renamed tab "API Credentials" → "API Health"
  • 14 new files, 1 migration

Test plan

  • TypeScript compiles cleanly
  • Migration runs successfully
  • Deployed to preview
  • Health checks run on 15-min schedule
  • Cards show correct status per provider
  • Test button triggers immediate check
  • Notification sent on consecutive failures

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • API Health dashboard replaces the credentials UI: provider cards with status badges, last-checked/response times, expandable recent checks, and per-provider "Test" action
    • Admin API and background scheduled checks with persisted results and automated alerting on repeated failures
  • Updated

    • Settings tab label changed from "API Credentials" to "API Health"
  • Database

    • Added table and index to store API health check results
  • Documentation

    • Added design/plan docs for "Add/Create Trip in Group" UX and backend support

Systemsaholic and others added 6 commits March 21, 2026 10:26
Create migration and schema for storing periodic health check results
for all external API providers. Includes provider+time index for
efficient queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add TraveltekAuthService to CruiseBookingModule exports so it can be
injected by ApiHealthModule for Traveltek FusionAPI health checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement health check logic for 15 external providers across 4
categories (travel, services, storage, infrastructure). Each check
measures response time, stores results in DB, and notifies admins
on 2 consecutive failures via NotificationService.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Handles check-all (scheduled every 15 min), check-provider (immediate
single check), and cleanup (daily removal of checks older than 7 days).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Controller: 3 admin-only endpoints (status, history, trigger check)
- Module: registers api-health BullMQ queue, schedules repeatable jobs
  (every 15 min check-all, daily 3 AM cleanup)
- Fix: use gt() instead of lt() for date comparison in getProviderHistory
- Register ApiHealthModule in AppModule

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create apps/admin/src/hooks/use-api-health.ts with useApiHealthStatus,
  useApiHealthHistory, and useTriggerHealthCheck hooks using React Query,
  matching existing use-api-credentials.ts patterns
- Rewrite settings/api-credentials/page.tsx as a health status grid:
  provider cards grouped by category (Travel, Services, Storage,
  Infrastructure), status dots (gray/green/amber/red), last check time,
  response time in ms, Test button per card, expandable history panel
- Rename "API Credentials" tab to "API Health" in settings-tabs-layout.tsx
  (route path unchanged: /settings/api-credentials)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
tailfire-client Ready Ready Preview, Comment Mar 21, 2026 7:00pm
tailfire-ota Ready Ready Preview, Comment Mar 21, 2026 7:00pm

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 21, 2026

📝 Walkthrough

Walkthrough

Replaces the admin "API Credentials" page with an API Health dashboard and adds a complete ApiHealth feature: backend controller, service, processor, module, database table/migration, React hooks, scheduled jobs, and notifications.

Changes

Cohort / File(s) Summary
Admin UI
apps/admin/src/app/settings/_components/settings-tabs-layout.tsx, apps/admin/src/app/settings/api-credentials/page.tsx
Renamed settings tab label to "API Health" and replaced the API Credentials UI with an API Health dashboard (provider cards, status badges, Test button, recent checks panel, summary bar).
Frontend Hooks
apps/admin/src/hooks/use-api-health.ts
Added React Query hooks, types, and cache-key utilities to fetch provider status, provider history, and trigger manual health checks with auto-refetch and mutation invalidation.
API Controller
apps/api/src/api-health/api-health.controller.ts
New admin-only controller exposing GET /admin/api-health, GET /admin/api-health/:provider/history, and POST /admin/api-health/:provider/check.
ApiHealth Core
apps/api/src/api-health/api-health.service.ts, apps/api/src/api-health/api-health.types.ts
New service and types implementing provider registry and per-provider probes (~16 providers), single/all checks, persistence to api_health_checks, consecutive-failure tracking, history queries, notifications, and cleanup.
Worker, Module & Scheduling
apps/api/src/api-health/api-health.processor.ts, apps/api/src/api-health/api-health.module.ts, apps/api/src/app.module.ts
Added BullMQ worker (check-all, check-provider, cleanup), ApiHealthModule wiring (HttpModule, queue, dependencies), scheduled jobs (every 15 minutes checks, daily cleanup), and registered module in AppModule.
Database schema & migration
packages/database/src/migrations/20260321120000_create_api_health_checks.sql, packages/database/src/migrations/meta/_journal.json, packages/database/src/schema/api-health-checks.schema.ts, packages/database/src/schema/index.ts
New migration and Drizzle schema for api_health_checks table with index; migration journal updated and schema re-exported.
Module export change
apps/api/src/cruise-booking/cruise-booking.module.ts
Exported TraveltekAuthService from CruiseBookingModule to allow reuse by health checks.
External module import
apps/api/src/external-apis/providers/aerodatabox/aerodatabox.module.ts
Imported ApiCredentialsModule into AerodataboxModule to access credentials.
Docs / Plans / Specs
docs/superpowers/plans/2026-03-21-add-trip-to-group.md, docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md
Added planning and spec docs for an unrelated "Add trip to group" feature (UX, API filter ungrouped, TripFormDialog prop extensions).

Sequence Diagram(s)

sequenceDiagram
    participant Admin as Admin UI
    participant Controller as ApiHealthController
    participant Service as ApiHealthService
    participant DB as Database
    participant Queue as BullMQ
    participant Worker as ApiHealthProcessor
    participant Provider as External Provider

    rect rgba(120,200,160,0.5)
        Note over Admin,Controller: Manual check triggered from UI
        Admin->>Controller: POST /admin/api-health/:provider/check
        Controller->>Service: checkProvider(provider)
        Service->>Provider: execute provider-specific probe
        Provider-->>Service: response / latency / error
        Service->>DB: insert api_health_checks row
        Service->>DB: query recent checks for consecutive failures
        alt consecutive failures detected
            Service->>Service: notifyAdmins(provider)
        end
        Service-->>Controller: HealthCheckResult
        Controller-->>Admin: return result
    end

    rect rgba(160,160,220,0.5)
        Note over Queue,Worker: Scheduled checks
        Queue->>Worker: schedule check-all (cron every 15m)
        Worker->>Service: checkAllProviders()
        loop per configured provider
            Service->>Provider: probe provider
            Provider-->>Service: result
            Service->>DB: persist result
        end
        Worker->>Service: checkAndNotify() for failures
    end

    rect rgba(200,170,120,0.5)
        Note over Admin,Controller: Status fetch for dashboard
        Admin->>Controller: GET /admin/api-health
        Controller->>Service: getLatestStatus()
        Service->>DB: query latest checks per provider
        DB-->>Service: rows
        Service-->>Controller: ProviderStatus[]
        Controller-->>Admin: render provider cards
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I hopped through queues and HTTP pings,

I checked each provider's tiny wings,
Green, amber, red — the cards all show,
I log the misses, let alerts go,
A carrot dashboard keeps the system in spring.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.36% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: introducing an API Health Dashboard with BullMQ monitoring and admin notifications, which is reflected throughout the changeset with backend health checks, a new frontend dashboard page, and notification infrastructure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/api-health-dashboard

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (11)
apps/admin/src/app/settings/_components/settings-tabs-layout.tsx (1)

21-21: Consider aligning the tab id and href with the new "API Health" label.

The label now says "API Health" but the id (api-credentials) and href (/settings/api-credentials) still reference "credentials". This semantic mismatch could cause maintenance confusion. If URL stability is required, this can be deferred, but consider renaming in a future iteration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/app/settings/_components/settings-tabs-layout.tsx` at line 21,
The tab entry with id 'api-credentials' and href '/settings/api-credentials' no
longer matches its label 'API Health'; update the tab object's id and href to
reflect the new label (e.g., change id to 'api-health' and href to
'/settings/api-health') wherever the tab is defined and ensure any consumers of
the id/href (routing, active-tab logic) are updated to the new values (look for
uses of 'api-credentials' in the component and related routing/consumer code and
rename them consistently), or if URL stability is required, add a comment
explaining the intentional mismatch.
apps/admin/src/app/settings/api-credentials/page.tsx (1)

289-307: Consider capturing errors with Sentry per coding guidelines.

The error is shown to the user via toast, but runtime errors in the admin application should also be captured by Sentry for monitoring.

As per coding guidelines: "Admin application must capture runtime errors with Sentry."

♻️ Proposed fix
+import * as Sentry from '@sentry/nextjs'

     try {
       await triggerCheck.mutateAsync(provider)
       toast({
         title: 'Check triggered',
         description: `Health check for ${provider} is running.`,
       })
-    } catch {
+    } catch (err) {
+      Sentry.captureException(err)
       toast({
         title: 'Check failed',
         description: `Could not trigger health check for ${provider}.`,
         variant: 'destructive',
       })
     } finally {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/app/settings/api-credentials/page.tsx` around lines 289 - 307,
The catch block that handles triggerCheck.mutateAsync(provider) should capture
the thrown error with Sentry before showing the toast; update the catch to
accept an error parameter and call Sentry.captureException(error) (or the app's
sentry wrapper) and include context (e.g., { provider }) so runtime failures in
triggerCheck.mutateAsync are reported; keep the existing toast and the finally
behavior that calls setTestingProviders to remove provider.
apps/api/src/api-health/api-health.module.ts (2)

47-72: Consider failing module initialization if job scheduling fails.

The current error handling logs the failure but allows the module to initialize successfully. If the health check scheduler fails to register, the monitoring system will be silently non-functional until someone notices missing checks.

♻️ Option: Re-throw to fail fast
     } catch (error: any) {
       this.logger.error(`Failed to schedule health check jobs: ${error.message}`)
+      throw error // Fail fast — monitoring is a critical feature
     }

Alternatively, if you want the API to remain available without monitoring, consider adding a health indicator or metric that surfaces this failure state to operators.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.module.ts` around lines 47 - 72, The
onModuleInit error handling currently swallows scheduling failures
(healthQueue.upsertJobScheduler) by only logging via this.logger.error, so
change it to fail module initialization: after catching the error in
onModuleInit, re-throw (or throw a new Error with context) so the module
bootstrap fails; alternatively, if you prefer not to crash, set a health
indicator/metric inside the catch to mark the scheduler as unhealthy (e.g.,
update a health service) and still log the full error using error.stack or
error.message; reference onModuleInit, healthQueue.upsertJobScheduler, and
this.logger.error to find where to apply the change.

60-69: Cleanup job runs at 3 AM in server timezone — verify this is intentional.

The cron pattern 0 3 * * * will execute in the server's local timezone. Based on learnings for automation jobs, timezone-aware scheduling is recommended for scheduled jobs. If your servers run in UTC but admins expect 3 AM in a specific timezone, consider making this explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.module.ts` around lines 60 - 69, The
scheduled cleanup uses a server-local cron ("0 3 * * *") in
healthQueue.upsertJobScheduler for job id 'api-health-cleanup', which may not
run at the intended timezone; update the upsertJobScheduler call to use a
timezone-aware schedule (e.g., add a timezone option or convert the cron to UTC)
so the job runs at 3 AM in the intended timezone rather than the server's local
TZ—locate the call to healthQueue.upsertJobScheduler and add the timezone
parameter or adjust the cron accordingly and include a short comment explaining
the chosen timezone.
apps/api/src/api-health/api-health.controller.ts (1)

30-34: Consider validating the provider parameter.

Both endpoints accept any string and cast it to HealthProviderKey. Invalid provider names will either throw inside the service or return empty results. A validation pipe or explicit check would provide clearer error messages.

Also applies to: 40-41

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.controller.ts` around lines 30 - 34, The
controller currently casts the route param to HealthProviderKey without
validation; update the methods that call healthService.getProviderHistory and
healthService.getProviderStatus to validate the provider param first (either add
a Nest validation pipe such as ParseEnumPipe for HealthProviderKey on the
`@Param`('provider') or perform an explicit check against the HealthProviderKey
values/keys) and throw a BadRequestException with a clear message when the
provider is invalid so callers receive a 400 instead of downstream errors or
empty results.
apps/admin/src/hooks/use-api-health.ts (1)

42-48: Consider fetching history only when the panel is expanded.

useApiHealthHistory is called for every provider in HistoryPanel, but the history panel only renders when the Collapsible is open. Setting enabled: false by default and only enabling when the panel is expanded would reduce unnecessary API calls.

Looking at the page component, HistoryPanel is rendered inside CollapsibleContent, so React Query will still fire the request even though the content isn't visible yet.

♻️ Pass enabled state from parent

In page.tsx, pass the open state:

<HistoryPanel provider={provider.provider} enabled={open} />

Update HistoryPanel:

function HistoryPanel({ provider, enabled }: { provider: string; enabled: boolean }) {
  const { data: history, isLoading } = useApiHealthHistory(provider, { enabled })
  // ...
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/hooks/use-api-health.ts` around lines 42 - 48,
useApiHealthHistory is unconditionally triggering network requests for every
provider because it's enabled by default; change it to default enabled: false
and wire the Collapsible open state from the parent into HistoryPanel so the
query only runs when the panel is expanded. Specifically, update
useApiHealthHistory to accept options with enabled defaulting to false
(useApiHealthHistory(provider, { enabled?: boolean })), modify the page
component to pass the Collapsible/CollapsibleContent open state into
HistoryPanel (e.g., <HistoryPanel provider={provider.provider} enabled={open}
/>), and update HistoryPanel to forward that enabled flag into
useApiHealthHistory so the React Query call (queryKey:
apiHealthKeys.history(provider)) only executes when enabled is true.
apps/api/src/api-health/api-health.service.ts (4)

538-553: Redis client cleanup may leave dangling connection on early failure.

If client.connect() throws, the finally block still calls quit(), which is fine. However, if there's a timeout during connect(), the client may remain in a connecting state. Consider adding a destruction timeout.

♻️ More defensive cleanup
   const client = new Redis({
     ...config,
     connectTimeout: 5000,
     lazyConnect: true,
+    maxRetriesPerRequest: 1,
   })

   try {
     await client.connect()
     const pong = await client.ping()
     if (pong !== 'PONG') {
       throw new Error(`Redis PING returned: ${pong}`)
     }
   } finally {
-    await client.quit().catch(() => {})
+    client.disconnect() // Force disconnect without waiting
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.service.ts` around lines 538 - 553, The
Redis health check can leave the client in a connecting state if connect() times
out; update the logic around the const client = new Redis(...) and the
try/finally so connect is wrapped in a cancellable/timeout (e.g., Promise.race
with a short timeout) and if the timeout fires or connect rejects ensure you
forcefully close the client (call client.disconnect() or client.quit() if
available, after checking client.status or catching errors) before rethrowing;
also ensure the final cleanup in the existing finally still safely handles
already-closed clients by checking/ignoring errors on quit/disconnect.

101-110: Consider parallelizing provider checks for faster execution.

With 15+ providers and a 10-second timeout each, sequential checking could take over 2.5 minutes worst case. Using Promise.allSettled would allow concurrent checks while still capturing individual failures.

♻️ Parallel execution with controlled concurrency
 async checkAllProviders(): Promise<HealthCheckResult[]> {
-  const results: HealthCheckResult[] = []
-  for (const provider of HEALTH_PROVIDERS) {
-    if (!this.isConfigured(provider.key)) continue
-    const result = await this.checkProvider(provider.key)
-    await this.checkAndNotify(provider.key, result)
-    results.push(result)
-  }
-  return results
+  const configuredProviders = HEALTH_PROVIDERS.filter(p => this.isConfigured(p.key))
+  
+  const results = await Promise.all(
+    configuredProviders.map(async (provider) => {
+      const result = await this.checkProvider(provider.key)
+      await this.checkAndNotify(provider.key, result)
+      return result
+    })
+  )
+  
+  return results
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.service.ts` around lines 101 - 110, The
checkAllProviders function currently iterates providers sequentially causing
long total latency; refactor it to run provider checks concurrently by mapping
configured HEALTH_PROVIDERS to Promises that call checkProvider(provider.key)
and then checkAndNotify(provider.key, result), and use Promise.allSettled (or a
bounded concurrency helper like p-limit if you want to limit parallelism) to
await completion; for each settled promise, convert fulfilled values into
HealthCheckResult and push them into results while handling rejected ones by
creating a failed HealthCheckResult (including error info) so every provider
yields an entry in results.

115-166: N+1 query pattern in getLatestStatus — consider batching.

For each of the 15+ providers, this method makes 1-2 database queries (latest check + potentially recent checks for failure counting). This could be optimized with a single query using window functions or a grouped aggregation.

This is acceptable for an admin-only endpoint with 15 providers, but could become slow if provider count grows significantly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.service.ts` around lines 115 - 166,
getLatestStatus currently issues per-provider DB queries (select latest check
and up to 10 recent checks) causing an N+1 pattern; instead, batch the work by
querying apiHealthChecks once or twice for all providers and compute latestCheck
and consecutiveFailures in-memory. Use a single query with a window function
(row_number() over(partition by provider order by checkedAt desc)) or a grouped
aggregation to fetch the latest check for every provider in one call, and
similarly fetch the recent checks (limit 10 per provider) in one call (or use
array_agg to collect recent results per provider) via
this.db.client.select().from(this.db.schema.apiHealthChecks), then in
getLatestStatus iterate HEALTH_PROVIDERS and map the pre-fetched results to
build lastCheck and consecutiveFailures instead of calling the DB inside the
loop.

555-573: Traveltek FTP check only verifies DNS, not service availability.

The comment explains the rationale (avoiding FTP connection limits), but this means the "health check" won't detect FTP server downtime, authentication issues, or firewall blocks. Consider documenting this limitation in the UI or using a different status indicator.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.service.ts` around lines 555 - 573, The
checkTraveltekFtp method currently only verifies DNS via dns.lookup, so the
health check cannot detect FTP service downtime, auth failures, or firewall
issues; update the health behavior by either (a) documenting this limitation in
the health UI/status description and changing the reported state from "healthy"
to a degraded/warning status when checkTraveltekFtp passes only DNS, or (b) add
an opt-in deeper check behind a feature flag that attempts a short-timeout,
read-only FTP connect using the same credentials (implement in checkTraveltekFtp
and gate with a config flag from configService) so callers of the
ApiHealthService get an accurate up/down result depending on configuration.
apps/api/src/api-health/api-health.processor.ts (1)

39-39: Avoid as any — use the proper type.

♻️ Use HealthProviderKey type
+import type { HealthProviderKey } from './api-health.types'
+
 case 'check-provider': {
-  const { provider } = job.data as { provider: string }
+  const { provider } = job.data as { provider: HealthProviderKey }
   this.logger.log(`Running health check for provider: ${provider}`)
-  const result = await this.healthService.checkProvider(provider as any)
+  const result = await this.healthService.checkProvider(provider)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.processor.ts` at line 39, The call to
this.healthService.checkProvider currently uses an unsafe cast (provider as
any); replace that with the proper HealthProviderKey type by typing the provider
variable or by casting to HealthProviderKey (e.g., provider as
HealthProviderKey) and import HealthProviderKey where needed; if checkProvider’s
signature doesn't accept HealthProviderKey, update the method signature on
HealthService to accept HealthProviderKey so no any cast is required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/admin/src/app/settings/api-credentials/page.tsx`:
- Around line 170-172: The render currently checks responseMs inconsistently
against undefined/null for HealthCheckEntry.responseMs (type number | null);
update the rendering guards that reference responseMs (both the span near the
earlier block and the later usage around lines 201-206) to explicitly ensure
responseMs is neither undefined nor null — e.g., change the condition to check
responseMs !== undefined && responseMs !== null (or responseMs != null) before
rendering the <span> or using responseMs so both occurrences consistently handle
the nullable number.

In `@apps/api/src/api-health/api-health.controller.ts`:
- Around line 29-35: getProviderHistory currently uses parseInt on the hours
query and can pass NaN to healthService; validate the hours param in the
controller (getProviderHistory): if hours is provided, attempt to parseInt and
if the result is NaN or <= 0 throw a BadRequestException with a clear message,
otherwise use the parsed integer (fallback to 24 when undefined) and call
this.healthService.getProviderHistory(provider as HealthProviderKey,
parsedHours); import BadRequestException from `@nestjs/common`.

In `@apps/api/src/api-health/api-health.processor.ts`:
- Around line 36-45: The 'check-provider' branch in ApiHealthProcessor is dead
code and should be removed: delete the entire case 'check-provider' block (the
job.data extraction, calls to healthService.checkProvider and
healthService.checkAndNotify, and its logs) to avoid an unused processor path;
if you instead want queued manual checks, change the controller to enqueue a job
named 'check-provider' rather than calling healthService.checkProvider()
directly. Also replace the loose type assertion on provider (currently "as any")
with the proper HealthProviderKey type when reading job.data or handling
provider input.

In `@apps/api/src/api-health/api-health.service.ts`:
- Around line 604-606: The catch block that currently only logs errors via
this.logger.error when failing to notify admins about a provider (the code
referencing providerName) must also report the exception to Sentry; update the
catch to call Sentry.captureException(err) (or the project's Sentry wrapper) and
include relevant context (providerName) in the capture so runtime failures are
recorded by Sentry in addition to being logged with this.logger.error.

---

Nitpick comments:
In `@apps/admin/src/app/settings/_components/settings-tabs-layout.tsx`:
- Line 21: The tab entry with id 'api-credentials' and href
'/settings/api-credentials' no longer matches its label 'API Health'; update the
tab object's id and href to reflect the new label (e.g., change id to
'api-health' and href to '/settings/api-health') wherever the tab is defined and
ensure any consumers of the id/href (routing, active-tab logic) are updated to
the new values (look for uses of 'api-credentials' in the component and related
routing/consumer code and rename them consistently), or if URL stability is
required, add a comment explaining the intentional mismatch.

In `@apps/admin/src/app/settings/api-credentials/page.tsx`:
- Around line 289-307: The catch block that handles
triggerCheck.mutateAsync(provider) should capture the thrown error with Sentry
before showing the toast; update the catch to accept an error parameter and call
Sentry.captureException(error) (or the app's sentry wrapper) and include context
(e.g., { provider }) so runtime failures in triggerCheck.mutateAsync are
reported; keep the existing toast and the finally behavior that calls
setTestingProviders to remove provider.

In `@apps/admin/src/hooks/use-api-health.ts`:
- Around line 42-48: useApiHealthHistory is unconditionally triggering network
requests for every provider because it's enabled by default; change it to
default enabled: false and wire the Collapsible open state from the parent into
HistoryPanel so the query only runs when the panel is expanded. Specifically,
update useApiHealthHistory to accept options with enabled defaulting to false
(useApiHealthHistory(provider, { enabled?: boolean })), modify the page
component to pass the Collapsible/CollapsibleContent open state into
HistoryPanel (e.g., <HistoryPanel provider={provider.provider} enabled={open}
/>), and update HistoryPanel to forward that enabled flag into
useApiHealthHistory so the React Query call (queryKey:
apiHealthKeys.history(provider)) only executes when enabled is true.

In `@apps/api/src/api-health/api-health.controller.ts`:
- Around line 30-34: The controller currently casts the route param to
HealthProviderKey without validation; update the methods that call
healthService.getProviderHistory and healthService.getProviderStatus to validate
the provider param first (either add a Nest validation pipe such as
ParseEnumPipe for HealthProviderKey on the `@Param`('provider') or perform an
explicit check against the HealthProviderKey values/keys) and throw a
BadRequestException with a clear message when the provider is invalid so callers
receive a 400 instead of downstream errors or empty results.

In `@apps/api/src/api-health/api-health.module.ts`:
- Around line 47-72: The onModuleInit error handling currently swallows
scheduling failures (healthQueue.upsertJobScheduler) by only logging via
this.logger.error, so change it to fail module initialization: after catching
the error in onModuleInit, re-throw (or throw a new Error with context) so the
module bootstrap fails; alternatively, if you prefer not to crash, set a health
indicator/metric inside the catch to mark the scheduler as unhealthy (e.g.,
update a health service) and still log the full error using error.stack or
error.message; reference onModuleInit, healthQueue.upsertJobScheduler, and
this.logger.error to find where to apply the change.
- Around line 60-69: The scheduled cleanup uses a server-local cron ("0 3 * *
*") in healthQueue.upsertJobScheduler for job id 'api-health-cleanup', which may
not run at the intended timezone; update the upsertJobScheduler call to use a
timezone-aware schedule (e.g., add a timezone option or convert the cron to UTC)
so the job runs at 3 AM in the intended timezone rather than the server's local
TZ—locate the call to healthQueue.upsertJobScheduler and add the timezone
parameter or adjust the cron accordingly and include a short comment explaining
the chosen timezone.

In `@apps/api/src/api-health/api-health.processor.ts`:
- Line 39: The call to this.healthService.checkProvider currently uses an unsafe
cast (provider as any); replace that with the proper HealthProviderKey type by
typing the provider variable or by casting to HealthProviderKey (e.g., provider
as HealthProviderKey) and import HealthProviderKey where needed; if
checkProvider’s signature doesn't accept HealthProviderKey, update the method
signature on HealthService to accept HealthProviderKey so no any cast is
required.

In `@apps/api/src/api-health/api-health.service.ts`:
- Around line 538-553: The Redis health check can leave the client in a
connecting state if connect() times out; update the logic around the const
client = new Redis(...) and the try/finally so connect is wrapped in a
cancellable/timeout (e.g., Promise.race with a short timeout) and if the timeout
fires or connect rejects ensure you forcefully close the client (call
client.disconnect() or client.quit() if available, after checking client.status
or catching errors) before rethrowing; also ensure the final cleanup in the
existing finally still safely handles already-closed clients by
checking/ignoring errors on quit/disconnect.
- Around line 101-110: The checkAllProviders function currently iterates
providers sequentially causing long total latency; refactor it to run provider
checks concurrently by mapping configured HEALTH_PROVIDERS to Promises that call
checkProvider(provider.key) and then checkAndNotify(provider.key, result), and
use Promise.allSettled (or a bounded concurrency helper like p-limit if you want
to limit parallelism) to await completion; for each settled promise, convert
fulfilled values into HealthCheckResult and push them into results while
handling rejected ones by creating a failed HealthCheckResult (including error
info) so every provider yields an entry in results.
- Around line 115-166: getLatestStatus currently issues per-provider DB queries
(select latest check and up to 10 recent checks) causing an N+1 pattern;
instead, batch the work by querying apiHealthChecks once or twice for all
providers and compute latestCheck and consecutiveFailures in-memory. Use a
single query with a window function (row_number() over(partition by provider
order by checkedAt desc)) or a grouped aggregation to fetch the latest check for
every provider in one call, and similarly fetch the recent checks (limit 10 per
provider) in one call (or use array_agg to collect recent results per provider)
via this.db.client.select().from(this.db.schema.apiHealthChecks), then in
getLatestStatus iterate HEALTH_PROVIDERS and map the pre-fetched results to
build lastCheck and consecutiveFailures instead of calling the DB inside the
loop.
- Around line 555-573: The checkTraveltekFtp method currently only verifies DNS
via dns.lookup, so the health check cannot detect FTP service downtime, auth
failures, or firewall issues; update the health behavior by either (a)
documenting this limitation in the health UI/status description and changing the
reported state from "healthy" to a degraded/warning status when
checkTraveltekFtp passes only DNS, or (b) add an opt-in deeper check behind a
feature flag that attempts a short-timeout, read-only FTP connect using the same
credentials (implement in checkTraveltekFtp and gate with a config flag from
configService) so callers of the ApiHealthService get an accurate up/down result
depending on configuration.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dc9d2355-bcc5-44a6-8e1f-851760a68c96

📥 Commits

Reviewing files that changed from the base of the PR and between 87a8bff and 54ffebb.

📒 Files selected for processing (14)
  • apps/admin/src/app/settings/_components/settings-tabs-layout.tsx
  • apps/admin/src/app/settings/api-credentials/page.tsx
  • apps/admin/src/hooks/use-api-health.ts
  • apps/api/src/api-health/api-health.controller.ts
  • apps/api/src/api-health/api-health.module.ts
  • apps/api/src/api-health/api-health.processor.ts
  • apps/api/src/api-health/api-health.service.ts
  • apps/api/src/api-health/api-health.types.ts
  • apps/api/src/app.module.ts
  • apps/api/src/cruise-booking/cruise-booking.module.ts
  • packages/database/src/migrations/20260321120000_create_api_health_checks.sql
  • packages/database/src/migrations/meta/_journal.json
  • packages/database/src/schema/api-health-checks.schema.ts
  • packages/database/src/schema/index.ts

Comment on lines +170 to +172
{responseMs !== undefined && responseMs !== null && (
<span>{responseMs}ms</span>
)}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Null check may be incomplete for responseMs.

The type HealthCheckEntry.responseMs is number | null, but line 170-171 only checks !== undefined. This works because null !== undefined is true, but the explicit && responseMs !== null on line 171 is correct. However, line 201 uses only responseMs !== undefined without the null check, which is inconsistent.

🐛 Proposed fix for consistency
-          {(lastCheckedLabel || responseMs !== undefined) && (
+          {(lastCheckedLabel || (responseMs !== undefined && responseMs !== null)) && (

Also applies to: 201-206

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/app/settings/api-credentials/page.tsx` around lines 170 - 172,
The render currently checks responseMs inconsistently against undefined/null for
HealthCheckEntry.responseMs (type number | null); update the rendering guards
that reference responseMs (both the span near the earlier block and the later
usage around lines 201-206) to explicitly ensure responseMs is neither undefined
nor null — e.g., change the condition to check responseMs !== undefined &&
responseMs !== null (or responseMs != null) before rendering the <span> or using
responseMs so both occurrences consistently handle the nullable number.

Comment on lines +29 to +35
async getProviderHistory(
@Param('provider') provider: string,
@Query('hours') hours?: string,
) {
const h = hours ? parseInt(hours, 10) : 24
return this.healthService.getProviderHistory(provider as HealthProviderKey, h)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add input validation for hours parameter.

If hours is a non-numeric string like "abc", parseInt returns NaN, which will compute an invalid since date in getProviderHistory. Consider adding validation:

🛡️ Proposed fix
   async getProviderHistory(
     `@Param`('provider') provider: string,
     `@Query`('hours') hours?: string,
   ) {
     const h = hours ? parseInt(hours, 10) : 24
+    if (isNaN(h) || h <= 0) {
+      throw new BadRequestException('hours must be a positive number')
+    }
     return this.healthService.getProviderHistory(provider as HealthProviderKey, h)
   }

Don't forget to import BadRequestException from @nestjs/common.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async getProviderHistory(
@Param('provider') provider: string,
@Query('hours') hours?: string,
) {
const h = hours ? parseInt(hours, 10) : 24
return this.healthService.getProviderHistory(provider as HealthProviderKey, h)
}
async getProviderHistory(
`@Param`('provider') provider: string,
`@Query`('hours') hours?: string,
) {
const h = hours ? parseInt(hours, 10) : 24
if (isNaN(h) || h <= 0) {
throw new BadRequestException('hours must be a positive number')
}
return this.healthService.getProviderHistory(provider as HealthProviderKey, h)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.controller.ts` around lines 29 - 35,
getProviderHistory currently uses parseInt on the hours query and can pass NaN
to healthService; validate the hours param in the controller
(getProviderHistory): if hours is provided, attempt to parseInt and if the
result is NaN or <= 0 throw a BadRequestException with a clear message,
otherwise use the parsed integer (fallback to 24 when undefined) and call
this.healthService.getProviderHistory(provider as HealthProviderKey,
parsedHours); import BadRequestException from `@nestjs/common`.

Comment on lines +36 to +45
case 'check-provider': {
const { provider } = job.data as { provider: string }
this.logger.log(`Running health check for provider: ${provider}`)
const result = await this.healthService.checkProvider(provider as any)
await this.healthService.checkAndNotify(provider, result)
this.logger.log(
`Health check for ${provider}: ${result.success ? 'OK' : 'FAILED'} (${result.responseMs}ms)`,
)
break
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for any code that adds 'check-provider' jobs to the queue
rg -n "check-provider" --type=ts

Repository: Systemsaholic/tailfire

Length of output: 265


🏁 Script executed:

# Find and examine the api-health controller
find . -name "api-health.controller.ts" -type f | head -20

Repository: Systemsaholic/tailfire

Length of output: 116


🏁 Script executed:

# Look at the full processor file to understand the context
cat -n apps/api/src/api-health/api-health.processor.ts | head -70

Repository: Systemsaholic/tailfire

Length of output: 2559


🏁 Script executed:

# Search for any queue.add or job enqueueing patterns for health checks
rg -n "queue\.add|enqueue" apps/api/src/api-health/ --type=ts -A 2

Repository: Systemsaholic/tailfire

Length of output: 48


🏁 Script executed:

# Search for references to the processor or job queue in the codebase
rg -n "api-health.*queue|'api-health'" --type=ts | grep -v node_modules

Repository: Systemsaholic/tailfire

Length of output: 405


🏁 Script executed:

# Examine the api-health controller
cat -n apps/api/src/api-health/api-health.controller.ts

Repository: Systemsaholic/tailfire

Length of output: 1923


🏁 Script executed:

# Search for any code that calls queue.add in the entire api-health folder
rg -n "\.add\(" apps/api/src/api-health/ --type=ts -B 2 -A 2

Repository: Systemsaholic/tailfire

Length of output: 48


🏁 Script executed:

# Check the api-health.module to see how jobs are scheduled
cat -n apps/api/src/api-health/api-health.module.ts

Repository: Systemsaholic/tailfire

Length of output: 2949


The check-provider case is dead code — it's defined in the processor but never enqueued.

The controller at api-health.controller.ts:37-44 handles provider checks synchronously by calling healthService.checkProvider() directly, bypassing the queue. The module schedules only two jobs: check-all (every 15 minutes) and cleanup (daily). No code in the codebase enqueues check-provider jobs.

Consider removing this unused case unless you plan to move manual checks to the queue for consistency and observability.

Also fix the type assertion on line 39: use HealthProviderKey instead of as any.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.processor.ts` around lines 36 - 45, The
'check-provider' branch in ApiHealthProcessor is dead code and should be
removed: delete the entire case 'check-provider' block (the job.data extraction,
calls to healthService.checkProvider and healthService.checkAndNotify, and its
logs) to avoid an unused processor path; if you instead want queued manual
checks, change the controller to enqueue a job named 'check-provider' rather
than calling healthService.checkProvider() directly. Also replace the loose type
assertion on provider (currently "as any") with the proper HealthProviderKey
type when reading job.data or handling provider input.

Comment on lines +604 to +606
} catch (err) {
this.logger.error(`Failed to notify admins about ${providerName} failure: ${err}`)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Capture notification failures with Sentry per coding guidelines.

The error is logged but not captured by Sentry. Per coding guidelines: "All runtime errors in the API must be captured by Sentry."

🛡️ Proposed fix
+import * as Sentry from '@sentry/node'

     } catch (err) {
+      Sentry.captureException(err, {
+        tags: { provider: providerName },
+        extra: { error },
+      })
       this.logger.error(`Failed to notify admins about ${providerName} failure: ${err}`)
     }

As per coding guidelines: "All runtime errors in the API must be captured by Sentry."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.service.ts` around lines 604 - 606, The
catch block that currently only logs errors via this.logger.error when failing
to notify admins about a provider (the code referencing providerName) must also
report the exception to Sentry; update the catch to call
Sentry.captureException(err) (or the project's Sentry wrapper) and include
relevant context (providerName) in the capture so runtime failures are recorded
by Sentry in addition to being logged with this.logger.error.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/admin/src/app/settings/api-credentials/page.tsx`:
- Around line 148-152: The accessible label for the expand/collapse button is
generic; update the aria-label in the button inside the CollapsibleTrigger so it
includes the provider identity (e.g., use provider.name) to disambiguate for
screen readers — change the aria-label to include provider.name (for example:
`${open ? 'Collapse' : 'Expand'} ${provider.name} history`) in the button
rendered by the CollapsibleTrigger.
- Around line 28-33: The current getStatus function conflates unconfigured
providers and newly-configured ones because it returns 'not-configured' when
configured && !lastCheck; change getStatus (function getStatus(provider:
ProviderStatus): HealthStatus) so that it returns 'not-configured' only when
provider.configured is false, and returns a distinct status (e.g.,
'awaiting-check' or 'pending-first-check') when provider.configured is true but
provider.lastCheck is falsy; keep the existing logic for success,
consecutiveFailures, and degraded unchanged and ensure the chosen new status is
added to the HealthStatus union/handling wherever statuses are rendered.
- Around line 282-283: Import Sentry from '@sentry/nextjs' and, inside the
health-check mutation catch block in useTriggerHealthCheck (the code path
handling the mutation failure) call Sentry.captureException(error) and add the
environment tag using process.env.NEXT_PUBLIC_SENTRY_ENVIRONMENT; likewise, in
the status-load error branch where useApiHealthStatus returns an error (the
UI/toast-only path handling providers/isLoading/error), call
Sentry.captureException(error) and set the same environment tag so both mutation
failures and query errors are reported to Sentry with the
NEXT_PUBLIC_SENTRY_ENVIRONMENT context.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7dc97090-5bdc-48d8-8cb4-8deb9d31d56f

📥 Commits

Reviewing files that changed from the base of the PR and between 54ffebb and 9254bae.

📒 Files selected for processing (1)
  • apps/admin/src/app/settings/api-credentials/page.tsx

Comment on lines +28 to +33
function getStatus(provider: ProviderStatus): HealthStatus {
if (!provider.configured) return 'not-configured'
if (!provider.lastCheck) return 'not-configured'
if (provider.lastCheck.success) return 'healthy'
if (provider.consecutiveFailures >= 2) return 'down'
return 'degraded'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Separate “awaiting first check” from “Not Configured”.

Line 30 maps configured && !lastCheck to not-configured, so a newly configured provider renders as gray/“Not Configured” until its first run completes. That also makes the card contradict the summary bar, which still counts the provider as configured.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/app/settings/api-credentials/page.tsx` around lines 28 - 33,
The current getStatus function conflates unconfigured providers and
newly-configured ones because it returns 'not-configured' when configured &&
!lastCheck; change getStatus (function getStatus(provider: ProviderStatus):
HealthStatus) so that it returns 'not-configured' only when provider.configured
is false, and returns a distinct status (e.g., 'awaiting-check' or
'pending-first-check') when provider.configured is true but provider.lastCheck
is falsy; keep the existing logic for success, consecutiveFailures, and degraded
unchanged and ensure the chosen new status is added to the HealthStatus
union/handling wherever statuses are rendered.

Comment on lines +148 to +152
<CollapsibleTrigger asChild>
<button
className="text-muted-foreground hover:text-foreground transition-colors flex-shrink-0"
aria-label={open ? 'Collapse' : 'Expand'}
>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Make the expand toggle label provider-specific.

Every card exposes the same “Expand” / “Collapse” accessible name, so screen-reader users can’t tell which provider’s history they are toggling from the tab order. Including provider.name in the label would remove that ambiguity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/app/settings/api-credentials/page.tsx` around lines 148 - 152,
The accessible label for the expand/collapse button is generic; update the
aria-label in the button inside the CollapsibleTrigger so it includes the
provider identity (e.g., use provider.name) to disambiguate for screen readers —
change the aria-label to include provider.name (for example: `${open ?
'Collapse' : 'Expand'} ${provider.name} history`) in the button rendered by the
CollapsibleTrigger.

Comment on lines +282 to +283
const { data: providers, isLoading, error } = useApiHealthStatus()
const triggerCheck = useTriggerHealthCheck()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

MONITORING_DOC="$(fd 'MONITORING\.md$' docs | head -n1)"
HOOK_FILE="$(fd 'use-api-health\.ts$' apps/admin/src/hooks | head -n1)"

if [ -z "$MONITORING_DOC" ] || [ -z "$HOOK_FILE" ]; then
  echo "Required monitoring context not found" >&2
  exit 1
fi

sed -n '1,220p' "$MONITORING_DOC"
sed -n '1,260p' "$HOOK_FILE"
rg -n -C2 'captureException|withScope|setTag|NEXT_PUBLIC_SENTRY_ENVIRONMENT|@sentry/nextjs|useApiHealthStatus|useTriggerHealthCheck' apps/admin/src "$MONITORING_DOC" "$HOOK_FILE"

Repository: Systemsaholic/tailfire

Length of output: 17669


🏁 Script executed:

cat -n apps/admin/src/app/settings/api-credentials/page.tsx | head -350 | tail -100

Repository: Systemsaholic/tailfire

Length of output: 3977


🏁 Script executed:

# Check for existing Sentry error capture patterns in the admin app
rg -n 'captureException|withScope|setTag' apps/admin/src --type ts --type tsx -B2 -A2

Repository: Systemsaholic/tailfire

Length of output: 93


🏁 Script executed:

rg -n 'captureException|withScope|setTag' apps/admin/src -B2 -A2

Repository: Systemsaholic/tailfire

Length of output: 2568


🏁 Script executed:

# Search for Sentry usage patterns and imports
rg -n '@sentry' apps/admin/src
rg -n 'Sentry\.' apps/admin/src

Repository: Systemsaholic/tailfire

Length of output: 388


🏁 Script executed:

# Look at instrumentation.ts to understand the Sentry setup
cat -n apps/admin/src/instrumentation.ts

# Check sentry config files
ls -la apps/admin/sentry*.config.ts
cat -n apps/admin/sentry.client.config.ts | head -50

Repository: Systemsaholic/tailfire

Length of output: 1353


🏁 Script executed:

# Check if there's a Sentry utility or hook for error capturing
find apps/admin/src -name '*.ts' -o -name '*.tsx' | xargs rg -l 'Sentry' | head -20

Repository: Systemsaholic/tailfire

Length of output: 164


🏁 Script executed:

# Check middleware for any Sentry usage patterns
cat -n apps/admin/src/middleware.ts | head -50

# Let's also check the exact implementation of the error handling in the page
sed -n '281,310p' apps/admin/src/app/settings/api-credentials/page.tsx
sed -n '337,343p' apps/admin/src/app/settings/api-credentials/page.tsx

# Check if there's any Sentry import in the page file
rg -n 'import.*Sentry\|from.*@sentry' apps/admin/src/app/settings/api-credentials/page.tsx

Repository: Systemsaholic/tailfire

Length of output: 3127


Add Sentry error capture to health status mutations and query failures.

The mutation catch block (lines 295–300) and the status-load error path (lines 338–341) only show toast/UI feedback without reporting to Sentry. Import @sentry/nextjs and call Sentry.captureException(error) in both paths, tagging each with NEXT_PUBLIC_SENTRY_ENVIRONMENT so dashboard outages are visible across sessions and environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/admin/src/app/settings/api-credentials/page.tsx` around lines 282 - 283,
Import Sentry from '@sentry/nextjs' and, inside the health-check mutation catch
block in useTriggerHealthCheck (the code path handling the mutation failure)
call Sentry.captureException(error) and add the environment tag using
process.env.NEXT_PUBLIC_SENTRY_ENVIRONMENT; likewise, in the status-load error
branch where useApiHealthStatus returns an error (the UI/toast-only path
handling providers/isLoading/error), call Sentry.captureException(error) and set
the same environment tag so both mutation failures and query errors are reported
to Sentry with the NEXT_PUBLIC_SENTRY_ENVIRONMENT context.

Systemsaholic and others added 3 commits March 21, 2026 11:09
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tialResolverService

AerodataboxController injects CredentialResolverService (for Amadeus
airport keyword search), but ApiCredentialsModule is not @global() so
it needs explicit import. Crashes on Railway with:
"Nest can't resolve dependencies of the AerodataboxController"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/superpowers/plans/2026-03-21-add-trip-to-group.md`:
- Around line 22-63: Add a mutual-exclusion check so requests with both
tripGroupId and ungrouped are rejected with HTTP 400: in the code path that
processes TripFilterDto (e.g., in trips.service.ts before the existing
tripGroupId/ungrouped filter logic), add an explicit check if
(filters.tripGroupId && filters.ungrouped) throw new
BadRequestException('tripGroupId and ungrouped cannot both be set'); and import
BadRequestException from `@nestjs/common`; alternatively you can add an equivalent
class-validator rule on TripFilterDto (trip-filter.dto.ts / TripFilterDto) that
fails validation when both fields are present, but ensure the check returns 400.
- Around line 549-557: handleTripCreated currently only invalidates
['tripGroups']; update it to invalidate the four keys required by the design
spec: the group-scoped trip caches and the global trips list. Specifically, call
queryClient.invalidateQueries for tripGroupKeys.trips(groupId),
tripGroupKeys.summary(groupId), tripGroupKeys.travelers(groupId), and
tripKeys.all; if the factory helpers (tripGroupKeys.trips/summary/travelers)
don't exist yet, construct the keys as arrays (e.g. [...tripGroupKeys.all,
'trips', groupId], [...tripGroupKeys.all, 'summary', groupId],
[...tripGroupKeys.all, 'travelers', groupId]) and invalidate those plus
tripKeys.all, then keep the toast behavior unchanged.

In `@docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md`:
- Around line 79-86: Specify deterministic behavior when both tripGroupId and
ungrouped are provided: update the spec to require rejecting requests that
include both a non-null tripGroupId and ungrouped=true with a 400 Bad Request
and a clear error message. Reference TripFilterDto (add validation rule), the
API DTO apps/api/src/trips/dto/trip-filter.dto.ts (validate and throw/bubble
BadRequest when tripGroupId && ungrouped === true), and trips.service.ts
findAll() (ensure it does not silently prefer one filter; rely on DTO
validation), and note that apps/admin/src/hooks/use-trips.ts should avoid
sending both parameters or handle the 400 by surfacing the error to the UI.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 486d4622-cfa9-4fc2-8566-46d287742129

📥 Commits

Reviewing files that changed from the base of the PR and between 9254bae and b9f9ae5.

📒 Files selected for processing (4)
  • apps/admin/src/app/settings/api-credentials/page.tsx
  • apps/api/src/external-apis/providers/aerodatabox/aerodatabox.module.ts
  • docs/superpowers/plans/2026-03-21-add-trip-to-group.md
  • docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/admin/src/app/settings/api-credentials/page.tsx

Comment on lines +22 to +63
- [ ] **Step 1: Add `ungrouped` to shared TripFilterDto**

In `packages/shared-types/src/api/trips.types.ts`, add after `tripGroupId?: string` (line 273):

```typescript
ungrouped?: boolean // Filter to trips not in any group
```

- [ ] **Step 2: Add `ungrouped` to API TripFilterDto**

In `apps/api/src/trips/dto/trip-filter.dto.ts`, add after the `tripGroupId` field (after line 84):

```typescript
@IsOptional()
@Transform(({ value }) => value === 'true' || value === true)
@IsBoolean()
ungrouped?: boolean
```

- [ ] **Step 3: Apply `ungrouped` filter in trips.service.ts**

In `apps/api/src/trips/trips.service.ts`, add after the `tripGroupId` filter block (after line 340):

```typescript
// Ungrouped filter — trips not in any group
if (filters.ungrouped) {
conditions.push(isNull(this.db.schema.trips.tripGroupId))
}
```

Ensure `isNull` is imported from `drizzle-orm` (check existing imports at the top of the file).

- [ ] **Step 4: Pass `ungrouped` through in useTrips hook**

In `apps/admin/src/hooks/use-trips.ts`, add `ungrouped` to the `buildQueryString` call inside `useTrips()` (around line 43-54):

```typescript
const query = buildQueryString({
// ... existing fields ...
ungrouped: filters.ungrouped,
})
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add explicit validation for mutually exclusive filters in Task 1.

The plan currently allows implementing tripGroupId and ungrouped independently, but does not require rejecting conflicting input. Please add a DTO/service validation step that returns 400 when both are provided.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/superpowers/plans/2026-03-21-add-trip-to-group.md` around lines 22 - 63,
Add a mutual-exclusion check so requests with both tripGroupId and ungrouped are
rejected with HTTP 400: in the code path that processes TripFilterDto (e.g., in
trips.service.ts before the existing tripGroupId/ungrouped filter logic), add an
explicit check if (filters.tripGroupId && filters.ungrouped) throw new
BadRequestException('tripGroupId and ungrouped cannot both be set'); and import
BadRequestException from `@nestjs/common`; alternatively you can add an equivalent
class-validator rule on TripFilterDto (trip-filter.dto.ts / TripFilterDto) that
fails validation when both fields are present, but ensure the check returns 400.

Comment on lines +549 to +557
```typescript
const handleTripCreated = (newTrip: TripResponseDto) => {
queryClient.invalidateQueries({ queryKey: ['tripGroups'] })
toast({
title: 'Trip created',
description: `${newTrip.name} has been added to this group.`,
})
}
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify query-key factories and current invalidation coverage used by trip/group flows.

set -euo pipefail

echo "== tripGroupKeys definitions =="
rg -n -C3 "tripGroupKeys" apps/admin/src --type ts --type tsx

echo
echo "== tripKeys definitions =="
rg -n -C3 "tripKeys" apps/admin/src --type ts --type tsx

echo
echo "== Invalidation calls in useAddTripsToGroup and related hooks =="
rg -n -C3 "invalidateQueries\\(" apps/admin/src/hooks/use-trips.ts --type ts

echo
echo "== Group page invalidation usage =="
rg -n -C3 "invalidateQueries\\(" apps/admin/src/app/trips/groups/[groupId]/page.tsx --type tsx

Repository: Systemsaholic/tailfire

Length of output: 142


🏁 Script executed:

#!/bin/bash
# Verify query-key factories and current invalidation coverage used by trip/group flows.

set -euo pipefail

echo "== Search for tripGroupKeys definitions =="
rg -n -C3 "tripGroupKeys" apps/admin/src

echo
echo "== Search for tripKeys definitions =="
rg -n -C3 "tripKeys" apps/admin/src

echo
echo "== Search for handleTripCreated function =="
rg -n -C5 "handleTripCreated" apps/admin/src

echo
echo "== Search for invalidateQueries in hooks =="
rg -n -B2 -A5 "invalidateQueries" apps/admin/src/hooks

echo
echo "== Listing files in hooks directory =="
ls -la apps/admin/src/hooks/

Repository: Systemsaholic/tailfire

Length of output: 28459


🏁 Script executed:

#!/bin/bash

# Search for handleTripCreated function implementation
echo "== Search for handleTripCreated in entire codebase =="
rg -n "handleTripCreated" apps/

echo
echo "== Search for useAddTripsToGroup context =="
rg -n -B5 -A10 "useAddTripsToGroup" apps/admin/src

echo
echo "== Check the doc file mentioned in review =="
head -n 560 docs/superpowers/plans/2026-03-21-add-trip-to-group.md | tail -n 30

Repository: Systemsaholic/tailfire

Length of output: 2786


🏁 Script executed:

#!/bin/bash

# Compare useCreateTrip vs useAddTripsToGroup invalidation patterns
echo "== useCreateTrip invalidation pattern =="
rg -n -A8 "export function useCreateTrip" apps/admin/src/hooks/use-trips.ts

echo
echo "== Compare with useAddTripsToGroup pattern (already shown) =="
echo "useAddTripsToGroup invalidates: tripGroupKeys.all + tripKeys.lists()"

echo
echo "== Check useTripsByGroup query key pattern =="
rg -n -A5 "export function useTripsByGroup" apps/admin/src/hooks/use-trips.ts

echo
echo "== Check doc context for design spec mentioned =="
grep -n "trips.*summary.*travelers" docs/superpowers/plans/2026-03-21-add-trip-to-group.md | head -5

echo
echo "== Look for query key spec in design doc =="
grep -n -i "design spec\|explicit keys\|invalidat" docs/superpowers/plans/2026-03-21-add-trip-to-group.md | head -10

Repository: Systemsaholic/tailfire

Length of output: 1364


🏁 Script executed:

#!/bin/bash

# Search the entire doc file for design spec or cache invalidation requirements
echo "== Search for design spec in doc =="
grep -n -i "design\|spec\|cache\|invalidat" docs/superpowers/plans/2026-03-21-add-trip-to-group.md

echo
echo "== Look for query key or cache key definitions in doc =="
grep -n -B3 -A3 "trips.*summary.*travelers\|query.*key\|cache.*key" docs/superpowers/plans/2026-03-21-add-trip-to-group.md | head -40

echo
echo "== Check lines around 549 for context =="
sed -n '530,570p' docs/superpowers/plans/2026-03-21-add-trip-to-group.md

Repository: Systemsaholic/tailfire

Length of output: 1678


🏁 Script executed:

#!/bin/bash

echo "== Check if design spec exists =="
ls -la docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md

echo
echo "== Read design spec for cache/invalidation requirements =="
cat docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md

Repository: Systemsaholic/tailfire

Length of output: 6108


🏁 Script executed:

#!/bin/bash

echo "== Full tripGroupKeys definition =="
rg -n -A20 "^const tripGroupKeys" apps/admin/src/hooks/use-trips.ts

echo
echo "== Check if tripGroupKeys.trips, summary, travelers methods exist =="
rg -n "tripGroupKeys\.(trips|summary|travelers)" apps/admin/src/hooks/use-trips.ts

echo
echo "== Verify the design spec explicitly requires these =="
grep -n "tripGroupKeys\.(trips|summary|travelers)" docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md

Repository: Systemsaholic/tailfire

Length of output: 836


handleTripCreated invalidation scope is incomplete for the design spec.

The design spec (§2: "Query Invalidation") explicitly requires invalidating four specific query keys after trip creation: tripGroupKeys.trips(groupId), tripGroupKeys.summary(groupId), tripGroupKeys.travelers(groupId), and tripKeys.all. The current handleTripCreated only invalidates ['tripGroups'], missing the global trips list and explicit group-scoped caches. Use the spec's invalidation list directly, or construct the keys as [...tripGroupKeys.all, 'trips', groupId] and similar for summary and travelers (the factory methods referenced in the spec don't yet exist in tripGroupKeys).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/superpowers/plans/2026-03-21-add-trip-to-group.md` around lines 549 -
557, handleTripCreated currently only invalidates ['tripGroups']; update it to
invalidate the four keys required by the design spec: the group-scoped trip
caches and the global trips list. Specifically, call
queryClient.invalidateQueries for tripGroupKeys.trips(groupId),
tripGroupKeys.summary(groupId), tripGroupKeys.travelers(groupId), and
tripKeys.all; if the factory helpers (tripGroupKeys.trips/summary/travelers)
don't exist yet, construct the keys as arrays (e.g. [...tripGroupKeys.all,
'trips', groupId], [...tripGroupKeys.all, 'summary', groupId],
[...tripGroupKeys.all, 'travelers', groupId]) and invalidate those plus
tripKeys.all, then keep the toast behavior unchanged.

Comment on lines +79 to +86
## 4. Backend: Add `ungrouped` Filter

**Files:**
- `packages/shared-types/src/api/trips.types.ts` — add `ungrouped?: boolean` to `TripFilterDto`
- `apps/api/src/trips/dto/trip-filter.dto.ts` — add `@IsOptional() @IsBoolean() ungrouped?: boolean`
- `apps/api/src/trips/trips.service.ts` — in `findAll()`, when `filters.ungrouped === true`, add `WHERE trip_group_id IS NULL`
- `apps/admin/src/hooks/use-trips.ts` — pass `ungrouped` through to the API query params

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Define precedence for tripGroupId + ungrouped to avoid contradictory filters.

The spec adds ungrouped but does not define behavior when callers send both tripGroupId and ungrouped=true. Please explicitly specify either (a) reject with 400, or (b) precedence rules, so backend + admin hooks stay deterministic.

Suggested spec patch
 ## 4. Backend: Add `ungrouped` Filter

 **Files:**
 - `packages/shared-types/src/api/trips.types.ts` — add `ungrouped?: boolean` to `TripFilterDto`
 - `apps/api/src/trips/dto/trip-filter.dto.ts` — add `@IsOptional() `@IsBoolean`() ungrouped?: boolean`
 - `apps/api/src/trips/trips.service.ts` — in `findAll()`, when `filters.ungrouped === true`, add `WHERE trip_group_id IS NULL`
 - `apps/admin/src/hooks/use-trips.ts` — pass `ungrouped` through to the API query params
+
+**Conflict rule:** `tripGroupId` and `ungrouped=true` are mutually exclusive.
+- If both are provided, return `400 Bad Request` with a validation error.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## 4. Backend: Add `ungrouped` Filter
**Files:**
- `packages/shared-types/src/api/trips.types.ts` — add `ungrouped?: boolean` to `TripFilterDto`
- `apps/api/src/trips/dto/trip-filter.dto.ts` — add `@IsOptional() @IsBoolean() ungrouped?: boolean`
- `apps/api/src/trips/trips.service.ts` — in `findAll()`, when `filters.ungrouped === true`, add `WHERE trip_group_id IS NULL`
- `apps/admin/src/hooks/use-trips.ts` — pass `ungrouped` through to the API query params
## 4. Backend: Add `ungrouped` Filter
**Files:**
- `packages/shared-types/src/api/trips.types.ts` — add `ungrouped?: boolean` to `TripFilterDto`
- `apps/api/src/trips/dto/trip-filter.dto.ts` — add `@IsOptional() `@IsBoolean`() ungrouped?: boolean`
- `apps/api/src/trips/trips.service.ts` — in `findAll()`, when `filters.ungrouped === true`, add `WHERE trip_group_id IS NULL`
- `apps/admin/src/hooks/use-trips.ts` — pass `ungrouped` through to the API query params
**Conflict rule:** `tripGroupId` and `ungrouped=true` are mutually exclusive.
- If both are provided, return `400 Bad Request` with a validation error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/superpowers/specs/2026-03-21-add-trip-to-group-design.md` around lines
79 - 86, Specify deterministic behavior when both tripGroupId and ungrouped are
provided: update the spec to require rejecting requests that include both a
non-null tripGroupId and ungrouped=true with a 400 Bad Request and a clear error
message. Reference TripFilterDto (add validation rule), the API DTO
apps/api/src/trips/dto/trip-filter.dto.ts (validate and throw/bubble BadRequest
when tripGroupId && ungrouped === true), and trips.service.ts findAll() (ensure
it does not silently prefer one filter; rely on DTO validation), and note that
apps/admin/src/hooks/use-trips.ts should avoid sending both parameters or handle
the 400 by surfacing the error to the UI.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/src/api-health/api-health.service.ts`:
- Around line 462-478: The checkCloudflareR2 method currently creates an
S3Client and calls client.send(new HeadBucketCommand(...)) but never destroys
the client if send throws; wrap the send call in a try-finally block so that
client.destroy() is always called (in finally) after using S3Client and
HeadBucketCommand, rethrow or let the original error propagate after cleanup;
reference the checkCloudflareR2 function and the S3Client/HeadBucketCommand
usage to locate where to add the try-finally.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f8764025-478b-48d3-a419-a18d44591195

📥 Commits

Reviewing files that changed from the base of the PR and between b9f9ae5 and 016a7f5.

📒 Files selected for processing (1)
  • apps/api/src/api-health/api-health.service.ts

Comment on lines +462 to +478
private async checkCloudflareR2(): Promise<void> {
const accountId = this.configService.get<string>('CLOUDFLARE_R2_ACCOUNT_ID')!
const accessKeyId = this.configService.get<string>('CLOUDFLARE_R2_ACCESS_KEY_ID')!
const secretAccessKey = this.configService.get<string>('CLOUDFLARE_R2_SECRET_ACCESS_KEY')!
const bucketName = this.configService.get<string>('CLOUDFLARE_R2_BUCKET_NAME')!

// Use AWS SDK S3 HeadBucket — import dynamically to avoid top-level import
const { S3Client, HeadBucketCommand } = await import('@aws-sdk/client-s3')
const client = new S3Client({
region: 'auto',
endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
credentials: { accessKeyId, secretAccessKey },
})

await client.send(new HeadBucketCommand({ Bucket: bucketName }))
client.destroy()
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

S3Client resource leak if HeadBucketCommand fails.

If client.send() throws, client.destroy() is never called, leaving the client's connections unreleased. Wrap in try-finally to ensure cleanup.

🐛 Proposed fix
   private async checkCloudflareR2(): Promise<void> {
     const accountId = this.configService.get<string>('CLOUDFLARE_R2_ACCOUNT_ID')!
     const accessKeyId = this.configService.get<string>('CLOUDFLARE_R2_ACCESS_KEY_ID')!
     const secretAccessKey = this.configService.get<string>('CLOUDFLARE_R2_SECRET_ACCESS_KEY')!
     const bucketName = this.configService.get<string>('CLOUDFLARE_R2_BUCKET_NAME')!
 
     // Use AWS SDK S3 HeadBucket — import dynamically to avoid top-level import
     const { S3Client, HeadBucketCommand } = await import('@aws-sdk/client-s3')
     const client = new S3Client({
       region: 'auto',
       endpoint: `https://${accountId}.r2.cloudflarestorage.com`,
       credentials: { accessKeyId, secretAccessKey },
     })
 
-    await client.send(new HeadBucketCommand({ Bucket: bucketName }))
-    client.destroy()
+    try {
+      await client.send(new HeadBucketCommand({ Bucket: bucketName }))
+    } finally {
+      client.destroy()
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/src/api-health/api-health.service.ts` around lines 462 - 478, The
checkCloudflareR2 method currently creates an S3Client and calls client.send(new
HeadBucketCommand(...)) but never destroys the client if send throws; wrap the
send call in a try-finally block so that client.destroy() is always called (in
finally) after using S3Client and HeadBucketCommand, rethrow or let the original
error propagate after cleanup; reference the checkCloudflareR2 function and the
S3Client/HeadBucketCommand usage to locate where to add the try-finally.

@Systemsaholic Systemsaholic merged commit dc516fd into main Mar 21, 2026
4 checks passed
@Systemsaholic Systemsaholic deleted the feature/api-health-dashboard branch March 23, 2026 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant