feat: maintenance observability — Monitor page (#37)#147
Conversation
… events Wire progress tracking into all 25 maintenance subtasks with start/completed/failed status emissions. Add ProgressCallback + CompletionSummary types. New REST endpoints: - GET /api/maintenance/status — current state with subtask list - GET /api/maintenance/logs — ring buffer log entries filtered by component SSE events broadcast via existing broadcaster: - maintenance_progress: per-subtask state changes - maintenance_complete: run summary with merged/archived/pruned counts RunNow returns 409 Conflict if already running. Double-entry guard on runMaintenance prevents concurrent execution.
New MonitorView combines two tabs: - Maintenance: status panel with progress bar, subtask grid (25 tasks), Run Now button, auto-refresh toggle, and filterable maintenance logs - Server Logs: full port of existing LogsView functionality Rename sidebar "Logs" → "Monitor" with /logs → /monitor redirect for backward compatibility. New composable useMaintenance.ts watches SSE events for real-time maintenance_progress and maintenance_complete updates. New API types: MaintenanceStatus, MaintenanceSubtask, MaintenanceLogEntry with fetchMaintenanceStatus and fetchMaintenanceLogs functions.
|
Caution Review failedPull request was closed or merged during review WalkthroughДобавлена подсистема мониторинга обслуживания: колбэки прогресса/завершения в сервисе maintenance, новые REST-эндпойнты для статуса и логов, SSE-распространение событий, UI-страница Monitor с заменой Logs и клиентский composable для управления и просмотра. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant UI as Monitor View
participant API as Worker API
participant WorkerSvc as Worker Service
participant MaintSvc as Maintenance Service
participant SSE as SSE Broker
User->>UI: Нажимает "Run Now"
UI->>API: POST /api/maintenance/run
API->>WorkerSvc: handleRunMaintenance
alt maintenanceRunning == true
WorkerSvc-->>UI: HTTP 409 Conflict
else
WorkerSvc->>MaintSvc: RunNow(ctx)
activate MaintSvc
MaintSvc->>MaintSvc: set maintenanceRunning=true
loop по подзадачам
MaintSvc->>MaintSvc: emitProgress(started)
MaintSvc->>WorkerSvc: OnProgress callback
WorkerSvc->>SSE: broadcast maintenance_progress
SSE-->>UI: SSE событие
UI->>UI: обновить статус/прогресс
MaintSvc->>MaintSvc: выполнить подзадачу
MaintSvc->>MaintSvc: emitProgress(completed/failed)
MaintSvc->>WorkerSvc: OnProgress callback
WorkerSvc->>SSE: broadcast maintenance_progress
SSE-->>UI: SSE событие
end
MaintSvc->>WorkerSvc: OnComplete(CompletionSummary)
WorkerSvc->>SSE: broadcast maintenance_complete
SSE-->>UI: SSE событие
MaintSvc->>MaintSvc: set maintenanceRunning=false
deactivate MaintSvc
WorkerSvc-->>UI: HTTP 200 OK
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 1 | ❌ 4❌ Failed checks (4 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a maintenance monitoring system, including a new 'Monitor' view in the UI, SSE-based progress tracking, and API endpoints for status and logs. The backend changes add progress callbacks and state tracking to the maintenance service. The review feedback identifies several opportunities to improve the robustness of this state tracking, such as preserving progress data after a run completes, ensuring the service context is used for lifecycle management, and refactoring the fragile manual task indexing to prevent synchronization issues between the execution loop and the subtask definitions.
| maintenanceRunning bool | ||
| currentSubtask string | ||
| currentSubtaskIndex int | ||
| subtaskTotal int | ||
| OnProgress ProgressCallback | ||
| OnComplete func(CompletionSummary) |
There was a problem hiding this comment.
The Service struct should store the status of the current subtask to allow the handleMaintenanceStatus API to report accurate state (e.g., distinguishing between a task that is still running and one that has completed or failed).
maintenanceRunning bool
currentSubtask string
currentSubtaskIndex int
subtaskTotal int
currentSubtaskStatus string
OnProgress ProgressCallback
OnComplete func(CompletionSummary)| defer func() { | ||
| s.mu.Lock() | ||
| s.maintenanceRunning = false | ||
| s.currentSubtask = "" | ||
| s.currentSubtaskIndex = 0 | ||
| s.mu.Unlock() | ||
| }() |
There was a problem hiding this comment.
Resetting currentSubtaskIndex and currentSubtask to zero/empty in the defer block causes the Monitor page to lose all progress information immediately after a maintenance run completes. By preserving these values, the UI can continue to show the results of the last run while the service is idle.
defer func() {
s.mu.Lock()
s.maintenanceRunning = false
s.mu.Unlock()
}()|
|
||
| // Task 1: Clean up old observations by age | ||
| taskIdx++ | ||
| s.emitProgress(subtasks[taskIdx-1], taskIdx, total, "started", "") |
There was a problem hiding this comment.
The manual management of taskIdx and the dependency on the hardcoded list in SubtaskNames() is fragile and prone to maintainability issues. If a task is added or removed in runMaintenance without perfectly synchronizing SubtaskNames, the progress reporting will be incorrect or the goroutine may panic due to an out-of-bounds access. Consider refactoring this to use a slice of task definitions (name + function) to drive the execution loop.
| func (s *Service) CurrentProgress() (currentIndex, total int) { | ||
| s.mu.Lock() | ||
| defer s.mu.Unlock() | ||
| return s.currentSubtaskIndex, s.subtaskTotal | ||
| } |
There was a problem hiding this comment.
Update this getter to return the current subtask status so the API handler can report it correctly.
| func (s *Service) CurrentProgress() (currentIndex, total int) { | |
| s.mu.Lock() | |
| defer s.mu.Unlock() | |
| return s.currentSubtaskIndex, s.subtaskTotal | |
| } | |
| // CurrentProgress returns the 1-based index of the current subtask, the total count, and the current status. | |
| func (s *Service) CurrentProgress() (currentIndex, total int, status string) { | |
| s.mu.Lock() | |
| defer s.mu.Unlock() | |
| return s.currentSubtaskIndex, s.subtaskTotal, s.currentSubtaskStatus | |
| } |
| func (s *Service) emitProgress(subtask string, index, total int, status, message string) { | ||
| s.mu.Lock() | ||
| if status == "started" { | ||
| s.currentSubtask = subtask | ||
| s.currentSubtaskIndex = index | ||
| s.subtaskTotal = total | ||
| } | ||
| s.mu.Unlock() |
There was a problem hiding this comment.
The service state should be updated on every progress event, not just when a task starts. This ensures that the CurrentProgress getter returns the most up-to-date information, including whether the current task has completed or failed.
| func (s *Service) emitProgress(subtask string, index, total int, status, message string) { | |
| s.mu.Lock() | |
| if status == "started" { | |
| s.currentSubtask = subtask | |
| s.currentSubtaskIndex = index | |
| s.subtaskTotal = total | |
| } | |
| s.mu.Unlock() | |
| // emitProgress updates the current subtask state and fires the OnProgress callback. | |
| func (s *Service) emitProgress(subtask string, index, total int, status, message string) { | |
| s.mu.Lock() | |
| s.currentSubtask = subtask | |
| s.currentSubtaskIndex = index | |
| s.subtaskTotal = total | |
| s.currentSubtaskStatus = status | |
| s.mu.Unlock() |
|
|
||
| // Use background context: the request context is cancelled after the | ||
| // response is sent, which would prematurely abort the background job. | ||
| s.maintenanceService.RunNow(context.Background()) |
There was a problem hiding this comment.
| stats := s.maintenanceService.Stats() | ||
| isRunning := s.maintenanceService.IsRunning() | ||
| currentSubtask := s.maintenanceService.CurrentSubtask() | ||
| currentIdx, total := s.maintenanceService.CurrentProgress() |
| if isRunning { | ||
| if i < currentIdx-1 { | ||
| status = "completed" | ||
| } else if i == currentIdx-1 { | ||
| status = "running" | ||
| } | ||
| } |
There was a problem hiding this comment.
The status reporting logic should use the actual status of the current subtask and handle the idle state (after a run completes) to show the results of the last run instead of resetting everything to 'pending'.
if isRunning {
if i < currentIdx-1 {
status = "completed"
} else if i == currentIdx-1 {
status = currentStatus
if status == "started" {
status = "running"
}
}
} else if currentIdx > 0 {
status = "completed"
}|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/maintenance/service.go`:
- Around line 1080-1088: RunNow currently unlocks s.mu before marking the run as
active, allowing two concurrent callers to both see maintenanceRunning==false;
change RunNow so it takes s.mu, checks s.maintenanceRunning, and if false
immediately sets s.maintenanceRunning = true while still holding the lock, then
unlocks and starts the goroutine; keep using s.runMaintenance(ctx) but ensure
the goroutine resets s.maintenanceRunning = false under s.mu when finished. This
makes the check-and-set atomic using the existing s.mu, and references
Service.RunNow, s.mu, s.maintenanceRunning and s.runMaintenance.
- Around line 333-335: The emitProgress calls that report skipped branches are
incorrectly using status "completed"; update all s.emitProgress invocations in
this method (e.g., the call using subtasks[taskIdx-1], taskIdx, total with
message "skipped (retention disabled)" and the similar calls at the other
occurrences referenced) to pass status "skipped" instead of "completed" so
SSE/API consumers can distinguish skipped vs truly completed subtasks; search
for s.emitProgress(...) in this method and replace the status string "completed"
with "skipped" in the three spots indicated (around the shown diff and the
blocks at the other two locations).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7628689a-f1d4-46ed-9cef-db5e1c9f5f2a
📒 Files selected for processing (9)
internal/maintenance/service.gointernal/worker/handlers_maintenance.gointernal/worker/service.goplugin/engram/.claude-plugin/plugin.jsonui/src/components/layout/AppSidebar.vueui/src/composables/useMaintenance.tsui/src/router/index.tsui/src/utils/api.tsui/src/views/MonitorView.vue
…acking, skipped state - Add currentSubtaskStatus field for accurate progress reporting - Make RunNow atomic (check-and-set under single lock) — fixes race condition where concurrent calls could both pass the maintenanceRunning guard - emitProgress now updates all state fields on every call, not just on "started" - CurrentProgress returns 3 values (index, total, status) for precise API reporting - Preserve subtask state after run completes (don't reset in defer) so Monitor page shows last run results while idle - Change 21 "skipped" task emissions from status "completed" to "skipped" - Use s.ctx instead of context.Background() for API-triggered runs - Status handler shows "completed" for all tasks when idle after a run
Summary
maintenance_progress,maintenance_complete)GET /api/maintenance/status(subtask list + progress),GET /api/maintenance/logs(ring buffer entries)POST /api/maintenance/runreturns 409 if already running (concurrent guard)/logs→/monitorredirectCloses #37
Backend changes
internal/maintenance/service.go:ProgressCallback,CompletionSummarytypes;emitProgress()wraps all 25 subtasks with started/completed/failed emissions;IsRunning(),CurrentSubtask(),CurrentProgress()methods;RunNow()returns bool (false if already running)internal/worker/handlers_maintenance.go:handleMaintenanceStatus,handleMaintenanceLogshandlersinternal/worker/service.go: SSE callback wiring, new routesFrontend changes
ui/src/utils/api.ts:MaintenanceStatus,MaintenanceSubtask,MaintenanceLogEntrytypes + fetch functionsui/src/composables/useMaintenance.ts: SSE event watcher, auto-refresh polling, trigger actionui/src/views/MonitorView.vue: two-tab layout with status panel, progress bar, subtask grid, log viewer/monitorroute,/logsredirectTest plan
go build ./...cleango test ./internal/maintenance/... -count=1passesnpm --prefix ui run buildclean/#/monitor— see Maintenance tab with status panel/#/logs— redirects to/#/monitorSummary by CodeRabbit
Новые возможности
Исправления