feat: maintenance observability — Monitor page (#37)#147

Merged
thebtf merged 4 commits into main from feat/maintenance-observability
Apr 12, 2026

@thebtf
Owner

@thebtf thebtf commented Apr 12, 2026

Summary

  • Add real-time maintenance progress tracking with SSE events (maintenance_progress, maintenance_complete)
  • New REST endpoints: GET /api/maintenance/status (subtask list + progress), GET /api/maintenance/logs (ring buffer entries)
  • POST /api/maintenance/run returns 409 if already running (concurrent guard)
  • New Monitor page with two tabs: Maintenance (status panel + progress bar + subtask grid + log viewer) and Server Logs (port of existing LogsView)
  • Rename sidebar Logs → Monitor with /logs → /monitor redirect
  • Plugin version bump 3.7.8 → 3.7.9

Closes #37

Backend changes

  • internal/maintenance/service.go: ProgressCallback, CompletionSummary types; emitProgress() wraps all 25 subtasks with started/completed/failed emissions; IsRunning(), CurrentSubtask(), CurrentProgress() methods; RunNow() returns bool (false if already running)
  • internal/worker/handlers_maintenance.go: handleMaintenanceStatus, handleMaintenanceLogs handlers
  • internal/worker/service.go: SSE callback wiring, new routes
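
For readers unfamiliar with the callback wiring, here is a minimal sketch of how the pieces listed above could fit together. The names ProgressCallback, CompletionSummary, OnProgress, OnComplete and emitProgress come from this PR; the callback signature, the CompletionSummary field names and the subtask name are assumptions made for illustration, not the code in internal/maintenance/service.go.

```go
package main

import (
	"fmt"
	"sync"
)

// ProgressCallback is invoked on every subtask state change.
// The parameter list mirrors emitProgress; the real callback
// signature is an assumption.
type ProgressCallback func(subtask string, index, total int, status, message string)

// CompletionSummary carries the merged/archived/pruned counts
// mentioned in the PR. Field names are hypothetical.
type CompletionSummary struct {
	Merged, Archived, Pruned int
}

// Service holds only the fields this sketch needs.
type Service struct {
	mu             sync.Mutex
	currentSubtask string
	currentIndex   int
	subtaskTotal   int
	OnProgress     ProgressCallback
	OnComplete     func(CompletionSummary)
}

// emitProgress records the current subtask, then notifies the listener.
// The callback runs outside the lock so a slow listener cannot block
// readers of the state.
func (s *Service) emitProgress(subtask string, index, total int, status, message string) {
	s.mu.Lock()
	s.currentSubtask = subtask
	s.currentIndex = index
	s.subtaskTotal = total
	cb := s.OnProgress
	s.mu.Unlock()
	if cb != nil {
		cb(subtask, index, total, status, message)
	}
}

func main() {
	s := &Service{}
	s.OnProgress = func(subtask string, index, total int, status, message string) {
		fmt.Printf("[%d/%d] %s: %s\n", index, total, subtask, status)
	}
	// "example_subtask" is a placeholder name, not one of the real 25 subtasks.
	s.emitProgress("example_subtask", 1, 25, "started", "")
	s.emitProgress("example_subtask", 1, 25, "completed", "")
}
```

In the PR the worker service is what sets OnProgress and OnComplete and forwards each call to the SSE broadcaster.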

Frontend changes

  • ui/src/utils/api.ts: MaintenanceStatus, MaintenanceSubtask, MaintenanceLogEntry types + fetch functions
  • ui/src/composables/useMaintenance.ts: SSE event watcher, auto-refresh polling, trigger action
  • ui/src/views/MonitorView.vue: two-tab layout with status panel, progress bar, subtask grid, log viewer
  • Router + sidebar: /monitor route, /logs redirect

Test plan

  • go build ./... clean
  • go test ./internal/maintenance/... -count=1 passes
  • npm --prefix ui run build clean
  • Open /#/monitor — see Maintenance tab with status panel
  • Click "Run Now" — progress bar updates, subtask grid shows green checkmarks
  • Switch to "Server Logs" tab — see real-time log stream
  • Visit /#/logs — redirects to /#/monitor

Summary by CodeRabbit

  • New Features

    • Added a Monitor page with a maintenance dashboard and a Server Logs tab.
    • Maintenance panel: shows the current subtask, progress, past and upcoming runs, and a "Run Now" button.
    • Built-in real-time update streams for progress and completion; recent logs can be viewed with search/filtering and auto-refresh.
  • Bug Fixes

    • Concurrent maintenance runs are blocked; an attempt to start a run while one is in progress returns a conflict.

thebtf added 3 commits April 13, 2026 00:47
… events

Wire progress tracking into all 25 maintenance subtasks with start/completed/failed
status emissions. Add ProgressCallback + CompletionSummary types.

New REST endpoints:
- GET /api/maintenance/status — current state with subtask list
- GET /api/maintenance/logs — ring buffer log entries filtered by component

SSE events broadcast via existing broadcaster:
- maintenance_progress: per-subtask state changes
- maintenance_complete: run summary with merged/archived/pruned counts
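
As a rough illustration of what one of these events looks like on the wire, the sketch below renders a standard Server-Sent Events frame. The event name is from this commit; the payload fields and the formatSSE helper are assumptions, since the existing broadcaster is not shown in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// progressEvent is a hypothetical payload shape for maintenance_progress.
type progressEvent struct {
	Subtask string `json:"subtask"`
	Index   int    `json:"index"`
	Total   int    `json:"total"`
	Status  string `json:"status"`
}

// formatSSE renders one SSE frame: an "event:" line, a "data:" line
// holding the JSON payload, and a blank line that terminates the frame.
func formatSSE(event string, payload any) (string, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data), nil
}

func main() {
	frame, err := formatSSE("maintenance_progress", progressEvent{
		Subtask: "example_subtask", Index: 3, Total: 25, Status: "started",
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(frame)
}
```

A browser EventSource would receive this as an event named maintenance_progress, which is what a client-side listener for that event name would react to.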

RunNow returns 409 Conflict if already running. Double-entry guard on
runMaintenance prevents concurrent execution.
New MonitorView combines two tabs:
- Maintenance: status panel with progress bar, subtask grid (25 tasks),
  Run Now button, auto-refresh toggle, and filterable maintenance logs
- Server Logs: full port of existing LogsView functionality

Rename sidebar "Logs" → "Monitor" with /logs → /monitor redirect for
backward compatibility.

New composable useMaintenance.ts watches SSE events for real-time
maintenance_progress and maintenance_complete updates.

New API types: MaintenanceStatus, MaintenanceSubtask, MaintenanceLogEntry
with fetchMaintenanceStatus and fetchMaintenanceLogs functions.
@coderabbitai

coderabbitai Bot commented Apr 12, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Added a maintenance monitoring subsystem: progress/completion callbacks in the maintenance service, new REST endpoints for status and logs, SSE event distribution, a Monitor UI page that replaces Logs, and a client composable for control and viewing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core maintenance service: internal/maintenance/service.go | Added exported types ProgressCallback and CompletionSummary; fields OnProgress and OnComplete; run state (maintenanceRunning, current subtask and index); RunNow(ctx) bool prevents concurrent runs; progress/completion emission; new inspection methods IsRunning(), CurrentSubtask(), CurrentProgress(), SubtaskNames(). |
| Worker HTTP handlers: internal/worker/handlers_maintenance.go | handleRunMaintenance now checks IsRunning() and returns HTTP 409 on conflict; added handleMaintenanceStatus and handleMaintenanceLogs (paginated/limited logs). |
| Worker service wiring / SSE: internal/worker/service.go | OnProgress/OnComplete wired to the SSE broadcast (maintenance_progress, maintenance_complete); routes /api/maintenance/status and /api/maintenance/logs registered. |
| UI router & sidebar: ui/src/router/index.ts, ui/src/components/layout/AppSidebar.vue | Added the /monitor route to MonitorView.vue; the /logs route redirects to /monitor; the sidebar item logs → monitor was replaced. |
| Monitor view & composable: ui/src/views/MonitorView.vue, ui/src/composables/useMaintenance.ts | New MonitorView with Maintenance and Server Logs tabs; composable useMaintenance() for loading status/logs, SSE subscriptions, auto-refresh, and the run trigger. |
| API types / client fetchers: ui/src/utils/api.ts | Added types MaintenanceSubtask, MaintenanceLastRun, MaintenanceStatus, MaintenanceLogEntry; functions fetchMaintenanceStatus() and fetchMaintenanceLogs(). |
| Plugin manifest: plugin/engram/.claude-plugin/plugin.json | Plugin version bumped 3.7.8 → 3.7.9. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant UI as Monitor View
    participant API as Worker API
    participant WorkerSvc as Worker Service
    participant MaintSvc as Maintenance Service
    participant SSE as SSE Broker

    User->>UI: Clicks "Run Now"
    UI->>API: POST /api/maintenance/run
    API->>WorkerSvc: handleRunMaintenance
    alt maintenanceRunning == true
        WorkerSvc-->>UI: HTTP 409 Conflict
    else
        WorkerSvc->>MaintSvc: RunNow(ctx)
        activate MaintSvc
        MaintSvc->>MaintSvc: set maintenanceRunning=true
        loop over subtasks
            MaintSvc->>MaintSvc: emitProgress(started)
            MaintSvc->>WorkerSvc: OnProgress callback
            WorkerSvc->>SSE: broadcast maintenance_progress
            SSE-->>UI: SSE event
            UI->>UI: update status/progress
            MaintSvc->>MaintSvc: run subtask
            MaintSvc->>MaintSvc: emitProgress(completed/failed)
            MaintSvc->>WorkerSvc: OnProgress callback
            WorkerSvc->>SSE: broadcast maintenance_progress
            SSE-->>UI: SSE event
        end
        MaintSvc->>WorkerSvc: OnComplete(CompletionSummary)
        WorkerSvc->>SSE: broadcast maintenance_complete
        SSE-->>UI: SSE event
        MaintSvc->>MaintSvc: set maintenanceRunning=false
        deactivate MaintSvc
        WorkerSvc-->>UI: HTTP 200 OK
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

🐰
I left my burrow and peeked into the code,
Where progress runs like a brand-new tail.
SSE sends me a whistle, I rattle my carrot,
The tasks have run by, all neatly under control,
Let the monitor shine, I hop with joy!

🚥 Pre-merge checks | ✅ 1 | ❌ 4

❌ Failed checks (4 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | PR title mentions Monitor page observability feature, but linked issue #37 is about fixing orphan vector purge with database migration. | Update PR title to reflect the primary change: fix orphan vector purge or clarify which issue this PR addresses. |
| Linked Issues check | ⚠️ Warning | PR summary describes maintenance observability feature, but linked issue #37 requires correcting a database migration for vector purging. No database migration changes found in the provided summaries. | Either add the required database migration changes to correct the vector purge query, or update linked issue reference to match the actual maintenance observability work. |
| Out of Scope Changes check | ⚠️ Warning | All changes in PR are related to maintenance observability (Monitor page, SSE events, status endpoints), which appears unrelated to the linked issue #37 about fixing orphan vector purge migrations. | Clarify PR scope: either implement the vector purge migration fix or relink this PR to issues covering the observability feature work. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions



@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a maintenance monitoring system, including a new 'Monitor' view in the UI, SSE-based progress tracking, and API endpoints for status and logs. The backend changes add progress callbacks and state tracking to the maintenance service. The review feedback identifies several opportunities to improve the robustness of this state tracking, such as preserving progress data after a run completes, ensuring the service context is used for lifecycle management, and refactoring the fragile manual task indexing to prevent synchronization issues between the execution loop and the subtask definitions.

Comment on lines +69 to +74
	maintenanceRunning  bool
	currentSubtask      string
	currentSubtaskIndex int
	subtaskTotal        int
	OnProgress          ProgressCallback
	OnComplete          func(CompletionSummary)

medium

The Service struct should store the status of the current subtask to allow the handleMaintenanceStatus API to report accurate state (e.g., distinguishing between a task that is still running and one that has completed or failed).

	maintenanceRunning         bool
	currentSubtask             string
	currentSubtaskIndex        int
	subtaskTotal               int
	currentSubtaskStatus       string
	OnProgress                 ProgressCallback
	OnComplete                 func(CompletionSummary)

Comment on lines +297 to +303
	defer func() {
		s.mu.Lock()
		s.maintenanceRunning = false
		s.currentSubtask = ""
		s.currentSubtaskIndex = 0
		s.mu.Unlock()
	}()

medium

Resetting currentSubtaskIndex and currentSubtask to zero/empty in the defer block causes the Monitor page to lose all progress information immediately after a maintenance run completes. By preserving these values, the UI can continue to show the results of the last run while the service is idle.

	defer func() {
		s.mu.Lock()
		s.maintenanceRunning = false
		s.mu.Unlock()
	}()


// Task 1: Clean up old observations by age
taskIdx++
s.emitProgress(subtasks[taskIdx-1], taskIdx, total, "started", "")

medium

The manual management of taskIdx and the dependency on the hardcoded list in SubtaskNames() is fragile and prone to maintainability issues. If a task is added or removed in runMaintenance without perfectly synchronizing SubtaskNames, the progress reporting will be incorrect or the goroutine may panic due to an out-of-bounds access. Consider refactoring this to use a slice of task definitions (name + function) to drive the execution loop.
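
One way the suggested refactor could look, as a sketch only: each task carries its own name and function, so the name list and the execution order cannot drift apart and the index is derived from the loop. The task type, runAll, and the task names are hypothetical, and continuing after a failed task is an assumption about the desired behaviour.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// task pairs a display name with the function that performs it.
type task struct {
	name string
	run  func(ctx context.Context) error
}

// runAll executes tasks in order, reports progress through emit, and
// returns the number of failed tasks. The index is computed from the
// loop, so no manual taskIdx bookkeeping is needed.
func runAll(ctx context.Context, tasks []task, emit func(name string, index, total int, status, message string)) int {
	failed := 0
	total := len(tasks)
	for i, t := range tasks {
		idx := i + 1 // 1-based, matching the index convention in this PR
		emit(t.name, idx, total, "started", "")
		if err := t.run(ctx); err != nil {
			failed++
			emit(t.name, idx, total, "failed", err.Error())
			continue
		}
		emit(t.name, idx, total, "completed", "")
	}
	return failed
}

// demo runs two placeholder tasks and collects the emitted events.
func demo() (events []string, failed int) {
	tasks := []task{
		{"first_task", func(context.Context) error { return nil }},
		{"second_task", func(context.Context) error { return errors.New("db unavailable") }},
	}
	failed = runAll(context.Background(), tasks, func(name string, index, total int, status, message string) {
		events = append(events, fmt.Sprintf("%d/%d %s %s", index, total, name, status))
	})
	return events, failed
}

func main() {
	events, failed := demo()
	for _, e := range events {
		fmt.Println(e)
	}
	fmt.Println("failed:", failed)
}
```

With this shape, SubtaskNames() could be derived from the same slice instead of being maintained by hand.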

Comment thread internal/maintenance/service.go Outdated
Comment on lines +1106 to +1110
func (s *Service) CurrentProgress() (currentIndex, total int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.currentSubtaskIndex, s.subtaskTotal
}

medium

Update this getter to return the current subtask status so the API handler can report it correctly.

Suggested change
- func (s *Service) CurrentProgress() (currentIndex, total int) {
- 	s.mu.Lock()
- 	defer s.mu.Unlock()
- 	return s.currentSubtaskIndex, s.subtaskTotal
- }
+ // CurrentProgress returns the 1-based index of the current subtask, the total count, and the current status.
+ func (s *Service) CurrentProgress() (currentIndex, total int, status string) {
+ 	s.mu.Lock()
+ 	defer s.mu.Unlock()
+ 	return s.currentSubtaskIndex, s.subtaskTotal, s.currentSubtaskStatus
+ }

Comment on lines +1126 to +1133
func (s *Service) emitProgress(subtask string, index, total int, status, message string) {
	s.mu.Lock()
	if status == "started" {
		s.currentSubtask = subtask
		s.currentSubtaskIndex = index
		s.subtaskTotal = total
	}
	s.mu.Unlock()

medium

The service state should be updated on every progress event, not just when a task starts. This ensures that the CurrentProgress getter returns the most up-to-date information, including whether the current task has completed or failed.

Suggested change
- func (s *Service) emitProgress(subtask string, index, total int, status, message string) {
- 	s.mu.Lock()
- 	if status == "started" {
- 		s.currentSubtask = subtask
- 		s.currentSubtaskIndex = index
- 		s.subtaskTotal = total
- 	}
- 	s.mu.Unlock()
+ // emitProgress updates the current subtask state and fires the OnProgress callback.
+ func (s *Service) emitProgress(subtask string, index, total int, status, message string) {
+ 	s.mu.Lock()
+ 	s.currentSubtask = subtask
+ 	s.currentSubtaskIndex = index
+ 	s.subtaskTotal = total
+ 	s.currentSubtaskStatus = status
+ 	s.mu.Unlock()

Comment thread internal/worker/handlers_maintenance.go Outdated

// Use background context: the request context is cancelled after the
// response is sent, which would prematurely abort the background job.
s.maintenanceService.RunNow(context.Background())

medium

Use the service context s.ctx instead of context.Background() to ensure that maintenance runs triggered via the API are gracefully cancelled when the worker service shuts down.

Suggested change
- s.maintenanceService.RunNow(context.Background())
+ s.maintenanceService.RunNow(s.ctx)
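
To show why the choice of context matters, here is a small sketch under stated assumptions: svcCtx stands in for the worker's s.ctx, and runSubtasks is a hypothetical loop that checks for cancellation between subtasks. With context.Background() the loop would never see a shutdown signal.

```go
package main

import (
	"context"
	"fmt"
)

// runSubtasks stops early once ctx is cancelled and returns how many
// subtasks it finished. Checking between subtasks is one simple way to
// honour cancellation; the real service may check elsewhere.
func runSubtasks(ctx context.Context, n int, work func(i int)) int {
	done := 0
	for i := 0; i < n; i++ {
		if ctx.Err() != nil {
			break
		}
		work(i)
		done++
	}
	return done
}

// demo cancels a service-scoped context during the second subtask,
// standing in for a worker shutdown.
func demo() int {
	svcCtx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return runSubtasks(svcCtx, 5, func(i int) {
		if i == 1 {
			cancel() // simulate shutdown while the second subtask runs
		}
	})
}

func main() {
	fmt.Println("subtasks finished before shutdown:", demo())
}
```

The request context is still the wrong choice for the reason given in the code comment above; the service context sits between the two, outliving the request but not the process.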

Comment thread internal/worker/handlers_maintenance.go Outdated
stats := s.maintenanceService.Stats()
isRunning := s.maintenanceService.IsRunning()
currentSubtask := s.maintenanceService.CurrentSubtask()
currentIdx, total := s.maintenanceService.CurrentProgress()

medium

Update the call to CurrentProgress to retrieve the current subtask status.

Suggested change
- currentIdx, total := s.maintenanceService.CurrentProgress()
+ currentIdx, total, currentStatus := s.maintenanceService.CurrentProgress()

Comment on lines +530 to +536
		if isRunning {
			if i < currentIdx-1 {
				status = "completed"
			} else if i == currentIdx-1 {
				status = "running"
			}
		}

medium

The status reporting logic should use the actual status of the current subtask and handle the idle state (after a run completes) to show the results of the last run instead of resetting everything to 'pending'.

		if isRunning {
			if i < currentIdx-1 {
				status = "completed"
			} else if i == currentIdx-1 {
				status = currentStatus
				if status == "started" {
					status = "running"
				}
			}
		} else if currentIdx > 0 {
			status = "completed"
		}

@thebtf
Owner Author

thebtf commented Apr 12, 2026

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/maintenance/service.go`:
- Around line 1080-1088: RunNow currently unlocks s.mu before marking the run as
active, allowing two concurrent callers to both see maintenanceRunning==false;
change RunNow so it takes s.mu, checks s.maintenanceRunning, and if false
immediately sets s.maintenanceRunning = true while still holding the lock, then
unlocks and starts the goroutine; keep using s.runMaintenance(ctx) but ensure
the goroutine resets s.maintenanceRunning = false under s.mu when finished. This
makes the check-and-set atomic using the existing s.mu, and references
Service.RunNow, s.mu, s.maintenanceRunning and s.runMaintenance.
- Around line 333-335: The emitProgress calls that report skipped branches are
incorrectly using status "completed"; update all s.emitProgress invocations in
this method (e.g., the call using subtasks[taskIdx-1], taskIdx, total with
message "skipped (retention disabled)" and the similar calls at the other
occurrences referenced) to pass status "skipped" instead of "completed" so
SSE/API consumers can distinguish skipped vs truly completed subtasks; search
for s.emitProgress(...) in this method and replace the status string "completed"
with "skipped" in the three spots indicated (around the shown diff and the
blocks at the other two locations).
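
The first finding above (atomic check-and-set in RunNow) can be sketched as follows. This is an illustration, not the PR's code: RunNow here takes a plain work function instead of a context, tryStart, finish and handleRun are hypothetical helpers, and the 200 response follows the sequence diagram earlier in this thread.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

type Service struct {
	mu                 sync.Mutex
	maintenanceRunning bool
}

// tryStart checks and sets the flag under one lock acquisition, so two
// concurrent callers cannot both observe maintenanceRunning == false.
func (s *Service) tryStart() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.maintenanceRunning {
		return false
	}
	s.maintenanceRunning = true
	return true
}

// finish clears the flag under the same mutex.
func (s *Service) finish() {
	s.mu.Lock()
	s.maintenanceRunning = false
	s.mu.Unlock()
}

// RunNow starts work in a goroutine unless a run is already active.
func (s *Service) RunNow(work func()) bool {
	if !s.tryStart() {
		return false
	}
	go func() {
		defer s.finish()
		work()
	}()
	return true
}

// handleRun maps a rejected RunNow to HTTP 409 Conflict.
func handleRun(s *Service, work func()) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if !s.RunNow(work) {
			http.Error(w, "maintenance already running", http.StatusConflict)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// demo issues two POSTs while the first run is still blocked.
func demo() (first, second int) {
	s := &Service{}
	release := make(chan struct{})
	h := handleRun(s, func() { <-release })
	post := func() int {
		rec := httptest.NewRecorder()
		h(rec, httptest.NewRequest(http.MethodPost, "/api/maintenance/run", nil))
		return rec.Code
	}
	first = post()
	second = post()
	close(release) // let the blocked run finish
	return first, second
}

func main() {
	first, second := demo()
	fmt.Println(first, second)
}
```

Because the handler relies on RunNow's return value rather than a separate IsRunning() check, there is no window between the check and the start in which a second request could slip through.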
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7628689a-f1d4-46ed-9cef-db5e1c9f5f2a

📥 Commits

Reviewing files that changed from the base of the PR and between 5f29183 and 888ff20.

📒 Files selected for processing (9)
  • internal/maintenance/service.go
  • internal/worker/handlers_maintenance.go
  • internal/worker/service.go
  • plugin/engram/.claude-plugin/plugin.json
  • ui/src/components/layout/AppSidebar.vue
  • ui/src/composables/useMaintenance.ts
  • ui/src/router/index.ts
  • ui/src/utils/api.ts
  • ui/src/views/MonitorView.vue

Comment thread internal/maintenance/service.go
Comment thread internal/maintenance/service.go
…acking, skipped state

- Add currentSubtaskStatus field for accurate progress reporting
- Make RunNow atomic (check-and-set under single lock) — fixes race condition
  where concurrent calls could both pass the maintenanceRunning guard
- emitProgress now updates all state fields on every call, not just on "started"
- CurrentProgress returns 3 values (index, total, status) for precise API reporting
- Preserve subtask state after run completes (don't reset in defer) so Monitor
  page shows last run results while idle
- Change 21 "skipped" task emissions from status "completed" to "skipped"
- Use s.ctx instead of context.Background() for API-triggered runs
- Status handler shows "completed" for all tasks when idle after a run
@thebtf thebtf merged commit de49eec into main Apr 12, 2026
1 of 2 checks passed
@thebtf thebtf deleted the feat/maintenance-observability branch April 12, 2026 22:12