Merged
6 changes: 6 additions & 0 deletions .pre-commit-config.yaml
@@ -25,6 +25,12 @@ repos:
language: script
types: [python]
require_serial: true
- id: pytest-staged
name: pytest (staged files)
entry: script/pre-commit-test
language: script
pass_filenames: false
require_serial: true
- repo: https://github.com/codespell-project/codespell
rev: v2.4.1
hooks:
7 changes: 6 additions & 1 deletion .specify/memory/constitution.md
@@ -105,6 +105,9 @@ merged without a corresponding test that was authored before the implementation.
mocking the database is prohibited.
- Completeness tests (e.g., `test_*_completeness.py`) MUST be maintained
alongside every model module to ensure all fields are round-trip serializable.
- **100% line coverage is required on all new and changed code before committing.**
Every branch, error path, and early return MUST have a corresponding test.
Code MUST NOT be committed until coverage is verified locally.

**Rationale**: The existing test suite (80+ test files spanning routes, services,
device protocol, MCP, security, and models) demonstrates that comprehensive tests
@@ -224,7 +227,9 @@ All standard operations MUST use the scripts in `script/` following the
development and integration testing. It starts with a clean state and a
pre-seeded debug user and MUST NOT persist data between runs.

Pull requests MUST pass all CI gates (lint, type check, tests) before merge.
Pull requests MUST pass all CI gates (lint, type check, tests, coverage) before merge.
New and changed code MUST achieve 100% line coverage; PRs that reduce patch
coverage below 100% MUST NOT be merged without explicit justification.
The `main` branch is the source of truth; GitHub Pages documentation is
auto-deployed on every push to `main` via `pdoc`.

4 changes: 3 additions & 1 deletion CLAUDE.md
@@ -7,6 +7,8 @@ Auto-generated from all feature plans. Last updated: 2026-03-17
- N/A (no Python source changes — CI/CD configuration only) + GitHub Actions (`docker/metadata-action`, `docker/build-push-action`, `docker/login-action`) (003-github-releases)
- Python 3.13+ + aiohttp (server), SQLAlchemy asyncio + aiosqlite, mashumaro, alembic; Vanilla JS (Vue 3, no build step) for frontend (004-ui-prompt-config)
- SQLite via SQLAlchemy asyncio — new `f_prompt_config` table; new `prompt_hash` column on `f_note_page_content` (004-ui-prompt-config)
- Python 3.13+ (backend), Vanilla JS / Vue 3 ESM (frontend) + aiohttp, SQLAlchemy asyncio + aiosqlite, mashumaro, alembic (005-cache-png-insights-tabs)
- SQLite (DB via SQLAlchemy), LocalBlobStorage (disk — `supernote-user-data` bucket) (005-cache-png-insights-tabs)

- Python 3.13+ + mypy (strict), SQLAlchemy asyncio, aiohttp, mashumaro, pytest + pytest-asyncio (001-constitution-alignment)

@@ -64,9 +66,9 @@ Button Tailwind classes — use verbatim:
- **Dark mode**: every interactive element must have `dark:` variants

## Recent Changes
- 005-cache-png-insights-tabs: Added Python 3.13+ (backend), Vanilla JS / Vue 3 ESM (frontend) + aiohttp, SQLAlchemy asyncio + aiosqlite, mashumaro, alembic
- 004-ui-prompt-config: Added Python 3.13+ + aiohttp (server), SQLAlchemy asyncio + aiosqlite, mashumaro, alembic; Vanilla JS (Vue 3, no build step) for frontend
- 003-github-releases: Added N/A (no Python source changes — CI/CD configuration only) + GitHub Actions (`docker/metadata-action`, `docker/build-push-action`, `docker/login-action`)
- 002-switch-dependabot: Added N/A (no Python source changes) + GitHub Dependabot (native GitHub feature, no external service)


<!-- MANUAL ADDITIONS START -->
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ requires = ["setuptools>=77.0"]

[project]
name = "supernote"
version = "1.1.0"
version = "1.2.0"
license = "Apache-2.0"
license-files = ["LICENSE"]
description = "All-in-one toolkit for Supernote devices: parse notebooks, self-host services, access services"
50 changes: 50 additions & 0 deletions script/pre-commit-test
@@ -0,0 +1,50 @@
#!/usr/bin/env bash
# script/pre-commit-test: Run tests for staged Python files before committing.
# Invoked automatically by pre-commit before each commit.

set -e
cd "$(dirname "$0")/.."

# Collect staged Python files
staged_src=$(git diff --cached --name-only --diff-filter=ACM | grep '^supernote/.*\.py$' || true)
staged_tests=$(git diff --cached --name-only --diff-filter=ACM | grep '^tests/.*\.py$' || true)

# Nothing Python-related staged — skip
if [ -z "$staged_src" ] && [ -z "$staged_tests" ]; then
exit 0
fi

# Map source files -> corresponding test files
test_files=()

for f in $staged_src; do
rel="${f#supernote/}"
dir=$(dirname "$rel")
base=$(basename "$rel" .py)
candidate="tests/${dir}/test_${base}.py"
if [ -f "$candidate" ]; then
test_files+=("$candidate")
fi
done

# Include any staged test files directly
for f in $staged_tests; do
test_files+=("$f")
done

# Deduplicate test files
if [ ${#test_files[@]} -gt 0 ]; then
    mapfile -t test_files < <(printf '%s\n' "${test_files[@]}" | sort -u)
fi

if [ ${#test_files[@]} -eq 0 ]; then
exit 0
fi

echo "==> Running pre-commit tests for staged files..."

if command -v uv >/dev/null 2>&1; then
runner="uv run pytest"
else
runner="pytest"
fi

$runner "${test_files[@]}" -q
34 changes: 34 additions & 0 deletions specs/005-cache-png-insights-tabs/checklists/requirements.md
@@ -0,0 +1,34 @@
# Specification Quality Checklist: Note Page PNG Caching & Insights Panel Tabs

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-17
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
60 changes: 60 additions & 0 deletions specs/005-cache-png-insights-tabs/contracts/ocr-list-endpoint.md
@@ -0,0 +1,60 @@
# Contract: OCR Page List Endpoint

**Endpoint**: `POST /api/extended/file/ocr/list`
**Auth**: Required — `x-access-token: <JWT>` header
**Ownership**: Users may only retrieve OCR for files they own; cross-user access returns 403.

---

## Request

**Content-Type**: `application/json`

```json
{
"fileId": 12345
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `fileId` | integer | Yes | ID of the note file |

---

## Response — 200 OK

```json
{
"pages": [
{ "pageIndex": 0, "textContent": "Handwriting extracted from page 1..." },
{ "pageIndex": 1, "textContent": "Handwriting extracted from page 2..." }
]
}
```

| Field | Type | Description |
|-------|------|-------------|
| `pages` | array | Ordered by `pageIndex` ascending. Empty array if no OCR available. Only pages with non-null text are included. |
| `pages[].pageIndex` | integer | 0-based page position in the note |
| `pages[].textContent` | string | Raw OCR text from that page |

---

## Error Responses

| Status | Condition |
|--------|-----------|
| 400 | Malformed JSON body or missing `fileId` |
| 401 | Missing or invalid JWT |
| 403 | File belongs to a different user |
| 404 | File not found |
| 500 | Unexpected server error |

---

## Notes

- Returns an empty `pages` array (not a 404) when the file exists but has no OCR data yet (processing pending).
- Does not paginate — all available pages returned in a single response.
- Follows the same convention as `POST /api/extended/file/summary/list`.
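
A consumer of this contract can re-sort defensively even though ascending order is guaranteed. A minimal stdlib-only sketch; the helper name `parse_ocr_list_response` is illustrative, not from the codebase:

```python
# Illustrative consumer of the OCR list contract above; the function name
# and variable names are assumptions, not part of the codebase.

def parse_ocr_list_response(body: dict) -> list[tuple[int, str]]:
    """Return (pageIndex, textContent) pairs sorted ascending.

    The contract already guarantees ascending order; sorting defensively
    costs little and keeps the consumer robust to future changes.
    """
    return sorted(
        ((p["pageIndex"], p["textContent"]) for p in body.get("pages", [])),
        key=lambda pair: pair[0],
    )

response = {
    "pages": [
        {"pageIndex": 0, "textContent": "Handwriting extracted from page 1..."},
        {"pageIndex": 1, "textContent": "Handwriting extracted from page 2..."},
    ]
}
pairs = parse_ocr_list_response(response)
print(pairs[0][0])  # → 0
```

Note that `body.get("pages", [])` also covers the empty-array case for files with no OCR data yet.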
65 changes: 65 additions & 0 deletions specs/005-cache-png-insights-tabs/data-model.md
@@ -0,0 +1,65 @@
# Data Model: Note Page PNG Caching & Insights Panel Tabs

**Feature**: 005-cache-png-insights-tabs
**Date**: 2026-03-17

---

## Database Changes

### `f_user_file` table — new column

| Column | Type | Nullable | Default | Purpose |
|--------|------|----------|---------|---------|
| `last_conversion_md5` | `String` | Yes | `NULL` | Stores the file MD5 used during the most recent `convert_note_to_png` call. Used to detect content changes and reconstruct old storage keys for cleanup. |

**Migration**: New Alembic revision under `supernote/alembic/versions/`. Follows existing pattern (`add_column` with `nullable=True`, no `server_default` needed — rows for files that have never been converted will have `NULL`, which is treated as "no previous conversion").

**SQLAlchemy model addition** (`supernote/server/db/models/file.py` — `UserFileDO`):
```python
last_conversion_md5: Mapped[str | None] = mapped_column(String, nullable=True)
"""MD5 of the file at last PNG conversion; used for stale image cleanup."""
```

---

## New DTOs / VOs (`supernote/models/extended.py`)

### `OcrPageVO`

| Field | Python type | JSON alias | Notes |
|-------|-------------|------------|-------|
| `page_index` | `int` | `pageIndex` | 0-based page number, ordered ascending |
| `text_content` | `str` | `textContent` | Raw OCR text extracted from the page |

### `WebOcrListRequestDTO`

| Field | Python type | JSON alias | Notes |
|-------|-------------|------------|-------|
| `file_id` | `int` | `fileId` | ID of the note file to retrieve OCR for |

### `WebOcrListVO`

| Field | Python type | JSON alias | Notes |
|-------|-------------|------------|-------|
| `pages` | `list[OcrPageVO]` | `pages` | All pages with OCR text, ordered by `page_index`. Empty list if none available. |

All three use `@dataclass` + `DataClassJSONMixin` with `omit_none=True` and `BaseConfig(serialize_by_alias=True)` consistent with existing models in the file.
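
The aliased JSON shape these models produce can be sketched with the stdlib alone. The real models rely on mashumaro's `DataClassJSONMixin` with `serialize_by_alias=True`; the hand-written `to_dict` methods below only illustrate the expected output and are not the actual implementation:

```python
# Stdlib-only sketch of the aliased serialization; in the real code mashumaro
# generates this from field aliases, so these to_dict methods are illustrative.
from dataclasses import dataclass, field

@dataclass
class OcrPageVO:
    page_index: int
    text_content: str

    def to_dict(self) -> dict:
        # snake_case fields serialize under their camelCase JSON aliases
        return {"pageIndex": self.page_index, "textContent": self.text_content}

@dataclass
class WebOcrListVO:
    pages: list[OcrPageVO] = field(default_factory=list)

    def to_dict(self) -> dict:
        return {"pages": [p.to_dict() for p in self.pages]}

vo = WebOcrListVO(pages=[OcrPageVO(page_index=0, text_content="hello")])
print(vo.to_dict())  # → {'pages': [{'pageIndex': 0, 'textContent': 'hello'}]}
```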

---

## Storage Key Patterns

No new storage key formats. Existing patterns used:

| Pattern | Bucket | Usage |
|---------|--------|-------|
| `conversions/{user_id}/{file_id}/page_{page_index}_{md5}.png` | `supernote-user-data` | Cached page image. Presence = content unchanged. |

**Cleanup logic**: When `last_conversion_md5 != node.md5` (content changed), old images are deleted by reconstructing: `get_conversion_png_path(user_id, file_id, i, last_conversion_md5)` for `i` in `0..note.get_total_pages()-1`.

---

## No New Tables

No new DB tables are required. OCR text is read from the existing `f_note_page_content.text_content` column (populated by the existing OCR processing pipeline).