Merged
6 changes: 6 additions & 0 deletions .pre-commit-config.yaml
@@ -25,6 +25,12 @@ repos:
language: script
types: [python]
require_serial: true
- id: pytest-staged
name: pytest (staged files)
entry: script/pre-commit-test
language: script
pass_filenames: false
require_serial: true
- repo: https://github.com/codespell-project/codespell
rev: v2.4.1
hooks:
7 changes: 6 additions & 1 deletion .specify/memory/constitution.md
@@ -105,6 +105,9 @@ merged without a corresponding test that was authored before the implementation.
mocking the database is prohibited.
- Completeness tests (e.g., `test_*_completeness.py`) MUST be maintained
alongside every model module to ensure all fields are round-trip serializable.
- **100% line coverage is required on all new and changed code before committing.**
Every branch, error path, and early return MUST have a corresponding test.
Code MUST NOT be committed until coverage is verified locally.

**Rationale**: The existing test suite (80+ test files spanning routes, services,
device protocol, MCP, security, and models) demonstrates that comprehensive tests
@@ -224,7 +227,9 @@ All standard operations MUST use the scripts in `script/` following the
development and integration testing. It starts with a clean state and a
pre-seeded debug user and MUST NOT persist data between runs.

Pull requests MUST pass all CI gates (lint, type check, tests) before merge.
Pull requests MUST pass all CI gates (lint, type check, tests, coverage) before merge.
New and changed code MUST achieve 100% line coverage; PRs that reduce patch
coverage below 100% MUST NOT be merged without explicit justification.
The `main` branch is the source of truth; GitHub Pages documentation is
auto-deployed on every push to `main` via `pdoc`.

4 changes: 3 additions & 1 deletion CLAUDE.md
@@ -7,6 +7,8 @@ Auto-generated from all feature plans. Last updated: 2026-03-17
- N/A (no Python source changes — CI/CD configuration only) + GitHub Actions (`docker/metadata-action`, `docker/build-push-action`, `docker/login-action`) (003-github-releases)
- Python 3.13+ + aiohttp (server), SQLAlchemy asyncio + aiosqlite, mashumaro, alembic; Vanilla JS (Vue 3, no build step) for frontend (004-ui-prompt-config)
- SQLite via SQLAlchemy asyncio — new `f_prompt_config` table; new `prompt_hash` column on `f_note_page_content` (004-ui-prompt-config)
- Python 3.13+ (backend), Vanilla JS / Vue 3 ESM (frontend) + aiohttp, SQLAlchemy asyncio + aiosqlite, mashumaro, alembic (005-cache-png-insights-tabs)
- SQLite (DB via SQLAlchemy), LocalBlobStorage (disk — `supernote-user-data` bucket) (005-cache-png-insights-tabs)

- Python 3.13+ + mypy (strict), SQLAlchemy asyncio, aiohttp, mashumaro, pytest + pytest-asyncio (001-constitution-alignment)

@@ -64,9 +66,9 @@ Button Tailwind classes — use verbatim:
- **Dark mode**: every interactive element must have `dark:` variants

## Recent Changes
- 005-cache-png-insights-tabs: Added Python 3.13+ (backend), Vanilla JS / Vue 3 ESM (frontend) + aiohttp, SQLAlchemy asyncio + aiosqlite, mashumaro, alembic
- 004-ui-prompt-config: Added Python 3.13+ + aiohttp (server), SQLAlchemy asyncio + aiosqlite, mashumaro, alembic; Vanilla JS (Vue 3, no build step) for frontend
- 003-github-releases: Added N/A (no Python source changes — CI/CD configuration only) + GitHub Actions (`docker/metadata-action`, `docker/build-push-action`, `docker/login-action`)
- 002-switch-dependabot: Added N/A (no Python source changes) + GitHub Dependabot (native GitHub feature, no external service)


<!-- MANUAL ADDITIONS START -->
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ requires = ["setuptools>=77.0"]

[project]
name = "supernote"
version = "1.1.0"
version = "1.2.0"
license = "Apache-2.0"
license-files = ["LICENSE"]
description = "All-in-one toolkit for Supernote devices: parse notebooks, self-host services, access services"
50 changes: 50 additions & 0 deletions script/pre-commit-test
@@ -0,0 +1,50 @@
#!/usr/bin/env bash
# script/pre-commit-test: Run tests for staged Python files before committing.
# Invoked automatically by pre-commit before each commit.

set -e
cd "$(dirname "$0")/.."

# Collect staged Python files
staged_src=$(git diff --cached --name-only --diff-filter=ACM | grep '^supernote/.*\.py$' || true)
staged_tests=$(git diff --cached --name-only --diff-filter=ACM | grep '^tests/.*\.py$' || true)

# Nothing Python-related staged — skip
if [ -z "$staged_src" ] && [ -z "$staged_tests" ]; then
exit 0
fi

# Map source files -> corresponding test files
test_files=()

for f in $staged_src; do
rel="${f#supernote/}"
dir=$(dirname "$rel")
base=$(basename "$rel" .py)
candidate="tests/${dir}/test_${base}.py"
if [ -f "$candidate" ]; then
test_files+=("$candidate")
fi
done

# Include any staged test files directly
for f in $staged_tests; do
test_files+=("$f")
done

# Deduplicate test files
if [ ${#test_files[@]} -gt 0 ]; then
    mapfile -t test_files < <(printf '%s\n' "${test_files[@]}" | sort -u)
fi

if [ ${#test_files[@]} -eq 0 ]; then
exit 0
fi

echo "==> Running pre-commit tests for staged files..."

if command -v uv >/dev/null 2>&1; then
runner="uv run pytest"
else
runner="pytest"
fi

$runner "${test_files[@]}" -q
34 changes: 34 additions & 0 deletions specs/005-cache-png-insights-tabs/checklists/requirements.md
@@ -0,0 +1,34 @@
# Specification Quality Checklist: Note Page PNG Caching & Insights Panel Tabs

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-17
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
60 changes: 60 additions & 0 deletions specs/005-cache-png-insights-tabs/contracts/ocr-list-endpoint.md
@@ -0,0 +1,60 @@
# Contract: OCR Page List Endpoint

**Endpoint**: `POST /api/extended/file/ocr/list`
**Auth**: Required — `x-access-token: <JWT>` header
**Ownership**: Users may only retrieve OCR for files they own; cross-user access returns 403.

---

## Request

**Content-Type**: `application/json`

```json
{
"fileId": 12345
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `fileId` | integer | Yes | ID of the note file |

---

## Response — 200 OK

```json
{
"pages": [
{ "pageIndex": 0, "textContent": "Handwriting extracted from page 1..." },
{ "pageIndex": 1, "textContent": "Handwriting extracted from page 2..." }
]
}
```

| Field | Type | Description |
|-------|------|-------------|
| `pages` | array | Ordered by `pageIndex` ascending. Empty array if no OCR available. Only pages with non-null text are included. |
| `pages[].pageIndex` | integer | 0-based page position in the note |
| `pages[].textContent` | string | Raw OCR text from that page |

---

## Error Responses

| Status | Condition |
|--------|-----------|
| 400 | Malformed JSON body or missing `fileId` |
| 401 | Missing or invalid JWT |
| 403 | File belongs to a different user |
| 404 | File not found |
| 500 | Unexpected server error |

---

## Notes

- Returns an empty `pages` array (not a 404) when the file exists but has no OCR data yet (processing pending).
- Does not paginate — all available pages returned in a single response.
- Follows the same convention as `POST /api/extended/file/summary/list`.
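
A consumer of this contract can re-sort defensively even though ascending order is guaranteed. A minimal stdlib-only sketch; the helper name `parse_ocr_list_response` is illustrative, not from the codebase:

```python
# Illustrative consumer of the OCR list contract above; the function name
# and variable names are assumptions, not part of the codebase.

def parse_ocr_list_response(body: dict) -> list[tuple[int, str]]:
    """Return (pageIndex, textContent) pairs sorted ascending.

    The contract already guarantees ascending order; sorting defensively
    costs little and keeps the consumer robust to future changes.
    """
    return sorted(
        ((p["pageIndex"], p["textContent"]) for p in body.get("pages", [])),
        key=lambda pair: pair[0],
    )

response = {
    "pages": [
        {"pageIndex": 0, "textContent": "Handwriting extracted from page 1..."},
        {"pageIndex": 1, "textContent": "Handwriting extracted from page 2..."},
    ]
}
pairs = parse_ocr_list_response(response)
print(pairs[0][0])  # → 0
```

Note that `body.get("pages", [])` also covers the empty-array case for files with no OCR data yet.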
65 changes: 65 additions & 0 deletions specs/005-cache-png-insights-tabs/data-model.md
@@ -0,0 +1,65 @@
# Data Model: Note Page PNG Caching & Insights Panel Tabs

**Feature**: 005-cache-png-insights-tabs
**Date**: 2026-03-17

---

## Database Changes

### `f_user_file` table — new column

| Column | Type | Nullable | Default | Purpose |
|--------|------|----------|---------|---------|
| `last_conversion_md5` | `String` | Yes | `NULL` | Stores the file MD5 used during the most recent `convert_note_to_png` call. Used to detect content changes and reconstruct old storage keys for cleanup. |

**Migration**: New Alembic revision under `supernote/alembic/versions/`. Follows existing pattern (`add_column` with `nullable=True`, no `server_default` needed — rows for files that have never been converted will have `NULL`, which is treated as "no previous conversion").

**SQLAlchemy model addition** (`supernote/server/db/models/file.py` — `UserFileDO`):
```python
last_conversion_md5: Mapped[str | None] = mapped_column(String, nullable=True)
"""MD5 of the file at last PNG conversion; used for stale image cleanup."""
```

---

## New DTOs / VOs (`supernote/models/extended.py`)

### `OcrPageVO`

| Field | Python type | JSON alias | Notes |
|-------|-------------|------------|-------|
| `page_index` | `int` | `pageIndex` | 0-based page number, ordered ascending |
| `text_content` | `str` | `textContent` | Raw OCR text extracted from the page |

### `WebOcrListRequestDTO`

| Field | Python type | JSON alias | Notes |
|-------|-------------|------------|-------|
| `file_id` | `int` | `fileId` | ID of the note file to retrieve OCR for |

### `WebOcrListVO`

| Field | Python type | JSON alias | Notes |
|-------|-------------|------------|-------|
| `pages` | `list[OcrPageVO]` | `pages` | All pages with OCR text, ordered by `page_index`. Empty list if none available. |

All three use `@dataclass` + `DataClassJSONMixin` with `omit_none=True` and `BaseConfig(serialize_by_alias=True)` consistent with existing models in the file.
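
The aliased JSON shape these models produce can be sketched with the stdlib alone. The real models rely on mashumaro's `DataClassJSONMixin` with `serialize_by_alias=True`; the hand-written `to_dict` methods below only illustrate the expected output and are not the actual implementation:

```python
# Stdlib-only sketch of the aliased serialization; in the real code mashumaro
# generates this from field aliases, so these to_dict methods are illustrative.
from dataclasses import dataclass, field

@dataclass
class OcrPageVO:
    page_index: int
    text_content: str

    def to_dict(self) -> dict:
        # snake_case fields serialize under their camelCase JSON aliases
        return {"pageIndex": self.page_index, "textContent": self.text_content}

@dataclass
class WebOcrListVO:
    pages: list[OcrPageVO] = field(default_factory=list)

    def to_dict(self) -> dict:
        return {"pages": [p.to_dict() for p in self.pages]}

vo = WebOcrListVO(pages=[OcrPageVO(page_index=0, text_content="hello")])
print(vo.to_dict())  # → {'pages': [{'pageIndex': 0, 'textContent': 'hello'}]}
```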

---

## Storage Key Patterns

No new storage key formats. Existing patterns used:

| Pattern | Bucket | Usage |
|---------|--------|-------|
| `conversions/{user_id}/{file_id}/page_{page_index}_{md5}.png` | `supernote-user-data` | Cached page image. Presence = content unchanged. |

**Cleanup logic**: When `last_conversion_md5 != node.md5` (content changed), old images are deleted by reconstructing: `get_conversion_png_path(user_id, file_id, i, last_conversion_md5)` for `i` in `0..note.get_total_pages()-1`.

---

## No New Tables

No new DB tables are required. OCR text is read from the existing `f_note_page_content.text_content` column (populated by the existing OCR processing pipeline).