
Slim large notebooks with deferred notes #678

Draft
srid wants to merge 4 commits into master from fix-issue-66-memory


srid commented Apr 27, 2026

Large notebooks no longer require retaining every note as a full Pandoc document at startup. Issue #66 reported 4,561 Markdown files, 69M on disk, roughly 1 minute of startup, and 4.7GB RSS; this branch reproduces the memory pressure, profiles the retained heap, and changes the model path so simple Markdown notes keep metadata, title, tags, source text, and a compact relation index. The full Pandoc parse is deferred until rendering needs the note body.

The measured fix is code-level, not an RTS flag workaround. The baseline 2,401-file reproduction reached 1,478.5 MiB RSS while still in startup and did not answer before a 300s readiness timeout. The final measured binary answered the 2,401-file generated notebook in 3.583s and peaked at 457.3 MiB after 720 rewrites; on the issue-sized 4,561-file / 82M generated notebook, it answered in 9.690s and peaked at 752.2 MiB after 1,368 rewrites.

The last major reduction came from measuring relation cardinality, not guessing. A structural deferred-note refactor without relation dedup still peaked at 1,688.7 MiB on the 4,561-file workload. Keeping one representative deferred relation per source note and unresolved target, while recovering repeated backlink contexts from source text on demand, reduced that same workload to 752.2 MiB.
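The dedup rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Rel` record, `dedupRels`, and `contextsFor` are hypothetical names, and the `[[target]]` wiki-link matching is a simplifying assumption about how a backlink context would be found in the source text.

```haskell
import Data.List (isInfixOf)
import qualified Data.Map.Strict as Map

-- Hypothetical relation record: a link from a source note to an
-- unresolved target, with the surrounding context text.
data Rel = Rel
  { relSource :: FilePath
  , relTarget :: String
  , relContext :: String
  }
  deriving (Eq, Show)

-- Keep one representative relation per (source note, unresolved target).
-- 'Map.fromListWith' calls the combining function as (new, old), so
-- returning the old value keeps the first relation seen for each key.
dedupRels :: [Rel] -> [Rel]
dedupRels rels =
  Map.elems (Map.fromListWith keepFirst keyed)
  where
    keyed = [((relSource r, relTarget r), r) | r <- rels]
    keepFirst _new old = old

-- Recover repeated backlink contexts on demand: every line of the source
-- text that mentions the target as a wiki-link (an assumed link syntax).
contextsFor :: String -> String -> [String]
contextsFor target src =
  filter (("[[" ++ target ++ "]]") `isInfixOf`) (lines src)
```

The trade-off in this sketch is that the retained index grows with the number of distinct (source, target) pairs rather than the number of link occurrences, while the repeated contexts are recomputed from source text only when a backlink view asks for them.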

The measurement report is part of the PR at measurements/issue-66-memory.md, including raw /tmp data paths, profiling cost centres, rejected experiments, exact workload shape, and a QuickChart RSS comparison. The implementation also batches LML model updates and replaces the ad hoc lazy-note fields with explicit NoteBody, DeferredNote, and ParseContext types.
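For readers unfamiliar with the deferred-note idea, one possible shape for the three types is sketched below. Only the type names `NoteBody`, `DeferredNote`, and `ParseContext` come from this PR; every field, constructor, and function here is a hypothetical stand-in, and `Doc` is a toy placeholder for a real Pandoc document.

```haskell
-- Toy stand-in for a parsed Pandoc document (assumption: the real code
-- would use the Pandoc type from pandoc-types).
newtype Doc = Doc [String]
  deriving (Eq, Show)

-- Hypothetical parse settings; the PR's actual ParseContext fields are
-- not shown in the description.
newtype ParseContext = ParseContext
  { stripBlankLines :: Bool
  }

-- The cheap data retained for a note whose full parse is deferred.
data DeferredNote = DeferredNote
  { dnTitle :: String
  , dnTags :: [String]
  , dnSource :: String
  }
  deriving (Eq, Show)

-- A note body is either already parsed or deferred until rendering.
data NoteBody
  = ParsedBody Doc
  | DeferredBody ParseContext DeferredNote

-- Toy parser: splits the source into lines, optionally dropping blanks.
parseDoc :: ParseContext -> String -> Doc
parseDoc ctx src = Doc (filter keep (lines src))
  where
    keep l = not (stripBlankLines ctx && null l)

-- Run the full parse only when rendering needs the body.
renderBody :: NoteBody -> Doc
renderBody (ParsedBody doc) = doc
renderBody (DeferredBody ctx note) = parseDoc ctx (dnSource note)
```

Making the deferred case an explicit constructor, rather than an ad hoc lazy field, means the retained data is visible in the type and the parse cost is paid at a single, named call site.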

The PR intentionally keeps the dead-end measurements in the report: strictness-only, dropping retained fields after parsing, per-note compact regions, relation-context sharing without dedup, and RTS controls were measured and rejected as the fix.

Closes #66.

Try it locally

nix build github:srid/emanote/fix-issue-66-memory

Generated by /do on Codex (model gpt-5).

srid changed the title from "Profile large-notebook memory pressure" to "Slim large notebooks with lazy note parsing" Apr 27, 2026
srid changed the title from "Slim large notebooks with lazy note parsing" to "Slim large notebooks with deferred notes" Apr 27, 2026

Development

Successfully merging this pull request may close these issues.

High memory usage on large notebooks
