Large notebooks no longer require retaining every note as a full Pandoc document at startup. Issue #66 reported 4,561 Markdown files, 69M on disk, roughly 1 minute of startup, and 4.7GB RSS. This branch reproduces the memory pressure, profiles the retained heap, and changes the model path so that simple Markdown notes keep only metadata, title, tags, source text, and a compact relation index. The full Pandoc parse is deferred until rendering actually needs the note body.
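A minimal sketch of that deferral, under the assumption that a note body is stored either as unparsed source or as a parsed document (the `Pandoc` stand-in and `forceBody` helper here are illustrative, not the actual PR code):

```haskell
import qualified Data.Text as T

-- Stand-in for the real Pandoc AST; illustrative only.
newtype Pandoc = Pandoc [T.Text] deriving (Show, Eq)

-- A note body is either raw Markdown source (cheap to retain at
-- startup) or a fully parsed document (produced only for rendering).
data NoteBody
  = DeferredNote T.Text
  | ParsedNote Pandoc
  deriving (Show, Eq)

-- Run the expensive parse only when the body is actually demanded.
forceBody :: (T.Text -> Pandoc) -> NoteBody -> Pandoc
forceBody parse (DeferredNote src) = parse src
forceBody _     (ParsedNote doc)   = doc
```

With this shape, startup retains one `Text` per simple note instead of a full AST, and rendering pays the parse cost lazily, per note.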
The measured fix is code-level, not an RTS flag workaround. The baseline 2,401-file reproduction reached 1,478.5 MiB RSS while still in startup and did not answer before a 300s readiness timeout. The final measured binary answered the 2,401-file generated notebook in 3.583s and peaked at 457.3 MiB after 720 rewrites; on the issue-sized 4,561-file / 82M generated notebook, it answered in 9.690s and peaked at 752.2 MiB after 1,368 rewrites.
The last major reduction came from measuring relation cardinality, not guessing. A structural deferred-note refactor without relation dedup still peaked at 1,688.7 MiB on the 4,561-file workload. Keeping one representative deferred relation per source note and unresolved target, while recovering repeated backlink contexts from source text on demand, reduced that same workload to 752.2 MiB.
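The relation dedup above can be sketched as keeping the first relation seen per (source note, unresolved target) key; the `Rel` record and field names are assumptions for illustration, not the actual model types:

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Text as T

-- Illustrative backlink relation; field names are assumptions.
data Rel = Rel
  { relSource  :: T.Text  -- note containing the link
  , relTarget  :: T.Text  -- unresolved link target
  , relContext :: T.Text  -- surrounding snippet for backlink display
  } deriving (Show, Eq)

-- Keep one representative relation per (source, target) pair.
-- Repeated backlink contexts are dropped here and recovered from the
-- source text on demand when a backlinks panel is rendered.
dedupRels :: [Rel] -> [Rel]
dedupRels =
    Map.elems
  . Map.fromListWith (\_new old -> old)  -- first occurrence wins
  . map (\r -> ((relSource r, relTarget r), r))
```

The key point is that retained memory now scales with the number of distinct (source, target) pairs rather than with the total number of link occurrences.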
The measurement report is part of the PR at measurements/issue-66-memory.md, including raw /tmp data paths, profiling cost centres, rejected experiments, the exact workload shape, and a QuickChart RSS comparison. The implementation also batches LML model updates and replaces the ad hoc lazy-note fields with explicit NoteBody, DeferredNote, and ParseContext types. Closes #66.
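The batched model updates mentioned above could look something like the following, assuming the model lives in a `TVar` (a hypothetical sketch, not the PR's actual update path): fold a whole batch of per-note update functions into one STM transaction instead of one transaction per file.

```haskell
import Control.Concurrent.STM
import Data.List (foldl')

-- Apply a batch of per-note update functions in a single STM
-- transaction, avoiding one transaction (and one model rebuild) per
-- file during bulk startup indexing.
batchUpdate :: TVar model -> [model -> model] -> IO ()
batchUpdate var updates =
  atomically $ modifyTVar' var (\m -> foldl' (\acc f -> f acc) m updates)
```

Batching matters most at startup, where thousands of files would otherwise each trigger a separate model update.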
Generated by Codex (model gpt-5).