Investigate #66 (high memory on large notebooks) #683
Draft
Begin tracking memory usage on issue #66 by checking in:

- `scripts/gen-fixture.py`: synthetic 4500-note generator that mimics the issue's notebook shape (frontmatter, wikilinks, tags, body).
- `docs/issue-66-memory.md`: running log with the reproducer, the measurement protocol (`/usr/bin/time -v` plus `+RTS -s -hT`), pre-profile hypotheses drawn from a code read of master, and the per-fix delta table that will be filled in as fixes land.

The doc is intentionally checked in early so the PR review sees the same evidence the author had at each step. Baseline numbers and profile output land in follow-up commits.
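The note shape described above (frontmatter, wikilinks, tags, body) can be illustrated with a minimal sketch. This is not the checked-in `scripts/gen-fixture.py`; the function names, tag vocabulary, file naming scheme, and body size are all assumptions chosen for illustration.

```python
import random
from pathlib import Path

# Hypothetical vocabularies; the real generator may use different ones.
TAGS = ["project", "idea", "log", "ref", "todo"]
WORDS = ["alpha", "beta", "gamma", "delta", "epsilon"]


def render_note(index: int, total: int, rng: random.Random, body_words: int = 200) -> str:
    """Return one synthetic Markdown note: frontmatter, heading, wikilinks, body."""
    tags = rng.sample(TAGS, k=2)
    # Wikilinks point at other notes in the same fixture (assumed naming scheme).
    links = [f"[[note-{rng.randrange(total):04d}]]" for _ in range(3)]
    body = " ".join(rng.choice(WORDS) for _ in range(body_words))
    frontmatter = "\n".join(
        ["---", f"title: Note {index}", f"tags: [{', '.join(tags)}]", "---"]
    )
    return f"{frontmatter}\n\n# Note {index}\n\nSee {' '.join(links)}.\n\n{body}\n"


def write_fixture(out_dir: Path, total: int, seed: int = 66) -> None:
    """Write `total` notes into `out_dir`; a fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    out_dir.mkdir(parents=True, exist_ok=True)
    for i in range(total):
        path = out_dir / f"note-{i:04d}.md"
        path.write_text(render_note(i, total, rng), encoding="utf-8")
```

Calling `write_fixture(Path("fixture"), 4500)` would produce a notebook of the same note count as the one described here; a fixed seed matters because memory measurements are only comparable across runs if the input is identical.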
| Step | Status | Duration | Verification |
|---|---|---|---|
| sync | ✓ | 1s | forge=github |
| research | ✓ | 4m 26s | Mapped Note.hs/Patch.hs/Rel.hs; lua-vr's evaluate . force pattern. |
| branch | ✓ | 5s | Worktree-provisioned pallid-shot. |
| implement | ✓ | 420m 36s | Tried `NoteContent` ADT, plain-text `_relCtx`, pre-extracted indices, `unionMountStreaming` upstream. The architectural changes were structurally correct, but RSS stayed at parity with master at issue scale; reverted in favour of an investigation-only ship. |
| check | ✓ | 30s | cabal build all |
| docs | ✓ | 36s | CHANGELOG entry (later reverted with the perf changes) |
| fmt | ✓ | 0s | just fmt clean |
| commit | ✓ | 0s | Primary commit + force-push reset |
| hickey+lowy | ✓ | 38m 39s | Both ran as parallel sub-agents. Findings recorded for the next PR that re-attempts the architectural change. |
| police | — | skip | docs/scripts-only diff |
| test | — | skip | docs/scripts-only diff |
| create-pr | ✓ | 0s | #683 (title + body updated to "Investigate #66") |
| ci | ✓ | 6s | local cabal green; vira ci not gated by these files |
| evidence | — | skip | no UI-visible changes |
| Total | | 466m 28s | |
Slowest step: implement (420m 36s) — most of that was the
architectural attempts and the realization that the closure profile
showed the lever moved while the mass didn't.
Optimization suggestions
- Pre-attach a `performMajorGC` + post-load heap-profile dump. The biggest time sink was looping on `+RTS -hT` profiles whose last sample landed at ~47s, right at the edge of the load phase, where in-flight per-file Pandoc trees may not yet have been drained by GC. A diagnostic that triggers a profile capture once the model status flips to `Status_Ready` would have isolated the actual retainer in one round-trip instead of five.
- Re-run `--from implement` on a fixture small enough to load in ≤10s. The 4500-note fixture's loading phase is longer than the default heap-profile sample window. A 500-note fixture with the same per-note density would expose the retention pattern in seconds and the iteration loop would be 50× faster.
- Hickey/lowy on a smaller diff first. The two sub-agents read the entire structural change at once; at 4 commits the review was necessarily high-level. If the architectural attempts had been staged in their own PRs (one per commit), each review could have been more targeted and the negative measurement would have surfaced sooner.
Workflow completed at 2026-04-27.
Investigation infrastructure plus an honest negative result on the
architectural attempt. The repro and the harness land here so the
next pass starts from the same baseline this one did. The
architectural fix itself didn't land — every variant either broke
even or regressed RSS at the issue's scale, and the closure-type
profile shows the actual retainer survived all four cleanups. The
full log lives in `docs/issue-66-memory.md`.

Baselines (3-run median post-load `VmHWM`)

The issue's reported 4.7 GiB on 4561 × 15 KiB lines up with these
numbers once the per-note count and richer markdown of a real
notebook are added.
What the heap profile says
`+RTS -hT` on the issue-scale fixture shows ~830 MiB of live heap
in the parsed `Pandoc` tree: `Sequence.Internal.Node*` finger-tree
spine + `Pandoc.Definition.Str` cells + `Text` headers +
`ARR_WORDS`, all per-note in `_modelNotes`. `THUNK` / `THUNK_*` adds
~48 MiB on top; the `evaluate . force`-after-parse fix from the issue
thread would reclaim most of that, ~5%. Real but not enough.
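Per-closure totals like the ones quoted above can be pulled out of a heap profile with a short script. The sketch below assumes the `.hp` file written by `+RTS -hT` uses the usual plain-text layout: `BEGIN_SAMPLE <time>` and `END_SAMPLE <time>` markers around tab-separated `name<TAB>bytes` lines. The function names are hypothetical and not part of this PR.

```python
def last_sample_totals(hp_text: str) -> dict[str, int]:
    """Return bytes per closure type for the last complete sample in a .hp file."""
    current = None  # dict for the sample being read, or None when outside a sample
    last: dict[str, int] = {}
    for line in hp_text.splitlines():
        if line.startswith("BEGIN_SAMPLE"):
            current = {}
        elif line.startswith("END_SAMPLE"):
            if current is not None:
                last = current  # only complete samples are kept
            current = None
        elif current is not None and "\t" in line:
            name, _, value = line.rpartition("\t")
            current[name] = current.get(name, 0) + int(value.strip())
    return last


def top_closures(hp_text: str, n: int = 5) -> list[tuple[str, int]]:
    """Return the n largest closure types of the last complete sample."""
    totals = last_sample_totals(hp_text)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Reading only the last complete sample matters here because a truncated final sample, or one taken while the load phase is still running, would overstate transient allocations.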
What I tried
- `evaluate . force` post-parse
- `NoteContent` ADT (re-parse on render)
- `_relCtx :: !Text` (was `[B.Block]`)
- `IxRel`/`IxTask`, `!relTo`
- `unionMountStreaming` + `T.copy` URLs

Each step is structurally correct on its own: the closure-type
profile collapses `Sequence.Internal.Node*` from ~470 MB to single
kilobytes after the `NoteContent` change. But ~140 MiB of
`Pandoc.Str` and ~285 MiB of `Text` headers persist in both master
and the post-fix builds, just under different parents. The lever
moves; the mass doesn't.
The PR therefore reverts the architectural changes and lands only the
infrastructure. The fix needs to come after isolating the actual
retainer, for which the doc lists explicit candidates.
What's in this PR
- `scripts/gen-fixture.py`: synthetic 4500-note generator matching the issue's notebook density.
- `scripts/measure-load-rss.sh`: `emanote run` + settle-detect on `VmRSS` + read `VmHWM` from `/proc`. Three-run median is the headline number.
- `docs/issue-66-memory.md`: methodology, master baseline, full closure-type table, every architectural attempt with its measurement, and the explicit list of "leak candidates" the next pass should attack first (heap profile sample window, Heist template state, Aeson meta NFData depth, `_noteTitle` H1 inlines, commonmark parser internals).
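The measurement protocol in the list above (settle-detect on `VmRSS`, read `VmHWM` from `/proc`, take a three-run median) can be sketched as follows. This is a Python illustration, not the contents of `scripts/measure-load-rss.sh`; the helper names, the settle window, and the 1% tolerance are assumptions.

```python
import re
import statistics


def read_status_kib(status_text: str, field: str) -> int:
    """Extract a kB-valued field (e.g. VmRSS, VmHWM) from /proc/<pid>/status text."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB$"
    match = re.search(pattern, status_text, re.MULTILINE)
    if match is None:
        raise KeyError(field)
    return int(match.group(1))


def has_settled(rss_samples_kib: list[int], window: int = 5, tolerance: float = 0.01) -> bool:
    """True once the last `window` VmRSS samples vary by at most `tolerance` of their peak."""
    if len(rss_samples_kib) < window:
        return False
    recent = rss_samples_kib[-window:]
    return (max(recent) - min(recent)) <= tolerance * max(recent)


def headline_kib(hwm_per_run: list[int]) -> float:
    """Median of the per-run VmHWM readings; with three runs this is the middle value."""
    return statistics.median(hwm_per_run)
```

`VmHWM` is the process's peak resident set size, so reading it after `VmRSS` has settled captures the high-water mark of the load phase; the median of three runs damps a single noisy run without hiding a consistent regression.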
The companion `unionMountStreaming` upstream branch
(`srid/unionmount#issue-66-streaming-mount`) is preserved upstream as
useful API surface even though it's unused here.
Test plan
- `cabal build all`: green (only docs / scripts touched)
- `nix flake check` (vira ci): green
- `scripts/gen-fixture.py` runs and produces a valid notebook