
Investigate #66 (high memory on large notebooks) #683

Draft
srid wants to merge 1 commit into master from pallid-shot

Conversation

Owner

@srid srid commented Apr 27, 2026

Investigation infrastructure plus an honest negative result on the
architectural attempt.
The repro and the harness land here so the
next pass starts from the same baseline this one did. The
architectural fix itself didn't land — every variant either broke
even or regressed RSS at the issue's scale, and the closure-type
profile shows the actual retainer survived all four cleanups. The
full log lives in docs/issue-66-memory.md.

Baselines (3-run median post-load VmHWM)

| Fixture                  | On disk | Master peak RSS |
|--------------------------|---------|-----------------|
| 4500 notes × 5 KiB body  | 24 MiB  | 1268 MiB        |
| 4500 notes × 15 KiB body | 55 MiB  | 2420 MiB        |

The issue's reported 4.7 GiB on 4561 notes × 15 KiB lines up with these
numbers once the heavier per-note content and richer markdown of a real
notebook are factored in.

What the heap profile says

+RTS -hT on the issue-scale fixture shows ~830 MiB of live heap in the
parsed Pandoc trees: Sequence.Internal.Node* finger-tree spine,
Pandoc.Definition.Str cells, Text headers, and ARR_WORDS, all held
per-note in _modelNotes. THUNK / THUNK_* adds ~48 MiB on top; the
evaluate . force-after-parse fix from the issue thread would reclaim
most of that, roughly 5% of the total. Real, but not enough.
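
For reference, a minimal sketch of that force-after-parse idea, written with
the parser passed in as a parameter since emanote's actual parsing entry
point isn't shown here:

```haskell
-- Minimal sketch of the evaluate . force-after-parse fix from the issue
-- thread. The parser is a parameter; pandoc-types supplies the NFData
-- instance that `force` relies on.
import Control.DeepSeq (force)
import Control.Exception (evaluate)
import Data.Text (Text)
import Text.Pandoc.Definition (Pandoc)

forceAfterParse :: (Text -> Either Text Pandoc) -> Text -> IO (Either Text Pandoc)
forceAfterParse parse src =
  case parse src of
    Left err -> pure (Left err)
    Right doc ->
      -- Deep-evaluate the parsed tree immediately so THUNK / THUNK_* closures
      -- never accumulate behind the note stored in _modelNotes.
      Right <$> evaluate (force doc)
```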

What I tried

| Stack                                  | Issue-scale (15 KiB/note) |
|----------------------------------------|---------------------------|
| master @ 5518fb2                       | 2420 MiB                  |
| + evaluate . force post-parse          | no measurement            |
| + NoteContent ADT (re-parse on render) | 2563 MiB                  |
| + _relCtx :: !Text (was [B.Block])     | 2592 MiB                  |
| + pre-extract IxRel/IxTask, !relTo     | 2563 MiB                  |
| + unionMountStreaming + T.copy URLs    | 2473 MiB (parity)         |
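
For context, the NoteContent attempt had roughly this shape: keep only the
source text in the model and re-parse to Pandoc at render time. The names
below are illustrative stand-ins, not emanote's actual definitions.

```haskell
import Data.Text (Text)
import Text.Pandoc.Definition (Pandoc)

-- Hypothetical shape of the NoteContent ADT: either a cached parse (what
-- master keeps today) or the raw markdown, re-parsed on demand at render time.
data NoteContent
  = NoteParsed !Pandoc
  | NoteSource !Text

renderNoteContent :: (Text -> Pandoc) -> NoteContent -> Pandoc
renderNoteContent parse nc = case nc of
  NoteParsed doc -> doc
  NoteSource src -> parse src
```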

Each step is structurally correct on its own — the closure-type
profile collapses Sequence.Internal.Node* from ~470 MB to single
kilobytes after the NoteContent change — but ~140 MiB of
Pandoc.Str and ~285 MiB of Text headers persist in both master
and the post-fix builds, just under different parents. The lever
moves; the mass doesn't.

The PR therefore reverts the architectural changes and lands only the
infrastructure. The fix has to wait until the actual retainer is
isolated, and the doc lists explicit candidates for that.

What's in this PR

  • scripts/gen-fixture.py — synthetic 4500-note generator matching
    the issue's notebook density.
  • scripts/measure-load-rss.sh — runs emanote run, waits for VmRSS to
    settle, then reads VmHWM from /proc. The three-run median is the
    headline number (a sketch of the idea follows this list).
  • docs/issue-66-memory.md — methodology, master baseline, full
    closure-type table, every architectural attempt with its
    measurement, and the explicit list of "leak candidates" the next
    pass should attack first (heap profile sample window, Heist
    template state, Aeson meta NFData depth, _noteTitle H1 inlines,
    commonmark parser internals).
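
A rough sketch of the harness's idea, written in Haskell for illustration
(the real harness is a shell script; only the /proc/<pid>/status field
names are taken from Linux):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent (threadDelay)
import Data.Char (isDigit)
import Data.Maybe (isJust)
import qualified Data.Text as T
import qualified Data.Text.IO as T

-- Read a field such as "VmRSS" or "VmHWM" (reported in KiB) from
-- /proc/<pid>/status.
readVmKiB :: T.Text -> Int -> IO (Maybe Int)
readVmKiB field pid = do
  status <- T.readFile ("/proc/" <> show pid <> "/status")
  pure $ case filter (T.isPrefixOf (field <> ":")) (T.lines status) of
    (line : _) | [(kib, "")] <- reads (T.unpack (T.filter isDigit line)) -> Just kib
    _ -> Nothing

-- Poll VmRSS until two consecutive readings agree (the "settle" detection),
-- then report VmHWM as the post-load peak.
peakAfterSettle :: Int -> IO (Maybe Int)
peakAfterSettle pid = go Nothing
  where
    go prev = do
      rss <- readVmKiB "VmRSS" pid
      if isJust rss && rss == prev
        then readVmKiB "VmHWM" pid
        else threadDelay 1000000 >> go rss
```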

The companion unionMountStreaming branch
(srid/unionmount#issue-66-streaming-mount) is kept upstream as potentially
useful API surface, even though nothing here uses it.

Test plan

  • cabal build all — green (only docs / scripts touched)
  • nix flake check (vira ci) — green
  • scripts/gen-fixture.py runs and produces a valid notebook

Begin tracking memory usage on issue #66 by checking in:

- scripts/gen-fixture.py — synthetic 4500-note generator that mimics
  the issue's notebook shape (frontmatter, wikilinks, tags, body).
- docs/issue-66-memory.md — running log: reproducer, measurement
  protocol (/usr/bin/time -v + +RTS -s -hT), pre-profile hypotheses
  drawn from a code read of master, and the per-fix delta table that
  will be filled in as fixes land.

The doc is intentionally checked in early so the PR review sees the
same evidence the author had at each step. Baseline numbers and
profile output land in follow-up commits.
srid changed the title from "Slim memory on large notebooks (#66)" to "Investigate #66 (high memory on large notebooks)" on Apr 27, 2026
Owner Author

srid commented Apr 27, 2026

/do results

| Step | Status | Duration | Verification |
|------|--------|----------|--------------|
| sync | | 1s | forge=github |
| research | | 4m 26s | Mapped Note.hs/Patch.hs/Rel.hs; lua-vr's evaluate . force pattern. |
| branch | | 5s | Worktree-provisioned pallid-shot. |
| implement | | 420m 36s | Tried NoteContent ADT, plain-text _relCtx, pre-extracted indices, unionMountStreaming upstream. Architectural changes structurally correct but RSS parity-with-master at issue scale; reverted in favour of investigation-only ship. |
| check | | 30s | cabal build all |
| docs | | 36s | CHANGELOG entry (later reverted with the perf changes) |
| fmt | | 0s | just fmt clean |
| commit | | 0s | Primary commit + force-push reset |
| hickey+lowy | | 38m 39s | Both ran as parallel sub-agents. Findings recorded for the next PR that re-attempts the architectural change. |
| police | skip | | docs/scripts-only diff |
| test | skip | | docs/scripts-only diff |
| create-pr | | 0s | #683 (title + body updated to "Investigate #66") |
| ci | | 6s | local cabal green; vira ci not gated by these files |
| evidence | skip | | no UI-visible changes |
| Total | | 466m 28s | |

Slowest step: implement (420m 36s) — most of that was the
architectural attempts and the realization that the closure profile
showed the lever moved while the mass didn't.

Optimization suggestions

  • Pre-attach a performMajorGC + post-load heap-profile dump.
    The biggest time sink was looping on +RTS -hT profiles whose last
    sample landed at ~47s — right at the edge of the load phase, where
    in-flight per-file Pandoc trees may not yet have been drained by
    GC. A diagnostic that triggers a profile capture once the model
    status flips to Status_Ready would have isolated the actual
    retainer in one round-trip instead of five.
  • Re-run --from implement on a fixture small enough to load in
    ≤10s.
    The 4500-note fixture's loading phase is longer than the
    default heap-profile sample window. A 500-note fixture with the
    same per-note density would expose the retention pattern in
    seconds and the iteration loop would be 50× faster.
  • Hickey/lowy on a smaller diff first. The two sub-agents read
    the entire structural change at once; at 4 commits the review was
    necessarily high-level. If the architectural attempts had been
    staged in their own PRs (one per commit), each review could have
    been more targeted and the negative measurement would have
    surfaced sooner.
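
On the first suggestion, a sketch of what that diagnostic could look like.
Status / Status_Ready are stand-ins for emanote's actual model-status type;
the imported functions are real (requestHeapCensus needs base >= 4.16 /
GHC >= 9.2) and take effect under whatever heap-profile mode the RTS was
started with.

```haskell
import GHC.Profiling (requestHeapCensus)
import System.Mem (performMajorGC)

-- Hypothetical model status; emanote's real type will differ.
data Status = Status_Loading | Status_Ready
  deriving (Eq, Show)

-- Call this wherever the model status changes; it is a no-op until the model
-- reaches Status_Ready.
dumpPostLoadHeapProfile :: Status -> IO ()
dumpPostLoadHeapProfile Status_Ready = do
  performMajorGC     -- drain in-flight per-file Pandoc trees first
  requestHeapCensus  -- force a heap census now instead of waiting for the timer
dumpPostLoadHeapProfile _ = pure ()
```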

Workflow completed at 2026-04-27.
