fix(measuring-dom): honor w:lineRule=auto per ECMA-376 §17.18.48 (SD-2735)#2932
Conversation
Per ECMA-376 §17.6.17, a <w:sectPr> inside a paragraph defines the section that ENDS with that paragraph. All body children preceding it — paragraphs, tables, top-level drawings, SDTs — belong to that section. Section ranges were indexed purely by paragraph count, and section-break blocks were emitted only inside handleParagraphNode. A table that sat between two sectPr-marker paragraphs was emitted into the flow stream BEFORE the section break that declared its column config, so the layout engine laid it out under the prior section's settings. This is the root cause of IT-945 rendering a 114-row 2-col continuous table in column 0 across three pages with column 1 empty: the table was placed in the 1-col section, not the 2-col section. Fix: - Track nodeIndex over every top-level doc.content child in findParagraphsWithSectPr and SectionRange (alongside paragraphIndex, which SDT handlers still use for intra-SDT transitions). - Add maybeEmitNextSectionBreakForNode in sections/breaks.ts and call it from internal.ts's main dispatch loop BEFORE every top-level handler. Any non-paragraph node crossing a section boundary now triggers the break. - Section-model primer in pm-adapter/README.md with spec citations. Tests: 1739/1739 pass in pm-adapter (including new end-tagged.test.ts and integration test in index.test.ts asserting flow-block order).
…ng section-final page (SD-2646)
The column balancer treats each fragment as an atomic block. A
multi-page two-column continuous section's final page can end up with
a single table fragment taller than totalSectionHeight / columnCount.
The atomic-block binary search then places the whole table in one
column and leaves the other empty — diverging from Word, which
balances by splitting the table at a row boundary per ECMA-376
§17.18.77 ("a continuous section break balances the content of the
previous section").
Fix: add splitDominantTableAtRowBoundary as a preprocessor inside
balanceSectionOnPage. When the section has a single splittable table
fragment larger than target, split it at the row whose cumulative
height first meets or exceeds totalSectionHeight / columnCount. The
two halves are inserted in place of the original; the rest of the
balancer runs unchanged and naturally assigns one to each column.
Also add getBalancingHeight so empty sectPr-marker paragraphs
(measured lines with width=0) contribute 0 to balancing — matching
Word's behavior of not rendering an empty line for such markers.
This keeps both columns top-aligned on the section-final page.
On IT-945: page 2 now splits 14/14 from y=96 in both columns, matching
Word's top-alignment. Before this fix page 2 rendered all 28 remaining
rows in col 1 with col 0 empty.
Tests: strengthened existing "balances the section-ending page" test
(it was passing trivially via `if (sectionFragments.length > 1)`
guard). Added narrow-table multi-page regression test. 616/616 pass.
…8.48 (SD-2735) `resolveLineHeight` treated `w:lineRule="auto"` and `w:lineRule="atLeast"` identically — both took the max-clamp branch that floored the line height at `max(computedHeight, ascent+descent, 1.15×fontSize)`. Per §17.18.48, `auto` has "no predetermined minimum or maximum"; only `atLeast` is max-clamped. Symptom: author intent like `w:line="120" w:lineRule="auto"` (half line spacing) was silently inflated up to the 1.15×fontSize baseline, producing paragraphs that visually render as single-spaced instead of tight-spaced as specified. Fix: branch only on `atLeast` for the max-clamp; `auto` and `exact` pass the resolved target through without a floor. Exact behaviour per spec table: | lineRule | result | |----------|-----------------------------------------------| | auto | target (no clamp) | | atLeast | max(target, naturalSingleLine) | | exact | target (clipped by caller if content taller) | Out of scope in this PR (tracked in SD-2735): 1. `lineUnit='multiplier'` currently means `multiplier × fontSize` per the adapter's convention. Strict §17.18.48 reading would multiply by naturalSingleLine instead. Changing this requires coordinated adapter + test-baseline updates and shifts layout numbers across every document — kept for follow-up. 2. Natural single-line height on the browser side does not match Word's Aptos-intrinsic metrics (browser computes ~17-18 px for 12pt, Word renders ~19.5 px). Closing that gap requires bundling Aptos as a web font so the browser uses Microsoft's actual font metrics — also follow-up. 3. TableGrid's pPr override (`w:line="240"` / single-spacing for table cells) isn't applied to paragraphs inside the table by the pm-adapter / style-engine cascade. Separate investigation. New regression test: `honours auto lineRule with a sub-baseline multiplier` asserts a 0.5 multiplier produces 8 px (not the inflated 18.4) at fontSize=16. All 246 measuring-dom tests pass. Downstream: layout-engine (616/616) and pm-adapter (1739/1739) test suites pass unchanged.
Three coordinated fixes for ECMA-376 §17.18.48 line-height semantics: A) measuring-dom: Aptos/Calibri natural-line calibration JSDOM's Canvas API cannot read Microsoft's Aptos/Calibri font files, so the fallback fontBoundingBox measurement yields ~17px for 12pt — well short of Word's ~19.5px. Adds a per-font calibration table (Aptos/Calibri/Cambria = 1.218x fontSize) populating the new `naturalSingleLine` field on FontMetricsResult. Five new tests cover the calibration table. B) pm-adapter: TableGrid cascade override in table paragraphs TableGrid style in Word's Normal template sets `w:line="240" auto` (1.0 multiplier), but our style cascade inherits docDefaults' 1.15 multiplier instead. When a paragraph inside a table has no explicit `w:spacing`, override the multiplier to 1.0 so table rows render at Word's compact spacing. C) measuring-dom: multiplier × naturalSingleLine (spec-correct) Per §17.18.48, `multiplier × (w:line / 240)` scales the *natural single line height*, not fontSize. Previously we multiplied by fontSize (a coincidentally-workable approximation for fonts where naturalSingle ≈ 1.15 × fontSize). Now we scale by the calibrated naturalSingle. Baseline tests updated to reflect the new arithmetic: e.g. `1.5 × auto` at 16px goes from 24 to 27.6 (1.5 × 1.15 × 16). Tests: - measuring-dom: 251/251 pass (5 new calibration tests) - pm-adapter: 1739/1739 pass - layout-engine: 615/616 pass (1 pre-existing skip) Remaining work (tracked separately): theme-font resolution (`minorHAnsi` → Aptos) must reach the measurement layer for the Aptos calibration to take effect at render-time for documents that use theme fonts. Today the DOM reports `fontFamily: Arial, sans-serif` for IT-945, so calibration is plumbed but does not yet fire for this corpus. The fixes themselves are spec-correct and unit-tested.
23aed51 to
a14dcd1
Compare
- fontMetricsCache.ts: trim JSDoc on FontMetricsResult and the calibration table to the non-obvious bits (why the multiplier table exists; what undefined means for callers). - measuring-dom/index.ts (resolveLineHeight): replace the historical- context prose with a compact rule table + the one non-obvious fact about the fallback floor. Collapse the multiplier-vs-explicit branch. - pm-adapter/converters/table.ts: extract the TableGrid pPr cascade override into a named top-level helper (applyTableGridSpacingCascade) so parseTableCell stays readable. Behavior unchanged. All tests green (measuring-dom 251, pm-adapter 1739, layout-engine 615).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 28bdc8fe03
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| return computedHeight; | ||
| if (!spacing || spacing.line == null) return naturalSingle; | ||
|
|
||
| const target = spacing.lineUnit === 'multiplier' ? spacing.line * naturalSingle : spacing.line; |
There was a problem hiding this comment.
Stop double-applying auto line-spacing multipliers
resolveLineHeight now treats spacing.line as a pure OOXML ratio (line/240) and multiplies it by naturalSingle, but pm-adapter still emits auto spacing values with an extra 1.15 factor baked in (see normalizeLineValue in spacing-indent.ts). This means imported auto spacing is scaled twice: e.g. w:line=360 becomes 1.725, then this code computes 1.725 × 18.4 = 31.7px instead of the expected 27.6px (1.5 lines at 16px base). So explicit auto line spacing in DOCX will render too tall after this change.
Useful? React with 👍 / 👎.
| if (block.kind !== 'paragraph' || !block.attrs) continue; | ||
| const spacing = block.attrs.spacing; | ||
| if (spacing?.lineUnit === 'multiplier' && spacing.lineRule === 'auto') { | ||
| block.attrs.spacing = { ...spacing, line: 1.0 }; |
There was a problem hiding this comment.
Restrict TableGrid override to the actual TableGrid style
This hard-codes spacing.line = 1.0 for every table-cell paragraph that has lineUnit='multiplier' and lineRule='auto', and it is invoked whenever a table has any tableStyleId (not just TableGrid). As a result, non-TableGrid/custom table styles with intentional paragraph line multipliers are silently flattened to 1.0, overriding style-engine output and changing row heights for styled tables.
Useful? React with 👍 / 👎.
… (SD-2735) Per ECMA-376 §17.6.17, a <w:sectPr> inside a paragraph's <w:pPr> defines where the section ends — the enclosing paragraph is a section- properties container, not a visible paragraph. Word renders these marker paragraphs at zero height regardless of section-break type. Previously we only skipped the marker when the next break forced a page boundary (nextPage/evenPage/oddPage). For continuous breaks the marker rendered at full docDefaults height (~35 px body + spacing-after), which on IT-945 cost a row per column on page 1 (the intro area consumed 72 px vs Word's 33 px, losing ~20 px of column budget). Extends the skip to continuous section breaks too. Browser-verified on it-945-numbered.docx: page 1 now balances 42+42 rows, page 2 16+14, matching Word's structural layout (41+41 / 17+15) within a single-row margin. Tests: - New regression: 'skips empty sectPr marker paragraph before continuous section break (SD-2735)' asserts the marker and its full-height measure don't appear in the layout. - Existing forced-page-break test unchanged. - layout-engine 150/150 pass; pm-adapter 1739/1739; measuring-dom 251/251.
|
@luccas-harbour mind reviewing this one pls? |
Summary
Draft. Fixes SD-2735 — line-height spec compliance per ECMA-376 §17.18.48 plus related section-marker paragraph rendering (§17.6.17).
Stacked on top of #2930 (SD-2646).
Commits
fix(measuring-dom): do not max-clamp auto lineRule per ECMA-376 §17.18.48— minimal spec correction.fix(layout-engine): spec-compliant line-height resolution (SD-2735)— Aptos/Calibri calibration, TableGrid cascade, multiplier × naturalSingle.refactor(sd-2735): tighten comments and extract TableGrid cascade helper— behavior-preserving cleanup.fix(layout-engine): skip empty sectPr marker before any section break (SD-2735)— §17.6.17 section-properties-container paragraphs collapse to zero height for all break types (not just forced-page breaks).Four fixes
A) measuring-dom: Aptos / Calibri natural-line calibration
JSDOM's Canvas cannot read Microsoft's Aptos/Calibri font files, so
fontBoundingBoxyields ~17 px at 12 pt — short of Word's ~19.5 px. Per-font calibration table infontMetricsCache.ts:Populates a new
naturalSingleLinefield onFontMetricsResult. Fonts without calibration fall back to canvas metrics.B) pm-adapter: TableGrid cascade override in table paragraphs
Word's Normal template sets
TableGrid.paragraphProperties.spacing = { line: 240, lineRule: 'auto' }(≈ 1.0 multiplier), but our style cascade inherits docDefaults' 1.15 multiplier instead. Extracted asapplyTableGridSpacingCascade(paragraphBlocks, sourceChildNode): when a table-cell paragraph has no explicitw:spacing, forcesspacing.line = 1.0.C) measuring-dom: multiplier × naturalSingleLine (spec-correct)
Per §17.18.48 a multiplier line-spacing value scales the font's natural single line height, not fontSize. Previously
multiplier × fontSizecoincidentally worked for fonts wherenaturalSingle ≈ 1.15 × fontSize, but was wrong for Aptos/Calibri.resolveLineHeightnow scales by the calibratednaturalSingle.D) layout-engine: skip empty sectPr-marker paragraphs for all section break types
Per ECMA-376 §17.6.17 a
<w:sectPr>inside a paragraph's<w:pPr>defines where the section ends — the enclosing paragraph is a section-properties container, not a visible paragraph. We already skipped these markers for forced-page breaks (nextPage/evenPage/oddPage). For continuous breaks the marker rendered at full docDefaults height (~35 px body + spacing-after), costing a row per column on page 1 of IT-945. Extended the skip to cover continuous breaks too.Pre-fix vs post-fix behaviour
lineRule="auto"+ multipliermax(mult × fontSize, 1.15×fontSize)mult × naturalSingle(no min-clamp)lineRule="atLeast"+ multipliermax(mult × fontSize, 1.15×fontSize)max(mult × naturalSingle, naturalSingle)lineRule="exact"+ multipliermult × fontSizemult × naturalSingleauto(e.g. 0.5×)0.5 × naturalSinglesectPr-marker paragraph, continuous breaktblStyleline=240(1.0) applied when no explicitw:spacingBrowser-verified on IT-945-numbered.docx
Structural parity within ±1 row per column; same 2-page layout; column-balance ratio matches to 0.2%.
Test plan
pnpm --filter @superdoc/measuring-dom test→ 251/251 pass.pnpm --filter @superdoc/pm-adapter test→ 1739/1739 pass.bun test ./packages/layout-engine/layout-engine/src/→ 150/151 pass (1 pre-existing skip).skips empty sectPr marker paragraph before continuous section break (SD-2735).fontMetricsCache.test.ts.resolveLineHeightbaseline tests updated for multiplier × naturalSingleLine.it-945-numbered.docx.Why still draft
Open for review on the spec reading (§17.18.48 + §17.6.17), the TableGrid override strategy (currently lives in pm-adapter; could migrate to style-engine), and the sectPr-marker skip extension.