Skip to content

fix(measuring-dom): honor w:lineRule=auto per ECMA-376 §17.18.48 (SD-2735)#2932

Open
tupizz wants to merge 6 commits intotadeu/sd-2452-feature-implement-column-balancing-for-continuous-sectionfrom
tadeu/sd-2735-measure-line-height-spec-compliance
Open

fix(measuring-dom): honor w:lineRule=auto per ECMA-376 §17.18.48 (SD-2735)#2932
tupizz wants to merge 6 commits intotadeu/sd-2452-feature-implement-column-balancing-for-continuous-sectionfrom
tadeu/sd-2735-measure-line-height-spec-compliance

Conversation

@tupizz
Copy link
Copy Markdown
Contributor

@tupizz tupizz commented Apr 24, 2026

Summary

Draft. Fixes SD-2735 — line-height spec compliance per ECMA-376 §17.18.48 plus related section-marker paragraph rendering (§17.6.17).

Stacked on top of #2930 (SD-2646).

Commits

  1. fix(measuring-dom): do not max-clamp auto lineRule per ECMA-376 §17.18.48 — minimal spec correction.
  2. fix(layout-engine): spec-compliant line-height resolution (SD-2735) — Aptos/Calibri calibration, TableGrid cascade, multiplier × naturalSingle.
  3. refactor(sd-2735): tighten comments and extract TableGrid cascade helper — behavior-preserving cleanup.
  4. fix(layout-engine): skip empty sectPr marker before any section break (SD-2735) — §17.6.17 section-properties-container paragraphs collapse to zero height for all break types (not just forced-page breaks).

Four fixes

A) measuring-dom: Aptos / Calibri natural-line calibration

JSDOM's Canvas cannot read Microsoft's Aptos/Calibri font files, so fontBoundingBox yields ~17 px at 12 pt — short of Word's ~19.5 px. Per-font calibration table in fontMetricsCache.ts:

Font Multiplier
Aptos, Aptos Display 1.218
Calibri, Calibri Light 1.219
Cambria 1.219

Populates a new naturalSingleLine field on FontMetricsResult. Fonts without calibration fall back to canvas metrics.

B) pm-adapter: TableGrid cascade override in table paragraphs

Word's Normal template sets TableGrid.paragraphProperties.spacing = { line: 240, lineRule: 'auto' } (≈ 1.0 multiplier), but our style cascade inherits docDefaults' 1.15 multiplier instead. Extracted as applyTableGridSpacingCascade(paragraphBlocks, sourceChildNode): when a table-cell paragraph has no explicit w:spacing, forces spacing.line = 1.0.

C) measuring-dom: multiplier × naturalSingleLine (spec-correct)

Per §17.18.48 a multiplier line-spacing value scales the font's natural single line height, not fontSize. Previously multiplier × fontSize coincidentally worked for fonts where naturalSingle ≈ 1.15 × fontSize, but was wrong for Aptos/Calibri. resolveLineHeight now scales by the calibrated naturalSingle.

D) layout-engine: skip empty sectPr-marker paragraphs for all section break types

Per ECMA-376 §17.6.17 a <w:sectPr> inside a paragraph's <w:pPr> defines where the section ends — the enclosing paragraph is a section-properties container, not a visible paragraph. We already skipped these markers for forced-page breaks (nextPage/evenPage/oddPage). For continuous breaks the marker rendered at full docDefaults height (~35 px body + spacing-after), costing a row per column on page 1 of IT-945. Extended the skip to cover continuous breaks too.

Pre-fix vs post-fix behaviour

Scenario Pre-fix Post-fix
lineRule="auto" + multiplier max(mult × fontSize, 1.15×fontSize) mult × naturalSingle (no min-clamp)
lineRule="atLeast" + multiplier max(mult × fontSize, 1.15×fontSize) max(mult × naturalSingle, naturalSingle)
lineRule="exact" + multiplier mult × fontSize mult × naturalSingle
Sub-baseline auto (e.g. 0.5×) silently floored at 1.15× renders as 0.5 × naturalSingle
sectPr-marker paragraph, continuous break rendered at docDefaults height (~35 px) skipped (zero height), matches Word
Table-cell paragraph under tblStyle inherits docDefaults 1.15 multiplier TableGrid line=240 (1.0) applied when no explicit w:spacing

Browser-verified on IT-945-numbered.docx

Word SuperDoc (this PR)
Row height ~19.5 px 19.49 px
Page 1 rows per column 41 / 41 42 / 42
Page 2 rows per column 17 / 15 16 / 14
Total row count 114 114
Column balance ratio page 2 53.1 / 46.9 53.3 / 46.7

Structural parity within ±1 row per column; same 2-page layout; column-balance ratio matches to 0.2%.

Test plan

  • pnpm --filter @superdoc/measuring-dom test251/251 pass.
  • pnpm --filter @superdoc/pm-adapter test1739/1739 pass.
  • bun test ./packages/layout-engine/layout-engine/src/150/151 pass (1 pre-existing skip).
  • New regression: skips empty sectPr marker paragraph before continuous section break (SD-2735).
  • Existing forced-page-break marker test still passes.
  • Five calibration tests in fontMetricsCache.test.ts.
  • resolveLineHeight baseline tests updated for multiplier × naturalSingleLine.
  • Browser-verified on it-945-numbered.docx.

Why still draft

Open for review on the spec reading (§17.18.48 + §17.6.17), the TableGrid override strategy (currently lives in pm-adapter; could migrate to style-engine), and the sectPr-marker skip extension.

tupizz added 3 commits April 23, 2026 21:14
Per ECMA-376 §17.6.17, a <w:sectPr> inside a paragraph defines the section
that ENDS with that paragraph. All body children preceding it — paragraphs,
tables, top-level drawings, SDTs — belong to that section.

Section ranges were indexed purely by paragraph count, and section-break
blocks were emitted only inside handleParagraphNode. A table that sat
between two sectPr-marker paragraphs was emitted into the flow stream
BEFORE the section break that declared its column config, so the layout
engine laid it out under the prior section's settings.

This is the root cause of IT-945 rendering a 114-row 2-col continuous
table in column 0 across three pages with column 1 empty: the table was
placed in the 1-col section, not the 2-col section.

Fix:
- Track nodeIndex over every top-level doc.content child in
  findParagraphsWithSectPr and SectionRange (alongside paragraphIndex,
  which SDT handlers still use for intra-SDT transitions).
- Add maybeEmitNextSectionBreakForNode in sections/breaks.ts and call
  it from internal.ts's main dispatch loop BEFORE every top-level
  handler. Any non-paragraph node crossing a section boundary now
  triggers the break.
- Section-model primer in pm-adapter/README.md with spec citations.

Tests: 1739/1739 pass in pm-adapter (including new end-tagged.test.ts
and integration test in index.test.ts asserting flow-block order).
…ng section-final page (SD-2646)

The column balancer treats each fragment as an atomic block. A
multi-page two-column continuous section's final page can end up with
a single table fragment taller than totalSectionHeight / columnCount.
The atomic-block binary search then places the whole table in one
column and leaves the other empty — diverging from Word, which
balances by splitting the table at a row boundary per ECMA-376
§17.18.77 ("a continuous section break balances the content of the
previous section").

Fix: add splitDominantTableAtRowBoundary as a preprocessor inside
balanceSectionOnPage. When the section has a single splittable table
fragment larger than target, split it at the row whose cumulative
height first meets or exceeds totalSectionHeight / columnCount. The
two halves are inserted in place of the original; the rest of the
balancer runs unchanged and naturally assigns one to each column.

Also add getBalancingHeight so empty sectPr-marker paragraphs
(measured lines with width=0) contribute 0 to balancing — matching
Word's behavior of not rendering an empty line for such markers.
This keeps both columns top-aligned on the section-final page.

On IT-945: page 2 now splits 14/14 from y=96 in both columns, matching
Word's top-alignment. Before this fix page 2 rendered all 28 remaining
rows in col 1 with col 0 empty.

Tests: strengthened existing "balances the section-ending page" test
(it was passing trivially via `if (sectionFragments.length > 1)`
guard). Added narrow-table multi-page regression test. 616/616 pass.
…8.48 (SD-2735)

`resolveLineHeight` treated `w:lineRule="auto"` and `w:lineRule="atLeast"`
identically — both took the max-clamp branch that floored the line height
at `max(computedHeight, ascent+descent, 1.15×fontSize)`. Per §17.18.48,
`auto` has "no predetermined minimum or maximum"; only `atLeast` is
max-clamped.

Symptom: author intent like `w:line="120" w:lineRule="auto"` (half line
spacing) was silently inflated up to the 1.15×fontSize baseline,
producing paragraphs that visually render as single-spaced instead of
tight-spaced as specified.

Fix: branch only on `atLeast` for the max-clamp; `auto` and `exact`
pass the resolved target through without a floor. Exact behaviour per
spec table:

  | lineRule | result                                        |
  |----------|-----------------------------------------------|
  | auto     | target (no clamp)                             |
  | atLeast  | max(target, naturalSingleLine)                |
  | exact    | target (clipped by caller if content taller)  |

Out of scope in this PR (tracked in SD-2735):

1. `lineUnit='multiplier'` currently means `multiplier × fontSize` per
   the adapter's convention. Strict §17.18.48 reading would multiply
   by naturalSingleLine instead. Changing this requires coordinated
   adapter + test-baseline updates and shifts layout numbers across
   every document — kept for follow-up.

2. Natural single-line height on the browser side does not match
   Word's Aptos-intrinsic metrics (browser computes ~17-18 px for
   12pt, Word renders ~19.5 px). Closing that gap requires bundling
   Aptos as a web font so the browser uses Microsoft's actual font
   metrics — also follow-up.

3. TableGrid's pPr override (`w:line="240"` / single-spacing for table
   cells) isn't applied to paragraphs inside the table by the
   pm-adapter / style-engine cascade. Separate investigation.

New regression test: `honours auto lineRule with a sub-baseline
multiplier` asserts a 0.5 multiplier produces 8 px (not the inflated
18.4) at fontSize=16. All 246 measuring-dom tests pass.

Downstream: layout-engine (616/616) and pm-adapter (1739/1739) test
suites pass unchanged.
@linear
Copy link
Copy Markdown

linear Bot commented Apr 24, 2026

Three coordinated fixes for ECMA-376 §17.18.48 line-height semantics:

A) measuring-dom: Aptos/Calibri natural-line calibration

JSDOM's Canvas API cannot read Microsoft's Aptos/Calibri font files, so
the fallback fontBoundingBox measurement yields ~17px for 12pt — well
short of Word's ~19.5px. Adds a per-font calibration table
(Aptos/Calibri/Cambria = 1.218x fontSize) populating the new
`naturalSingleLine` field on FontMetricsResult. Five new tests cover
the calibration table.

B) pm-adapter: TableGrid cascade override in table paragraphs

TableGrid style in Word's Normal template sets `w:line="240" auto`
(1.0 multiplier), but our style cascade inherits docDefaults' 1.15
multiplier instead. When a paragraph inside a table has no explicit
`w:spacing`, override the multiplier to 1.0 so table rows render at
Word's compact spacing.

C) measuring-dom: multiplier × naturalSingleLine (spec-correct)

Per §17.18.48, `multiplier × (w:line / 240)` scales the *natural single
line height*, not fontSize. Previously we multiplied by fontSize (a
coincidentally-workable approximation for fonts where
naturalSingle ≈ 1.15 × fontSize). Now we scale by the calibrated
naturalSingle. Baseline tests updated to reflect the new arithmetic:
e.g. `1.5 × auto` at 16px goes from 24 to 27.6 (1.5 × 1.15 × 16).

Tests:
- measuring-dom: 251/251 pass (5 new calibration tests)
- pm-adapter:   1739/1739 pass
- layout-engine: 615/616 pass (1 pre-existing skip)

Remaining work (tracked separately): theme-font resolution
(`minorHAnsi` → Aptos) must reach the measurement layer for the Aptos
calibration to take effect at render-time for documents that use theme
fonts. Today the DOM reports `fontFamily: Arial, sans-serif` for
IT-945, so calibration is plumbed but does not yet fire for this
corpus. The fixes themselves are spec-correct and unit-tested.
@tupizz tupizz requested a review from harbournick April 24, 2026 01:19
@tupizz tupizz force-pushed the tadeu/sd-2735-measure-line-height-spec-compliance branch from 23aed51 to a14dcd1 Compare April 24, 2026 01:44
- fontMetricsCache.ts: trim JSDoc on FontMetricsResult and the
  calibration table to the non-obvious bits (why the multiplier table
  exists; what undefined means for callers).
- measuring-dom/index.ts (resolveLineHeight): replace the historical-
  context prose with a compact rule table + the one non-obvious fact
  about the fallback floor. Collapse the multiplier-vs-explicit branch.
- pm-adapter/converters/table.ts: extract the TableGrid pPr cascade
  override into a named top-level helper (applyTableGridSpacingCascade)
  so parseTableCell stays readable.

Behavior unchanged. All tests green (measuring-dom 251, pm-adapter 1739,
layout-engine 615).
@tupizz tupizz marked this pull request as ready for review April 24, 2026 01:50
@tupizz tupizz self-assigned this Apr 24, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 28bdc8fe03

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

return computedHeight;
if (!spacing || spacing.line == null) return naturalSingle;

const target = spacing.lineUnit === 'multiplier' ? spacing.line * naturalSingle : spacing.line;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Stop double-applying auto line-spacing multipliers

resolveLineHeight now treats spacing.line as a pure OOXML ratio (line/240) and multiplies it by naturalSingle, but pm-adapter still emits auto spacing values with an extra 1.15 factor baked in (see normalizeLineValue in spacing-indent.ts). This means imported auto spacing is scaled twice: e.g. w:line=360 becomes 1.725, then this code computes 1.725 × 18.4 = 31.7px instead of the expected 27.6px (1.5 lines at 16px base). So explicit auto line spacing in DOCX will render too tall after this change.

Useful? React with 👍 / 👎.

if (block.kind !== 'paragraph' || !block.attrs) continue;
const spacing = block.attrs.spacing;
if (spacing?.lineUnit === 'multiplier' && spacing.lineRule === 'auto') {
block.attrs.spacing = { ...spacing, line: 1.0 };
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict TableGrid override to the actual TableGrid style

This hard-codes spacing.line = 1.0 for every table-cell paragraph that has lineUnit='multiplier' and lineRule='auto', and it is invoked whenever a table has any tableStyleId (not just TableGrid). As a result, non-TableGrid/custom table styles with intentional paragraph line multipliers are silently flattened to 1.0, overriding style-engine output and changing row heights for styled tables.

Useful? React with 👍 / 👎.

… (SD-2735)

Per ECMA-376 §17.6.17, a <w:sectPr> inside a paragraph's <w:pPr>
defines where the section ends — the enclosing paragraph is a section-
properties container, not a visible paragraph. Word renders these
marker paragraphs at zero height regardless of section-break type.

Previously we only skipped the marker when the next break forced a page
boundary (nextPage/evenPage/oddPage). For continuous breaks the marker
rendered at full docDefaults height (~35 px body + spacing-after),
which on IT-945 cost a row per column on page 1 (the intro area
consumed 72 px vs Word's 33 px, losing ~20 px of column budget).

Extends the skip to continuous section breaks too.

Browser-verified on it-945-numbered.docx: page 1 now balances
42+42 rows, page 2 16+14, matching Word's structural layout (41+41 /
17+15) within a single-row margin.

Tests:
- New regression: 'skips empty sectPr marker paragraph before continuous
  section break (SD-2735)' asserts the marker and its full-height
  measure don't appear in the layout.
- Existing forced-page-break test unchanged.
- layout-engine 150/150 pass; pm-adapter 1739/1739; measuring-dom 251/251.
@harbournick
Copy link
Copy Markdown
Collaborator

@luccas-harbour mind reviewing this one pls?

Base automatically changed from tadeu/sd-2646-balance-earlier-pages-of-multi-page-2col-section to tadeu/sd-2452-feature-implement-column-balancing-for-continuous-section April 30, 2026 18:07
@harbournick harbournick requested a review from a team as a code owner April 30, 2026 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants