148 changes: 148 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,148 @@
# Changes — fix/frontmatter-and-pdf-layout

## Summary

Five changes to `lovstudio-any2pdf/scripts/md2pdf.py`: fixes for a blank page
after the TOC, a lagging header title, collapsed spaces in code blocks, and
oversized chapter spacing, plus a new frontmatter feature that makes markdown
files self-contained.

---

## 1. YAML Frontmatter Support (new feature)

**Problem:** All document metadata (title, author, theme, etc.) had to be
passed as CLI arguments. The markdown file itself carried no metadata, making
it hard to reproduce a build without remembering the exact command.

**Fix:** The script now parses a YAML-style frontmatter block at the top of
the markdown file:

```markdown
---
title: My Document
subtitle: Version 1.0 · Platform: Linux
author: Acme Corp
footer-left: Acme Corp
copyright: © Acme Corp
theme: ieee-journal
watermark: DRAFT
---

## Chapter 1
...
```

All CLI parameters except `--input`, `--output`, and `--theme-file` are
supported as frontmatter keys (using the same names as the `--` flags, e.g.
`footer-left`, `code-max-lines`). **CLI arguments always take precedence** over
frontmatter values, so existing workflows are unaffected.

With frontmatter, the minimal invocation becomes:

```bash
python md2pdf.py --input report.md --output report.pdf
```
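
The precedence rule can be sketched as follows. This is a minimal illustration, not the code in `md2pdf.py`: `split_frontmatter` and `merge_config` are hypothetical names, and the parser handles only flat `key: value` lines, which is an assumption about what "YAML-style" means here.

```python
def split_frontmatter(text):
    """Return (metadata dict, body) for text that may start with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # no closing delimiter: treat as plain markdown
    meta = {}
    for line in lines[1:end]:
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def merge_config(defaults, frontmatter, cli_args):
    """CLI values win over frontmatter, which wins over defaults."""
    cfg = dict(defaults)
    cfg.update(frontmatter)
    # Only CLI options the user actually passed (not None) override.
    cfg.update({k: v for k, v in cli_args.items() if v is not None})
    return cfg


doc = "---\ntitle: My Document\ntheme: ieee-journal\n---\n\n## Chapter 1\n"
meta, body = split_frontmatter(doc)
cfg = merge_config({"theme": "warm-academic"}, meta,
                   {"theme": None, "title": "Override"})
print(cfg)
```

Treating an unset CLI option as `None` is one way to tell "not passed" apart from "passed with the default value"; the actual script may distinguish these differently.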

---

## 2. Blank Page After TOC (bug fix)

**Problem:** When a markdown file contains a `# Title` heading and a cover
page is generated via `--title` (or frontmatter), the H1 heading produces a
full chapter-divider page (PageBreak + large Spacer + title Paragraph +
decorations). This page appeared between the TOC and the first `##` chapter,
creating a blank-looking page.

**Fix:** When a cover title is provided, strip all visual elements generated
by the H1 heading (Spacer, title Paragraph, decoration flowables, and the
following H2's leading PageBreak). The H1 `ChapterMark` flowable is **kept**
so TOC anchor links remain valid.

The stripping only triggers when:
- `cover=True` (default), AND
- The first `ChapterMark` in `story_content` is level 0 (H1)

Documents without a `# Title` heading are unaffected.
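
The stripping condition above can be sketched roughly as follows. This is a simplified model under stated assumptions: `Item` stands in for reportlab flowables, the kind names and the order in which the H1 emits them are guesses, and `strip_h1_divider` is a hypothetical helper rather than the function in `md2pdf.py`.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str        # e.g. "PageBreak", "ChapterMark", "Spacer"
    level: int = -1  # only meaningful for ChapterMark (0 = H1, 1 = H2)


# Kinds assumed to be the purely visual output of an H1 divider page.
VISUAL = {"Spacer", "TitleParagraph", "Decoration"}


def strip_h1_divider(story, cover=True):
    """Drop the H1 divider visuals but keep its ChapterMark for TOC links."""
    marks = [i for i, it in enumerate(story) if it.kind == "ChapterMark"]
    if not cover or not marks or story[marks[0]].level != 0:
        return list(story)  # no cover, or the document has no leading H1
    i = marks[0] + 1
    while i < len(story) and story[i].kind in VISUAL:
        i += 1
    # Also drop the leading PageBreak of the H2 that follows.
    if i < len(story) and story[i].kind == "PageBreak":
        i += 1
    return story[:marks[0] + 1] + story[i:]


story = [
    Item("PageBreak"), Item("ChapterMark", 0), Item("Spacer"),
    Item("TitleParagraph"), Item("Decoration"),
    Item("PageBreak"), Item("ChapterMark", 1), Item("Paragraph"),
]
print([it.kind for it in strip_h1_divider(story)])
```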

---

## 3. Header Title One-Page Lag (bug fix)

**Problem:** The right side of the page header displayed the current chapter
name via `_cur_chapter[0]`. Because reportlab's `onPage` callback fires
*before* the page's flowables are rendered, `_cur_chapter[0]` always held the
*previous* page's chapter — causing a one-page lag (e.g. page 3 showed
"Chapter 1" while its content was already "Chapter 2").

**Affected locations:**
- `_draw_page_decoration()` — `top-band` style (used by `ieee-journal`)
- `_normal_page()` — `full` header style

**Fix:** Replace `_cur_chapter[0]` with the fixed document title
(`self.cfg.get("title", "")`). The header now consistently shows the document
title on every page, which is the correct behaviour for a technical manual or
report. Dynamic chapter tracking would require a two-pass build to resolve
correctly; the fixed title is the pragmatic solution.
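
The lag can be reproduced with a toy simulation that does not use reportlab at all. It assumes only the ordering described above (the page callback runs before that page's flowables); `build_headers` and the page contents are made up for illustration.

```python
def build_headers(pages, doc_title=None):
    """Return the header text drawn on each page.

    pages: list of pages, each a list of strings; "## X" starts chapter X.
    doc_title: if given, the header uses it instead of the chapter tracker.
    """
    cur_chapter = [""]
    headers = []
    for flowables in pages:
        # The page callback fires first, before any flowable on this page.
        headers.append(doc_title if doc_title is not None else cur_chapter[0])
        for f in flowables:
            if f.startswith("## "):
                cur_chapter[0] = f[3:]  # updated too late for this page's header
    return headers


pages = [
    ["## Chapter 1", "text"],
    ["more text"],
    ["## Chapter 2", "text"],
]
print(build_headers(pages))                     # dynamic chapter header
print(build_headers(pages, doc_title="Manual")) # fixed document title
```

In the dynamic case the third page still reports "Chapter 1" even though it opens "Chapter 2", which is the lag described in the problem statement; the fixed title sidesteps the ordering issue entirely.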

---

## 4. Code Block Mid-Line Space Collapsing (bug fix)

**Problem:** `esc_code()` only converted *leading* spaces to `&nbsp;`. Spaces
in the middle of a line (e.g. padded columns in ASCII diagrams, table output,
or aligned assignments) were left as regular HTML spaces and collapsed to a
single space by reportlab's Paragraph renderer.

**Before:**
```python
stripped = e.lstrip(' ')
indent = len(e) - len(stripped)
out.append('&nbsp;' * indent + stripped)
```

**After:**
```python
out.append(e.replace(' ', '&nbsp;'))
```

This preserves both indentation and mid-line alignment for all code blocks.
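
For context, a self-contained version of the idea might look like the sketch below. It is not the full `esc_code()` from the script: the HTML escaping step and the `<br/>` line joiner are assumptions about the surrounding code, chosen because reportlab Paragraph markup treats `<`, `>` and `&` specially.

```python
import html


def esc_code(text):
    """Escape a code block for Paragraph markup, preserving every space."""
    out = []
    for line in text.split("\n"):
        # Escape markup characters first so the '&' in '&nbsp;' is not
        # escaped a second time.
        e = html.escape(line, quote=False)
        out.append(e.replace(" ", "&nbsp;"))
    return "<br/>".join(out)


print(esc_code("a  = 1\n  b <c"))
```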

---

## 5. H2 Chapter Spacer Reduction (style fix)

**Problem:** Each `##` heading inserted `Spacer(1, self.body_h * 0.30)`, i.e.
30% of the body height (~74 mm on A4). This is appropriate for book-style
chapter openers but creates excessive whitespace in technical manuals and
reports where chapters are short and numerous.

**Fix:** Changed to a fixed `Spacer(1, 8*mm)`, which provides a clean visual
break without wasting nearly a third of the page.
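
As a quick check of the numbers: the ~74 mm figure corresponds to a body height of about 247 mm, i.e. A4's 297 mm minus roughly 50 mm of vertical margins. The margin total below is inferred from that figure, not read from the script.

```python
A4_HEIGHT_MM = 297.0
VERTICAL_MARGINS_MM = 50.0  # assumed top + bottom margins, inferred from ~74 mm

body_h_mm = A4_HEIGHT_MM - VERTICAL_MARGINS_MM
old_spacer_mm = body_h_mm * 0.30  # previous behaviour: 30% of the body height
new_spacer_mm = 8.0               # new behaviour: fixed 8 mm
saved_mm = old_spacer_mm - new_spacer_mm

print(f"old: {old_spacer_mm:.1f} mm, new: {new_spacer_mm:.1f} mm, "
      f"saved per heading: {saved_mm:.1f} mm")
```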

---

## Usage Example

```markdown
---
title: xxx manual
subtitle: Version 0.1.0 · Platform: Linux
author: Acme Biotech Ltd.
footer-left: Acme Biotech Ltd.
copyright: © Acme Biotech Ltd.
theme: ieee-journal
---

## 1. Overview
...
```

```bash
# With uv (recommended — no pip install needed, isolated env)
uv run --with reportlab /path/to/md2pdf.py --input report.md --output report.pdf

# With pip
python md2pdf.py --input report.md --output report.pdf

# CLI args override frontmatter
python md2pdf.py --input manual.md --output manual.pdf --theme warm-academic
```
166 changes: 115 additions & 51 deletions lovstudio-any2pdf/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ compatibility: >
Linux: uses Carlito, Liberation Serif, Droid Sans Fallback, DejaVu Sans Mono.
metadata:
author: lovstudio
version: "1.0.0"
version: "1.1.0"
tags: markdown pdf cjk reportlab typesetting
---

Expand All @@ -39,25 +39,48 @@ get wrong.

## Quick Start

The recommended approach is to embed all metadata in the markdown file via YAML frontmatter, then run with a minimal command:

```markdown
---
title: My Report
author: Author Name
theme: warm-academic
---

# My Report
...
```

```bash
python md2pdf/scripts/md2pdf.py \
--input report.md \
--output report.pdf \
--title "My Report" \
--author "Author Name" \
--theme warm-academic
# Preferred: uv (isolated, no side effects on project env)
uv run --with reportlab lovstudio-any2pdf/scripts/md2pdf.py \
--input report.md --output report.pdf

# Fallback: pip
pip install reportlab --break-system-packages
python lovstudio-any2pdf/scripts/md2pdf.py --input report.md --output report.pdf
```

All parameters except `--input` are optional — sensible defaults are applied.

## Pre-Conversion Options (MANDATORY)
## Pre-Conversion Workflow (MANDATORY)

### Step 1 — Read and Inspect the Markdown File

**IMPORTANT: You MUST use the `AskUserQuestion` tool to ask these questions BEFORE
running the conversion. Do NOT list options as plain text — use the tool so the user
gets a proper interactive prompt. Ask all options in a SINGLE `AskUserQuestion` call.**
Before asking any questions, read the user's markdown file and check:

Use `AskUserQuestion` with the following template. The tone should be friendly and
concise — like a design assistant, not a config form:
- **Frontmatter**: Does it already have a `--- ... ---` block? If yes, note which keys are already set and skip asking for those.
- **Title**: Is there a `# H1` heading? If yes, it will be used as the document title automatically.
- **Structure**: Are headings well-formed (`##`, `###`)? Are there merged headings like `# Foo## Bar` on one line? (The preprocessor handles these, but worth noting.)
- **Content hints**: Does the content suggest a particular theme (e.g. academic paper → `classic-thesis`, Chinese report → `chinese-red`, code-heavy → `github-light`)?

Report a brief summary to the user, e.g.:
> Document read: 8 chapters, detected title "xxx manual", no frontmatter. Suggested theme: `ieee-journal` (technical-manual style).

### Step 2 — Ask Design Options

**IMPORTANT: Use the `AskUserQuestion` tool for this step.** Ask all options in a SINGLE call. Skip any options already covered by existing frontmatter.

```
Let's convert this to PDF! First, a few options to confirm 👇
Expand Down Expand Up @@ -92,21 +115,35 @@ concise — like a design assistant, not a config form:
Just answer in plain language, no need to remember the option letters 😄
```

### Mapping User Choices to CLI Args
### Step 3 — Write Frontmatter into the Markdown File

After collecting user choices, **edit the markdown file directly** to prepend a frontmatter block (or update the existing one). Do NOT pass options as CLI args — frontmatter keeps the document self-contained and reproducible.

| Choice | CLI arg |
|--------|---------|
| Design style a-j | `--theme` with value from table below |
| Frontispiece local | `--frontispiece <path>` |
| Frontispiece AI | Generate image first, then `--frontispiece /tmp/frontispiece.png` |
| Watermark text | `--watermark "文字"` |
| Back cover image | `--banner <path>` |
| Back cover text | `--disclaimer "声明"` and/or `--copyright "© 信息"` |
Example frontmatter to write:

```markdown
---
title: xxx manual
author: Acme Biotech Ltd.
footer-left: Acme Biotech Ltd.
theme: ieee-journal
watermark: DRAFT
frontispiece: /tmp/frontispiece.png
copyright: © Acme Biotech Ltd.
---
```

Then run the minimal command:

```bash
uv run --with reportlab /path/to/lovstudio-any2pdf/scripts/md2pdf.py \
--input report.md --output report.pdf
```

### Theme Name Mapping

| Choice | `--theme` value | Inspiration |
|--------|----------------|-------------|
| Choice | `theme` value | Inspiration |
|--------|--------------|-------------|
| a) Warm Academic | `warm-academic` | Lovstudio design system |
| b) Classic Thesis | `classic-thesis` | LaTeX classicthesis |
| c) Tufte | `tufte` | Edward Tufte's books |
Expand All @@ -122,7 +159,7 @@ concise — like a design assistant, not a config form:

If user chose AI generation: read the document title + first paragraphs, use an
image generation tool to create a themed illustration matching the chosen design
style, show for approval, then pass via `--frontispiece /path/to/image.png`
style, show for approval, then add `frontispiece: /path/to/image.png` to frontmatter.

## Architecture

Expand All @@ -134,7 +171,7 @@ Key components:
1. **Font system**: Palatino (Latin body), Songti SC (CJK body), Menlo (code) on macOS; auto-fallback on Linux
2. **CJK wrapper**: `_font_wrap()` wraps CJK character runs in `<font>` tags for automatic font switching
3. **Mixed text renderer**: `_draw_mixed()` handles CJK/Latin mixed text on canvas (cover, headers, footers)
4. **Code block handler**: `esc_code()` preserves indentation and line breaks in reportlab Paragraphs
4. **Code block handler**: `esc_code()` preserves indentation, mid-line alignment, and line breaks in reportlab Paragraphs (all spaces → `&nbsp;`)
5. **Smart table widths**: Proportional column widths based on content length, with 18mm minimum
6. **Bookmark system**: `ChapterMark` flowable creates PDF sidebar bookmarks + named anchors
7. **Heading preprocessor**: `_preprocess_md()` splits merged headings like `# Part## Chapter` into separate lines
Expand Down Expand Up @@ -162,33 +199,52 @@ Default reportlab breaks lines only at spaces, causing ugly splits like "Claude\
`drawString()` / `drawCentredString()` with a Latin font can't render 年/月/日 etc.
**Fix**: Use `_draw_mixed()` for ALL user-content canvas text (dates, stats, disclaimers).

## Frontmatter Support

All parameters (except `--input`, `--output`, `--theme-file`) can be set directly in the markdown file via YAML frontmatter. CLI args always take precedence over frontmatter values.

```markdown
---
title: My Report
author: Jane Doe
date: 2026-04-17
theme: nord-frost
cover: true
toc: true
watermark: DRAFT
---

# My Report
...
```

## Configuration Reference

| Argument | Default | Description |
|----------|---------|-------------|
| `--input` | (required) | Path to markdown file |
| `--output` | `output.pdf` | Output PDF path |
| `--title` | From first H1 | Document title for cover page |
| `--subtitle` | `""` | Subtitle text |
| `--author` | `""` | Author name |
| `--date` | Today | Date string |
| `--version` | `""` | Version string for cover |
| `--watermark` | `""` | Watermark text (empty = none) |
| `--theme` | `warm-academic` | Color theme name |
| `--theme-file` | `""` | Custom theme JSON file path |
| `--cover` | `true` | Generate cover page |
| `--toc` | `true` | Generate table of contents |
| `--page-size` | `A4` | Page size (A4 or Letter) |
| `--frontispiece` | `""` | Full-page image after cover |
| `--banner` | `""` | Back cover banner image |
| `--header-title` | `""` | Report title in page header |
| `--footer-left` | author | Brand/author in footer |
| `--stats-line` | `""` | Stats on cover |
| `--stats-line2` | `""` | Second stats line |
| `--edition-line` | `""` | Edition line at cover bottom |
| `--disclaimer` | `""` | Back cover disclaimer |
| `--copyright` | `""` | Back cover copyright |
| `--code-max-lines` | `30` | Max lines per code block |
| Argument | Frontmatter Key | Default | Description |
|----------|----------------|---------|-------------|
| `--input` | — | (required) | Path to markdown file |
| `--output` | — | `output.pdf` | Output PDF path |
| `--title` | `title` | From first H1 | Document title for cover page |
| `--subtitle` | `subtitle` | `""` | Subtitle text |
| `--author` | `author` | `""` | Author name |
| `--date` | `date` | Today | Date string |
| `--version` | `version` | `""` | Version string for cover |
| `--watermark` | `watermark` | `""` | Watermark text (empty = none) |
| `--theme` | `theme` | `warm-academic` | Color theme name |
| `--theme-file` | — | `""` | Custom theme JSON file path |
| `--cover` | `cover` | `true` | Generate cover page |
| `--toc` | `toc` | `true` | Generate table of contents |
| `--page-size` | `page-size` | `A4` | Page size (A4 or Letter) |
| `--frontispiece` | `frontispiece` | `""` | Full-page image after cover |
| `--banner` | `banner` | `""` | Back cover banner image |
| `--header-title` | `header-title` | `""` | Report title in page header |
| `--footer-left` | `footer-left` | author | Brand/author in footer |
| `--stats-line` | `stats-line` | `""` | Stats on cover |
| `--stats-line2` | `stats-line2` | `""` | Second stats line |
| `--edition-line` | `edition-line` | `""` | Edition line at cover bottom |
| `--disclaimer` | `disclaimer` | `""` | Back cover disclaimer |
| `--copyright` | `copyright` | `""` | Back cover copyright |
| `--code-max-lines` | `code-max-lines` | `30` | Max lines per code block |

## Themes

Expand All @@ -199,6 +255,14 @@ Each theme defines: page background, ink color, accent color, faded text, border

## Dependencies

If `uv` is available, no installation is needed — it creates an isolated ephemeral environment on the fly:

```bash
uv run --with reportlab /path/to/lovstudio-any2pdf/scripts/md2pdf.py --input report.md --output report.pdf
```

Otherwise, install with pip:

```bash
pip install reportlab --break-system-packages
```