Refactor HTML table processing to separate preprocessing step #17

@coderabbitai

The current line-based buffering for HTML tables adds significant complexity to the main process_stream function, which has to carry additional state variables (html_buf, html_depth, in_html) through its loop.

Suggested improvement:
Extract HTML-to-Markdown conversion into a separate preprocessing step that:

  • Runs before the main table reflow logic
  • Simplifies the main processing loop
  • Improves separation of concerns
  • Makes the code more maintainable and testable

This would create a cleaner architecture where HTML tables are converted to Markdown tables first, then processed through the existing Markdown table reflow logic.
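
To make the proposal concrete, here is a minimal sketch of what such a preprocessing pass could look like. It is written in Python purely for illustration and is not the project's code: the names convert_html_tables, html_table_to_markdown, and _TableParser are hypothetical. It assumes each table starts and ends on its own line, that the first row is the header, and that tables are not nested (the depth counter is only used to find the end of the block).

```python
import re
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Collects the cell text of each <tr> in one HTML table (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace and escape pipes for Markdown.
            text = " ".join("".join(self._cell).split())
            self._row.append(text.replace("|", "\\|"))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_markdown(html):
    """Return Markdown table lines for one HTML table, or [] if it has no rows."""
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    rows = [r for r in parser.rows if r]
    if not rows:
        return []
    width = max(len(r) for r in rows)
    rows = [r + [""] * (width - len(r)) for r in rows]  # pad short rows
    lines = ["| " + " | ".join(rows[0]) + " |",
             "| " + " | ".join(["---"] * width) + " |"]
    lines += ["| " + " | ".join(r) + " |" for r in rows[1:]]
    return lines


_OPEN = re.compile(r"<table\b", re.IGNORECASE)
_CLOSE = re.compile(r"</table\s*>", re.IGNORECASE)


def convert_html_tables(lines):
    """Preprocessing pass: replace HTML table blocks with Markdown table lines."""
    out, buf, depth = [], [], 0
    for line in lines:
        if depth == 0 and not _OPEN.search(line):
            out.append(line)  # ordinary line, pass through untouched
            continue
        buf.append(line)
        depth += len(_OPEN.findall(line)) - len(_CLOSE.findall(line))
        if depth <= 0:
            # Fall back to the raw lines if nothing could be converted.
            out.extend(html_table_to_markdown("\n".join(buf)) or buf)
            buf, depth = [], 0
    out.extend(buf)  # unterminated table: leave it unchanged
    return out
```

With a pass like this, the buffering state lives entirely inside the preprocessing function, and the main loop would only ever see Markdown tables. The pass can also be unit-tested on its own, without going through the reflow logic.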

Related:
