Turn raw Markdown into a manipulable heading tree, edit it programmatically, then emit valid Markdown again.
- Parse Markdown into a hierarchical tree of headings (levels 1–6)
- Preserve and round‑trip section body content
- Query sections via simple dot paths (e.g. Introduction.Installation.Windows)
- Add / remove sections dynamically
- Attach (merge) whole subtrees across different Markdown documents with automatic heading level adjustment
- Dump back to Markdown or visualize structure in a tree-like ASCII output
pip install markdown-parser-py

or, for an editable install:

git clone https://github.com/VarunGumma/markdown-parser-py
cd markdown-parser-py
pip install -e ./

The model is minimal:

MarkdownTree
└── root (MarkdownNode level=0, title="ROOT")
    ├── Child heading (level=1 => '#')
    │   └── Grandchild (level=2 => '##')
    └── ...

Each MarkdownNode stores:
- level: 0 for the synthetic root; 1–6 for real headings
- title: heading text
- content: list of raw paragraph / code / list text blocks under that heading (excluding child headings)
- children: nested headings

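To make the node model concrete, here is a minimal, self-contained sketch of how such a tree could be built from ATX headings using a stack of open ancestors. The `Node` class and `parse_headings` function are hypothetical stand-ins written for illustration; they are not the library's `MarkdownNode` or its parser, and they ignore fenced code blocks entirely.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for MarkdownNode (illustration only)."""
    level: int                      # 0 for the synthetic root, 1-6 for headings
    title: str
    content: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


# One to six '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def parse_headings(text: str) -> Node:
    """Build a heading tree; non-heading lines go to the nearest open heading."""
    root = Node(level=0, title="ROOT")
    stack = [root]                  # chain of currently open ancestors
    for line in text.splitlines():
        match = HEADING.match(line)
        if match is None:
            if line.strip():
                stack[-1].content.append(line)
            continue
        node = Node(level=len(match.group(1)), title=match.group(2))
        # Close every open heading at the same or a deeper level.
        while stack[-1].level >= node.level:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


root = parse_headings(
    "# Intro\nSome intro text.\n## Install\nRun `pip install x`.\n## Usage\n"
)
intro = root.children[0]
print(intro.title, [child.title for child in intro.children])
```

The root never leaves the stack because its level (0) is lower than any real heading level, so every heading always has a parent to attach to.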
from markdown_parser import MarkdownTree
doc = """
# Intro
Some intro text.
## Install
Run `pip install x`.
## Usage
Basic usage here.
### CLI
Run `tool`.
"""
tree = MarkdownTree()
tree.parse(doc)
print('\n=== Visualize ===')
tree.visualize()
print('\n=== Dump Round Trip ===')
print(tree.dump())

Output (visualize):

└── # Intro
    ├── ## Install
    └── ## Usage
        └── ### CLI

node = tree.find_node_by_path('Intro.Install') # '# Intro' > '## Install'
if node:
    print('Found:', node.title, 'level', node.level)

Dot paths walk downward by heading title. A single-component path refers to a top‑level heading (level 1). find_node_by_path returns None if no matching node is found.
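The lookup rule described above can be sketched in a few lines. This is an illustrative re-implementation over a hypothetical `Node` class, not the library's `find_node_by_path`; in particular, it assumes titles never contain a dot.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for MarkdownNode (illustration only)."""
    level: int
    title: str
    children: List["Node"] = field(default_factory=list)


def find_by_path(root: Node, path: str) -> Optional[Node]:
    """Walk down from the root, matching one title per path component."""
    current = root
    for part in path.split("."):
        # Take the first child whose title matches this component.
        current = next((c for c in current.children if c.title == part), None)
        if current is None:
            return None             # the path breaks at this component
    return current


root = Node(0, "ROOT", [
    Node(1, "Intro", [
        Node(2, "Install"),
        Node(2, "Usage", [Node(3, "CLI")]),
    ]),
])

print(find_by_path(root, "Intro.Usage.CLI").level)   # 3
print(find_by_path(root, "Install"))                  # None: not a top-level heading
```

Because the walk always starts at the root, a nested heading such as Install cannot be reached by its title alone; the full path Intro.Install is needed.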
new = tree.add_section('Intro', 'Advanced', content='Deep dive coming soon.')
print('Added at level', new.level)

If parent_path is "" or "ROOT", the new section becomes a top‑level heading.

tree.remove_section('Intro.Advanced')  # removes that subtree

You can merge content from another parsed Markdown document. Levels auto-adjust so the attached subtree root sits exactly one level below the chosen parent.

from markdown_parser import MarkdownTree
base = MarkdownTree()
base.parse('# A\nIntro text.')
other = MarkdownTree()
other.parse('# Extra\nStuff here.\n\n## Deep\nDetails.')
# Attach ALL top-level sections from other under 'A'
base.attach_subtree('A', other) # Equivalent to source_path=None
# Or attach only a specific subsection
# base.attach_subtree('A', other, source_path='Extra.Deep')
base.visualize()
print(base.dump())

If you attach the full tree (source_path=None or 'ROOT'), each top-level section in the source is cloned with its level adjusted: new_level = parent.level + original_level.
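As a worked example of the level arithmetic, the sketch below clones a subtree and shifts every level by the same amount so that the attached root lands one level below the parent. The `Node`, `clone_shifted`, and `attach` names are hypothetical and only illustrate the rule; this is not the library's `attach_subtree`, and it does not guard against levels growing beyond 6.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for MarkdownNode (illustration only)."""
    level: int
    title: str
    children: List["Node"] = field(default_factory=list)


def clone_shifted(node: Node, shift: int) -> Node:
    """Deep-copy a subtree, adding the same shift to every level."""
    return Node(
        node.level + shift,
        node.title,
        [clone_shifted(child, shift) for child in node.children],
    )


def attach(parent: Node, source: Node) -> Node:
    """Attach a copy of source so its root sits one level below parent."""
    shift = parent.level + 1 - source.level
    copy = clone_shifted(source, shift)
    parent.children.append(copy)
    return copy


a = Node(1, "A")
extra = Node(1, "Extra", [Node(2, "Deep")])

attached = attach(a, extra)                  # top-level source: shift == parent.level
print(attached.level, attached.children[0].level)   # 2 3

deep_only = attach(a, extra.children[0])     # nested source is pulled up instead
print(deep_only.level)                       # 2
```

For a top-level source section (original level 1) the shift equals parent.level, which matches new_level = parent.level + original_level. Cloning leaves the source tree untouched, so it can be attached in several places.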
from markdown_parser import MarkdownTree


def compose(product_readme: str, appendix_md: str) -> str:
    main_tree = MarkdownTree()
    main_tree.parse(product_readme)
    appendix_tree = MarkdownTree()
    appendix_tree.parse(appendix_md)
    # Ensure an Appendix section exists
    if not main_tree.find_node_by_path('Appendix'):
        main_tree.add_section('', 'Appendix')
    # Attach all appendix top-level sections under Appendix
    main_tree.attach_subtree('Appendix', appendix_tree)
    return main_tree.dump()

This is an early/experimental utility. Edge cases (nested fenced code blocks, Setext headings, ATX heading oddities, HTML blocks) are not fully supported yet.
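To illustrate why fenced code blocks are a tricky edge case, the following sketch compares a naive line-by-line heading scan with one that tracks whether it is inside a fence. Both functions are hypothetical and much simpler than the CommonMark rules (no indentation handling, no Setext headings, no HTML blocks); they say nothing about how this library behaves internally.

```python
import re
from typing import List

HEADING = re.compile(r"^#{1,6}\s+\S")
FENCE = re.compile(r"^(`{3,}|~{3,})")     # a run of 3+ backticks or tildes


def naive_headings(text: str) -> List[str]:
    """Treat every line that looks like an ATX heading as a heading."""
    return [line for line in text.splitlines() if HEADING.match(line)]


def fence_aware_headings(text: str) -> List[str]:
    """Same scan, but skip lines that sit inside a fenced code block."""
    found = []
    open_fence = ""                       # marker of the open fence, "" when outside
    for line in text.splitlines():
        fence = FENCE.match(line)
        if open_fence:
            # A bare fence of the same character, at least as long, closes the block.
            if (fence
                    and fence.group(1)[0] == open_fence[0]
                    and len(fence.group(1)) >= len(open_fence)
                    and line.strip() == fence.group(1)):
                open_fence = ""
            continue
        if fence:
            open_fence = fence.group(1)
            continue
        if HEADING.match(line):
            found.append(line)
    return found


marker = "`" * 3                          # built at runtime to keep this example readable
sample = f"# Title\n{marker}sh\n# just a shell comment\n{marker}\n## Real section\n"

print(naive_headings(sample))
print(fence_aware_headings(sample))
```

The naive scan reports the shell comment as a level-1 heading, while the fence-aware scan returns only the two real headings.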