From ec934dd7f81dffe3848c5b58b5f87100372062bf Mon Sep 17 00:00:00 2001
From: Felix <188768+fb55@users.noreply.github.com>
Date: Thu, 19 Mar 2026 11:17:56 +0000
Subject: [PATCH] docs: expand README

Document all parser events, options, and common workflows (searching,
modifying, serializing the DOM).

Closes #1765
---
README.md | 133 +++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 112 insertions(+), 21 deletions(-)
diff --git a/README.md b/README.md
index c1230716..47b22bac 100644
--- a/README.md
+++ b/README.md
@@ -83,8 +83,37 @@ JS! Hooray!
That's it?!
```
-This example only shows three of the possible events.
-Read more about the parser, its events and options in the [wiki](https://github.com/fb55/htmlparser2/wiki/Parser-options).
+### Parser events
+
+All callbacks are optional. The handler object you pass to `Parser` may implement any subset of these:
+
+| Event | Description |
+| -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `onopentag(name, attribs, isImplied)` | Opening tag. `attribs` is an object mapping attribute names to values. `isImplied` is `true` when the tag was opened implicitly (HTML mode only). |
+| `onopentagname(name)` | Emitted for the tag name as soon as it is available (before attributes are parsed). |
+| `onattribute(name, value, quote)` | Attribute. `quote` is `"` / `'` / `null` (unquoted) / `undefined` (no value, e.g. `disabled`). |
+| `onclosetag(name, isImplied)` | Closing tag. `isImplied` is `true` when the tag was closed implicitly (HTML mode only). |
+| `ontext(data)` | Text content. May fire multiple times for a single text node. |
+| `oncomment(data)` | Comment (content between `<!--` and `-->`). |
+| `oncdatastart()` | Opening of a CDATA section (`<![CDATA[`). |
+| `onprocessinginstruction(name, data)` | Processing instruction (e.g. `<?xml version="1.0"?>`). |
+| `oncommentend()` | Fires after a comment has ended. |
+| `onparserinit(parser)` | Fires when the parser is initialized or reset. |
+| `onreset()` | Fires when `parser.reset()` is called. |
+| `onend()` | Fires when parsing is complete. |
+| `onerror(error)` | Fires on error. |
+
+### Parser options
+
+| Option | Type | Default | Description |
+| ------------------------ | --------- | ---------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `xmlMode` | `boolean` | `false` | Treat the document as XML. This affects entity decoding, self-closing tags, CDATA handling, and more. Set this to `true` for XML, RSS, Atom and RDF feeds. |
+| `decodeEntities` | `boolean` | `true` | Decode HTML entities (e.g. `&amp;` -> `&`). |
+| `lowerCaseTags` | `boolean` | `!xmlMode` | Lowercase tag names. |
+| `lowerCaseAttributeNames`| `boolean` | `!xmlMode` | Lowercase attribute names. |
+| `recognizeSelfClosing` | `boolean` | `xmlMode` | Recognize self-closing tags (e.g. `<br />`). Always enabled in `xmlMode`. |
+| `recognizeCDATA` | `boolean` | `xmlMode` | Recognize CDATA sections as text. Always enabled in `xmlMode`. |
### Usage with streams
@@ -106,25 +135,100 @@ htmlStream.pipe(parserStream).on("finish", () => console.log("done"));
## Getting a DOM
-The `DomHandler` produces a DOM (document object model) that can be manipulated using the [`DomUtils`](https://github.com/fb55/DomUtils) helper.
+The `parseDocument` helper parses a string and returns a DOM tree (a [`Document`](https://github.com/fb55/domhandler) node).
```js
import * as htmlparser2 from "htmlparser2";
-const dom = htmlparser2.parseDocument(htmlString);
+const dom = htmlparser2.parseDocument(
+ `
Hello