markdown-content-parity: revisit baseline NOISE_PATTERNS now that audience segmentation has escape hatches #87

## Background

The `markdown-content-parity` check now has formal escape hatches for intentional content divergence:

- `data-markdown-ignore` HTML attribute
- `--parity-exclusions` CSS selectors
- Configurable thresholds (including 0/0 informational mode)

These are documented in `docs/checks/observability.md` under "Audience segmentation" and "Configuring parity."

The baseline `NOISE_PATTERNS` list in `src/checks/observability/markdown-content-parity.ts` predates these escape hatches. It currently filters specific text segments before comparison:

```
/^last updated/,
/^was this page helpful/,
/^thank you for your feedback/,
/^previous\s+\S.*next\s+\S/,    // pagination
/^start from the beginning$/,
/^join our .* server/,
/^loading video content/,
/^\/.+\/.+/,                     // breadcrumb paths
/^for ai agents:/,               // agent-directive banner
```

The list grew organically: each entry was added to fix a specific site's chrome leaking into parity comparisons. With the escape hatches now available, the question is whether these baseline filters are still appropriate or whether some should move to site responsibility.
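To make the filtering step concrete, here is a minimal sketch of how a pattern list like this can drop chrome segments before comparison. It is not the actual implementation in `markdown-content-parity.ts`: the `normalize` and `stripNoise` helpers and the lowercase/whitespace normalization are assumptions, inferred only from the fact that the patterns are lowercase and anchored at the start of a segment.

```typescript
// Hypothetical sketch; the real check may segment and normalize text differently.
const NOISE_PATTERNS: RegExp[] = [
  /^last updated/,
  /^was this page helpful/,
  /^previous\s+\S.*next\s+\S/, // pagination
  /^\/.+\/.+/,                 // breadcrumb paths
];

// Assumed normalization: trim, collapse whitespace runs, lowercase.
function normalize(segment: string): string {
  return segment.trim().replace(/\s+/g, " ").toLowerCase();
}

// Drop empty segments and any segment matching a baseline noise pattern.
function stripNoise(segments: string[]): string[] {
  return segments
    .map(normalize)
    .filter((s) => s.length > 0 && !NOISE_PATTERNS.some((p) => p.test(s)));
}

const htmlSegments = [
  "Install the CLI with npm.",
  "Last updated on March 3",
  "Was this page helpful?",
  "Previous Setup Next Deploy",
];

console.log(stripNoise(htmlSegments)); // [ 'install the cli with npm.' ]
```

Only segments that survive this step would count toward the parity comparison, which is why the exact wording of each pattern matters.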

## The fairness problem

The current baseline favors sites whose chrome happens to use the filtered phrasing over sites that don't. For example, `^was this page helpful` matches some platforms but not others, since equivalent widgets use varied phrasing ("Did this page help you?", "Was this useful?", star ratings, thumbs up/down). Adding more phrasings to the baseline shifts the inconsistency rather than resolving it.

The same issue applies to:
- `^join our .* server` matches Discord CTAs but not Slack/Telegram/etc.
- `^for ai agents:` matches one specific agent-directive banner phrasing.
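The bias is easy to reproduce by running two of the baseline patterns against equivalent chrome with different wording. The sample segments below are illustrative, and the `filteredBy` helper is hypothetical; only the two regexes come from the current list.

```typescript
// Return the segments a single baseline pattern would filter out.
function filteredBy(pattern: RegExp, segments: string[]): string[] {
  return segments.filter((s) => pattern.test(s));
}

// Equivalent feedback widgets, different wording (illustrative samples).
const feedbackHits = filteredBy(/^was this page helpful/, [
  "was this page helpful?",
  "did this page help you?",
  "was this useful?",
]);

// Equivalent community CTAs, different platforms (illustrative samples).
const communityHits = filteredBy(/^join our .* server/, [
  "join our discord server",
  "join our slack workspace",
  "join our telegram channel",
]);

console.log(feedbackHits);  // [ 'was this page helpful?' ]
console.log(communityHits); // [ 'join our discord server' ]
```

In each group only one phrasing is filtered; the others would still count against a site's parity score.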

## Proposal

### Part 1: Document `parityExclusions` recipes

Add a "Common chrome patterns" section to `docs/checks/observability.md` under `markdown-content-parity`. Provide ready-to-use `parityExclusions` selectors for chrome that varies in phrasing across sites:

- Feedback widgets (`[class*="feedback"]`, `[aria-label*="feedback"]`, `.was-this-helpful`, etc.)
- "View as markdown" agent directive links
- Cookie banners and privacy bars
- Community CTAs (Discord/Slack invitations)
- Footer link bars (Edit this page, Report a bug, etc.)

This gives site owners a clear path: when chrome inflates their parity score, copy a recipe instead of needing the baseline updated for their phrasing.
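As a rough illustration of what such recipes could look like, the sketch below groups selectors by chrome type and merges the chosen groups into one exclusion list. The names `CHROME_RECIPES` and `buildParityExclusions` are hypothetical, as is the assumption that `parityExclusions` accepts a flat array of CSS selector strings. Only the feedback selectors come from this issue; the cookie and community selectors are placeholders that would need checking against real sites.

```typescript
// Hypothetical recipe table; only the "feedback" selectors appear in this issue.
const CHROME_RECIPES: Record<string, string[]> = {
  feedback: ['[class*="feedback"]', '[aria-label*="feedback"]', ".was-this-helpful"],
  cookieBanner: ['[class*="cookie"]', '[id*="cookie"]'],           // placeholder
  communityCta: ['a[href*="discord.gg"]', 'a[href*="slack.com"]'], // placeholder
};

// Merge the selected recipes into a de-duplicated selector list.
function buildParityExclusions(...names: string[]): string[] {
  const selectors = names.flatMap((name) => {
    const recipe = CHROME_RECIPES[name];
    if (!recipe) throw new Error(`Unknown recipe: ${name}`);
    return recipe;
  });
  return [...new Set(selectors)];
}

// Assumed usage: pass the result to the check's parityExclusions option.
const parityExclusions = buildParityExclusions("feedback", "communityCta");
console.log(parityExclusions);
```

Selector-based recipes target the widget's markup rather than its wording, so one recipe covers every phrasing of the same widget.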

### Part 2: Decide on baseline shrinkage

Two consistent positions:

**A. Keep only universal web infrastructure in baseline** (recommended)

These are essentially impossible for a site to opt out of, and aren't really content:

- `/^last updated/`
- `/^previous\s+\S.*next\s+\S/` (pagination)
- `/^\/.+\/.+/` (breadcrumb paths, though this regex is fragile: it matches any segment that begins with a slash-delimited path)
- `/^loading video content/`

Move the rest to site responsibility (their phrasing varies, so the baseline is biased toward sites using one specific wording):

- `/^was this page helpful/`
- `/^thank you for your feedback/`
- `/^join our .* server/`
- `/^start from the beginning$/`
- `/^for ai agents:/`

Site owners use `parityExclusions` recipes from Part 1.
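One way to gauge the migration risk of option A is to list which segments on a given page are filtered today only by the patterns that would move out. The sketch below assumes segments are already normalized to lowercase; the array names and the `newlyUnfiltered` helper are hypothetical.

```typescript
// Option A split, using the patterns listed above.
const KEEP_IN_BASELINE: RegExp[] = [
  /^last updated/,
  /^previous\s+\S.*next\s+\S/,
  /^\/.+\/.+/,
  /^loading video content/,
];

const MOVE_TO_SITE: RegExp[] = [
  /^was this page helpful/,
  /^thank you for your feedback/,
  /^join our .* server/,
  /^start from the beginning$/,
  /^for ai agents:/,
];

// Segments filtered today that would start counting toward parity under option A.
function newlyUnfiltered(segments: string[]): string[] {
  return segments.filter(
    (s) =>
      MOVE_TO_SITE.some((p) => p.test(s)) &&
      !KEEP_IN_BASELINE.some((p) => p.test(s)),
  );
}

const pageSegments = [
  "last updated today",
  "was this page helpful?",
  "for ai agents: see the markdown version",
  "install the cli with npm.",
];

console.log(newlyUnfiltered(pageSegments));
// [ 'was this page helpful?', 'for ai agents: see the markdown version' ]

// The breadcrumb pattern also swallows prose that merely starts with a path.
console.log(/^\/.+\/.+/.test("/etc/hosts maps hostnames to addresses")); // true
```

Running a report like this across sites that currently pass would show how many of them need a `parityExclusions` recipe before the baseline shrinks.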

**B. Leave baseline as-is, don't grow it**

Keep all current entries (sites already passing depend on them), but adopt a policy: no new patterns. Future requests for additional phrasings are answered with "use `parityExclusions`."

This is less consistent but lower-disruption for sites currently passing under the baseline.

## Decision needed

1. Is shrinking the baseline (option A) desirable, or is the migration risk too high (option B)?
2. Either way, document `parityExclusions` recipes so the escape hatch is discoverable.

This issue is to make the decision and execute it.
