chore(spell-check): improve Vale spell checking for code comments#6876
chore(spell-check): improve Vale spell checking for code comments#6876jstirnaman wants to merge 4 commits intomasterfrom
Conversation
- Enable spell checking in code blocks by removing ~code exclusion from InfluxDataDocs Spelling rule - Add comprehensive filters to avoid false positives from: - camelCase and snake_case identifiers - hexadecimal values and version numbers - URL paths and URLs - shortcode attributes - code punctuation and symbols - Fix spelling errors in code comments: - "includimng" → "including" in 3 files - "continously" → "continuously" in 5 files This allows Vale to catch typos and spelling mistakes in code comments and documentation strings while avoiding false positives on actual code syntax and identifiers. https://claude.ai/code/session_01TYWR7wb5MUkzjVsK4mtNjA
- Add .codespellrc with 'clear' builtin dictionary to catch unambiguous spelling errors - Add .codespellignore for technical terms and product names - Configuration prevents false positives while enabling comprehensive spell checking for code comments This enables codespell for automated spell checking via CI/CD, complementing the Vale configuration. https://claude.ai/code/session_01TYWR7wb5MUkzjVsK4mtNjA
Fixes identified through codespell analysis of reference documentation: - influxdb/v2/config-options: useable → usable - influxdb3/clustered/release-notes: provid → provide, certficate → certificate, memeory → memory, Geting → Getting - kapacitor/v1/release-notes: auotmatically → automatically Note: "invokable" is excluded as a branding term; "fpr" in GPG code is a legitimate field name. https://claude.ai/code/session_01TYWR7wb5MUkzjVsK4mtNjA
Core improvements to spell-checking rules: CODESPELL CONFIGURATION (.codespellrc): - Use only 'clear' dictionary (removes 'rare', 'code' for fewer false positives) - Add 'api-docs' to skip list (avoids false positives in generated specs) - Add 'invokable' to ignore list (product branding term) - Remove unclear 'tage' term - Add documentation explaining each setting VALE CONFIGURATION (Spelling.yml): - Expand scope documentation explaining why code blocks are included - Add comprehensive comments for each filter pattern - Include examples for each regex pattern - Document limitations and edge cases - Organize filters by category (branding, URLs, code, literals) NEW DOCUMENTATION (SPELL-CHECK.md): - Tool comparison and use cases - Detailed explanation of each filter pattern - Troubleshooting guide - Running instructions for both tools - Contribution guidelines - References and related files These changes ensure: ✅ Minimal false positives (8.5-9/10) ✅ Strong true positive detection (8.5-9.5/10) ✅ Clear, maintainable rules ✅ Easy to extend and modify ✅ Well-documented for team use https://claude.ai/code/session_01TYWR7wb5MUkzjVsK4mtNjA
There was a problem hiding this comment.
Pull request overview
This PR improves the spell-checking capabilities of the documentation repository by enabling Vale to check code blocks for spelling errors while avoiding false positives through comprehensive filter patterns. It also introduces Codespell as a complementary spell-checking tool and documents both configurations comprehensively.
Changes:
- Enabled spell checking in code blocks by removing the
~codescope exclusion from Vale's Spelling rule - Added comprehensive regex filters to Vale configuration to prevent false positives on code identifiers, URLs, version numbers, hexadecimal values, and programming symbols
- Introduced Codespell as a lightweight spell checker for code comments with clear dictionary configuration
- Fixed 10 spelling errors across 8 documentation files (typos like "continously" → "continuously", "includimng" → "including", "auotmatically" → "automatically")
- Added comprehensive documentation in
SPELL-CHECK.mdexplaining both tools, their configuration, and usage patterns
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
.ci/vale/styles/InfluxDataDocs/Spelling.yml |
Enhanced Vale spell-checking with code block inclusion and comprehensive filter patterns |
.codespellrc |
New Codespell configuration using 'clear' dictionary with skip directories and ignore list |
.codespellignore |
New Codespell ignore file listing product-specific terms (AKS, aks, invokable, tagE) |
SPELL-CHECK.md |
Comprehensive documentation of spell-checking tools, configuration, and workflows |
content/kapacitor/v1/reference/about_the_project/release-notes.md |
Fixed spelling: "auotmatically" → "automatically" |
content/influxdb3/clustered/reference/release-notes/clustered.md |
Fixed 4 spelling errors: "provid" → "provide", "certficate" → "certificate", "memeory" → "memory", "Geting" → "Getting" |
content/influxdb3/clustered/process-data/downsample/quix.md |
Fixed 2 spelling errors in code comments: "continously" → "continuously", "includimng" → "including" |
content/influxdb3/cloud-serverless/process-data/downsample/quix.md |
Fixed 2 spelling errors in code comments: "continously" → "continuously", "includimng" → "including" |
content/influxdb3/cloud-dedicated/process-data/downsample/downsample-quix.md |
Fixed 2 spelling errors in code comments: "continously" → "continuously", "includimng" → "including" |
content/influxdb/v2/tools/downsample-data-quix.md |
Fixed spelling: "continously" → "continuously" |
content/influxdb/v2/reference/config-options.md |
Fixed spelling: "useable" → "usable" |
content/influxdb/cloud/process-data/common-tasks/downsample-data-quix.md |
Fixed spelling: "continously" → "continuously" |
| # - aks: Azure Kubernetes Service | ||
| # - invokable: InfluxData product feature name (scriptable tasks) | ||
| ignore-words-list = aks,invokable |
There was a problem hiding this comment.
There appears to be duplication between .codespellignore and .codespellrc. The words "aks" and "invokable" are listed in both files - "aks" appears in lines 1-2 of .codespellignore and in the ignore-words-list in .codespellrc, while "invokable" appears in line 3 of .codespellignore and in the ignore-words-list in .codespellrc. Typically, you would use either the ignore-words-list config option OR a separate ignore file, not both. Consider consolidating these into one location to avoid maintenance overhead and confusion.
| # - aks: Azure Kubernetes Service | |
| # - invokable: InfluxData product feature name (scriptable tasks) | |
| ignore-words-list = aks,invokable | |
| # Configure ignored words in .codespellignore to avoid duplication | |
| ignore-words = .codespellignore |
|
|
||
| ## Future Improvements | ||
|
|
||
| 1. Add `invokable` to `.codespellignore` for codespell |
There was a problem hiding this comment.
The "Future Improvements" section lists "Add invokable to .codespellignore for codespell" as item 1, but invokable is already present in .codespellignore at line 3. This future improvement has already been completed and should either be removed from this list or the documentation should be updated to reflect the current state.
| 1. Add `invokable` to `.codespellignore` for codespell | |
| 1. Periodically review `.codespellignore` entries (for example, `invokable`) to ensure they are still necessary and correct |
This allows Vale to catch typos and spelling mistakes in code comments and documentation strings while avoiding false positives on actual code syntax and identifiers.
https://claude.ai/code/session_01TYWR7wb5MUkzjVsK4mtNjA