Add Rustdoc structure processor support #5
Merged
emcd merged 6 commits on Nov 20, 2025
Implements a structure processor for extracting documentation content from Rustdoc-generated HTML pages.

The processor provides:
- Detection via Rustdoc-specific HTML markers (meta tags, custom elements)
- Content extraction from main documentation sections
- HTML to Markdown conversion preserving Rust syntax
- Support for item declarations, docblocks, and code examples

Implementation follows established patterns from Sphinx/MkDocs processors:
- Wide parameter types, narrow return types
- Immutability preferences with __.immut.Dictionary
- Exception handling with proper chaining
- Comprehensive type annotations

All linters and type checkers pass with zero errors.
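For reviewers unfamiliar with marker-based detection, the idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `looks_like_rustdoc` and `_RustdocMarkerScanner` are hypothetical, and the specific markers checked (a `generator` meta tag mentioning rustdoc, and element names beginning with `rustdoc-`) are assumptions about what Rustdoc output typically contains.

```python
from html.parser import HTMLParser


class _RustdocMarkerScanner(HTMLParser):
    """Collects evidence that a page was generated by Rustdoc.

    Hypothetical helper: the markers checked here are assumptions,
    not necessarily the ones the real processor inspects.
    """

    def __init__(self) -> None:
        super().__init__()
        self.found_generator_meta = False
        self.found_custom_element = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attributes = dict(attrs)
        if tag == "meta":
            name = (attributes.get("name") or "").lower()
            content = (attributes.get("content") or "").lower()
            if name == "generator" and "rustdoc" in content:
                self.found_generator_meta = True
        elif tag.startswith("rustdoc-"):
            # Assumed custom-element prefix used by Rustdoc pages.
            self.found_custom_element = True


def looks_like_rustdoc(html: str) -> bool:
    """Returns True if either assumed Rustdoc marker appears in the page."""
    scanner = _RustdocMarkerScanner()
    scanner.feed(html)
    return scanner.found_generator_meta or scanner.found_custom_element
```

A real detector would probably return a confidence score rather than a boolean so that it can be ranked against the Sphinx and MkDocs processors; the boolean keeps the sketch short.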
Add rustdoc to structure-extensions in general.toml to enable the structure processor for Rustdoc-generated documentation sites.
Create symlink from sources/librovore/data to ../../data to enable configuration file access during development (hatch run commands). This allows the structure processor registration system to locate the general.toml configuration file when running in development mode.
- Fix detect() to pass ParseResult instead of string to detect_rustdoc()
- Fix extraction to construct absolute URLs from relative inventory URIs
- Update documentation_url and content_id to use full URLs
- Remove unused urllib.parse import

These changes enable proper URL resolution for content extraction from Rustdoc-generated documentation sites.
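To illustrate the URL fix, here is a small sketch of resolving a relative inventory URI against a documentation base URL. It is an assumption-laden example rather than the PR's code: `resolve_inventory_uri` is a hypothetical name, and it uses `urllib.parse.urljoin` from the standard library for clarity even though the commit above removes a `urllib.parse` import, so the actual implementation presumably builds the URL differently.

```python
from urllib.parse import urljoin


def resolve_inventory_uri(base_url: str, relative_uri: str) -> str:
    """Builds an absolute URL from a base URL and a relative inventory URI.

    Hypothetical helper for illustration only.
    """
    # Ensure the base ends with a slash; otherwise urljoin would replace
    # the last path segment instead of appending to it.
    base = base_url if base_url.endswith("/") else base_url + "/"
    return urljoin(base, relative_uri)


# Example with made-up inputs:
print(resolve_inventory_uri(
    "https://docs.rs/serde/latest", "serde/trait.Serialize.html"))
```

The trailing-slash normalization matters: without it, the final segment of the base path would be dropped during the join.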
Remove progress tracker and data symlink that were used during development.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>