refactor: introduce QuoteState and ContextStack for unified parsing state#321
Merged
Adds QuoteState class that encapsulates single/double quote state
tracking with stack support for nested contexts. Refactors ~15
functions to use this unified tracker instead of scattered in_single,
in_double, in_single_quote, in_double_quote variables.
Key improvements:
- Single source of truth for quote state via QuoteState class
- push()/pop() methods for nested contexts (e.g., ${...} inside quotes)
- outer_double() method to peek at parent context
- in_quotes() helper to check any quote state
- process_char() for standard quote character handling
Functions refactored to use QuoteState:
- _strip_line_continuations_comment_aware
- _find_cmdsub_end
- Word._double_ctlesc_smart
- Word._normalize_param_expansion_newlines
- Word._expand_all_ansi_c_quotes
- Word._format_command_substitutions
- Word._normalize_extglob_whitespace
- Parser._is_assignment_word
- Parser._param_subscript_has_close
- Parser._consume_param_name (subscript handling)
- And several nested quote tracking contexts
This is Phase 1 of a larger refactoring toward bash's parser model.
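As a rough illustration of the tracker described above, here is a minimal sketch. The method names push(), pop(), outer_double(), in_quotes() and process_char() come from this commit message. The attribute names, the return value of process_char(), and the exact toggling rules are assumptions, and backslash escapes are ignored for brevity.

```python
class QuoteState:
    """Hypothetical sketch of a unified quote tracker (not the PR's actual code)."""

    def __init__(self):
        self.single = False
        self.double = False
        self._stack = []  # saved (single, double) pairs of enclosing contexts

    def push(self):
        """Save the current state and start a fresh, unquoted nested context."""
        self._stack.append((self.single, self.double))
        self.single = False
        self.double = False

    def pop(self):
        """Restore the enclosing context; a no-op if nothing was pushed."""
        if self._stack:
            self.single, self.double = self._stack.pop()

    def in_quotes(self):
        """True if the current context is inside single or double quotes."""
        return self.single or self.double

    def outer_double(self):
        """True if the immediately enclosing context was inside double quotes."""
        return bool(self._stack) and self._stack[-1][1]

    def process_char(self, ch):
        """Toggle flags on ' and "; return True if ch acted as a quote delimiter."""
        if ch == "'" and not self.double:
            self.single = not self.single
            return True
        if ch == '"' and not self.single:
            self.double = not self.double
            return True
        return False


qs = QuoteState()
qs.process_char('"')                    # enter double quotes
qs.push()                               # e.g. on seeing ${ inside the quoted string
inner_is_clean = not qs.in_quotes()     # the nested context starts unquoted
parent_was_double = qs.outer_double()   # the enclosing context was "..."
qs.pop()                                # back to the double-quoted context
```

Keeping the saved states on a stack means a nested ${...} can track its own quotes without losing the fact that the enclosing word was double-quoted.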
Adds ParseContext and ContextStack classes to provide infrastructure for
managing nested parsing contexts. This replaces scattered state variables
with an explicit stack-based model.
ParseContext tracks:
- Context kind (NORMAL, COMMAND_SUB, ARITHMETIC, CASE_PATTERN, BRACE_EXPANSION)
- Paren/brace/bracket depths
- Quote state (via QuoteState)
ContextStack provides:
- push(kind): Enter a new context
- pop(): Exit current context (never pops base)
- current: Access topmost context
- in_context(kind): Check if context type is on stack
- depth: Current stack depth
The Parser now has self._ctx (ContextStack) for tracking parsing context.
This infrastructure enables incremental migration of scattered state
variables like case_depth, arith_depth, etc. to the context stack model.
This is Phase 2 of the architectural refactoring toward bash's parser model.
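The stack model described above could look roughly like the following sketch. The class names, context kinds, and operation names are taken from this commit message. The integer constants, the field names, and the choice to count the base context in the depth are assumptions. The sketch uses get_current() and get_depth() rather than properties, in line with the later transpiler-compatibility commit in this PR.

```python
class ParseContext:
    """Hypothetical sketch of one entry on the context stack."""

    NORMAL = 0
    COMMAND_SUB = 1
    ARITHMETIC = 2
    CASE_PATTERN = 3
    BRACE_EXPANSION = 4

    def __init__(self, kind=0):
        self.kind = kind
        self.paren_depth = 0
        self.brace_depth = 0
        self.bracket_depth = 0
        # The PR stores a QuoteState here; left as None to keep this self-contained.
        self.quote = None


class ContextStack:
    """Hypothetical sketch: a stack whose base NORMAL context is never popped."""

    def __init__(self):
        self._stack = [ParseContext(ParseContext.NORMAL)]

    def push(self, kind):
        """Enter a new context of the given kind and return it."""
        ctx = ParseContext(kind)
        self._stack.append(ctx)
        return ctx

    def pop(self):
        """Exit the current context; the base context always stays in place."""
        if len(self._stack) > 1:
            return self._stack.pop()
        return self._stack[0]

    def get_current(self):
        return self._stack[-1]

    def in_context(self, kind):
        """True if any context on the stack has the given kind."""
        for ctx in self._stack:
            if ctx.kind == kind:
                return True
        return False

    def get_depth(self):
        # Assumption: the base context counts, so a fresh stack has depth 1.
        return len(self._stack)


stack = ContextStack()
stack.push(ParseContext.COMMAND_SUB)    # e.g. on entering $( ... )
stack.push(ParseContext.ARITHMETIC)     # e.g. on entering $(( ... )) inside it
inside_cmdsub = stack.in_context(ParseContext.COMMAND_SUB)
```

A query such as in_context(COMMAND_SUB) is the kind of check that would otherwise need a dedicated counter variable per construct.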
Extracts the arithmetic validation logic from _find_cmdsub_end into a
standalone helper function _is_valid_arithmetic_start(). This:
- Checks if $(( at a position starts valid arithmetic expression
- Scans forward looking for )) at top paren level (excluding nested $())
- Returns True for arithmetic, False for $( ( ... ) ) (cmdsub + subshell)
The helper makes the code more readable and documents the pattern for
distinguishing $((...)) arithmetic from $( ( ... ) ) command substitution.
This is Phase 3 of the architectural refactoring toward bash's parser model.
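The scan described above can be sketched as follows. This is a simplified stand-in, not the PR's helper: the function name without the leading underscore and its (source, start) signature are assumptions, quotes and backslash escapes are ignored, and unterminated input is treated as not arithmetic.

```python
def is_valid_arithmetic_start(source, start):
    """Hypothetical sketch: does the $(( at `start` open an arithmetic expansion?"""
    if not source.startswith("$((", start):
        return False
    depth = 0          # parens opened inside the expression, including nested $(
    i = start + 3
    while i < len(source):
        ch = source[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth > 0:
                depth -= 1
            else:
                # First ) at the top level: arithmetic only if it is part of )).
                return source.startswith("))", i)
        i += 1
    return False       # never closed


arith = is_valid_arithmetic_start("$(( (1+2) * $(echo 3) ))", 0)
subshell = is_valid_arithmetic_start("$((echo a) )", 0)
```

Counting every ( and ) means the )) that closes a nested $(...) or $((...)) is consumed at a deeper level and cannot be mistaken for the end of the outer expression.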
Adds three standard lookahead methods to Parser class:
- peek_at(offset): Peek at character at offset from current position,
  returns empty string if out of bounds
- lookahead(n): Return next n characters without consuming
- match_keyword(keyword): Check if current position matches keyword with
  word boundary
These helpers replace ad-hoc patterns like:
    self.pos + 1 < self.length and self.source[self.pos + 1] == "\n"
with cleaner:
    self.peek_at(1) == "\n"
Refactored skip_whitespace and skip_whitespace_and_newlines to demonstrate
the new peek_at helper usage.
This is Phase 5 of the architectural refactoring toward bash's parser model.
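A minimal sketch of the three helpers, on a stand-in class. MiniParser is a hypothetical name used only for illustration. The source, pos and length attributes mirror the ad-hoc pattern quoted above. The word-boundary rule (end of input, or a following character that is neither alphanumeric nor an underscore) is an assumption.

```python
class MiniParser:
    """Hypothetical stand-in for the Parser's lookahead helpers."""

    def __init__(self, source):
        self.source = source
        self.pos = 0
        self.length = len(source)

    def peek_at(self, offset):
        """Character at pos + offset, or "" when that index is out of bounds."""
        i = self.pos + offset
        if i < 0 or i >= self.length:
            return ""
        return self.source[i]

    def lookahead(self, n):
        """Next n characters without consuming; shorter near the end of input."""
        return self.source[self.pos:self.pos + n]

    def match_keyword(self, keyword):
        """True if keyword starts at pos and is followed by a word boundary."""
        if not self.source.startswith(keyword, self.pos):
            return False
        nxt = self.peek_at(len(keyword))
        return nxt == "" or not (nxt.isalnum() or nxt == "_")


p = MiniParser("done\n")
is_done = p.match_keyword("done")   # followed by a newline, so it matches
is_do = p.match_keyword("do")       # followed by "n", so it does not
```

Because peek_at() returns an empty string instead of raising, callers can compare the result directly without a separate bounds check.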
The transpiler doesn't support @property decorators, so convert:
- QuoteState.depth -> QuoteState.get_depth()
- ContextStack.current -> ContextStack.get_current()
- ContextStack.depth -> ContextStack.get_depth()
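The conversion is mechanical. The sketch below shows the shape of the change on two hypothetical classes with an assumed _stack attribute; only the depth and get_depth() names come from this commit.

```python
class DepthAsProperty:
    """Before: attribute-style access, which relies on the @property decorator."""

    def __init__(self):
        self._stack = []

    @property
    def depth(self):
        return len(self._stack)


class DepthAsGetter:
    """After: a plain method, which needs no decorator support."""

    def __init__(self):
        self._stack = []

    def get_depth(self):
        return len(self._stack)


before = DepthAsProperty().depth        # no parentheses at the call site
after = DepthAsGetter().get_depth()     # call sites gain parentheses
```

Every call site has to change from attribute access to a method call, which is why the commit lists each renamed member.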
The transpiler produces self-contained code, so imports are not allowed. Adds checks for both 'import x' and 'from x import y' statements.
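One way such a check could be written is with Python's standard ast module, as in the sketch below. This is an assumption about the approach, not the PR's checker: the function name find_imports and the message text are hypothetical. The checker itself runs as ordinary Python, so it can use imports even though the code it inspects cannot.

```python
import ast


def find_imports(source):
    """Return sorted (line, message) pairs for every import in `source`."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # ast.Import covers "import x"; ast.ImportFrom covers "from x import y".
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            violations.append((node.lineno, "import statements are not allowed"))
    return sorted(violations)


sample = "import os\nx = 1\nfrom sys import argv\n"
lines_flagged = [line for line, _ in find_imports(sample)]
```

Walking the syntax tree rather than matching text avoids false positives on the word "import" inside strings or comments.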
Summary
Incrementally introduces bash's architectural patterns for cleaner parsing state management:
- QuoteState class: Unified quote tracking with stack support for nested contexts. Replaces ~80 scattered in_single/in_double variable instances across ~15 functions.
- ParseContext & ContextStack: Infrastructure for tracking nested parsing scopes (command substitutions, arithmetic, case patterns, brace expansions). Parser now has self._ctx for context-aware parsing.
- _is_valid_arithmetic_start() helper: Extracts arithmetic validation logic to distinguish $((...)) from $( ( ... ) ).
- Standard lookahead helpers: Adds peek_at(), lookahead(), and match_keyword() to Parser for cleaner boundary checks.
- Style checker enhancement: Adds import statement checks (imports not allowed in self-contained transpiled code).
All 4515 tests pass in both Python and JavaScript.