Skip to content

Latest commit

 

History

History
97 lines (79 loc) · 4.69 KB

File metadata and controls

97 lines (79 loc) · 4.69 KB

Implementation Plan — ShellSyntaxTree

Source of truth for active work. Translates SPEC.md §16 (bash — shipped) and SPEC.POWERSHELL.md §16 (PowerShell — v0.2.0) into NOW / NEXT / LATER buckets. Park items aggressively — autonomous loops will otherwise bulldoze priorities.

Hard rule: every PR ends with this file updated (item moved / completed / parked) so the plan reflects reality.


NOW (0.2.0 — PowerShell parser)

Spec: SPEC.POWERSHELL.md (v0.2.0). The PowerShell parser is implemented — phases 1–14 of SPEC.POWERSHELL.md §16 are complete (see below). What remains is the release flow and the downstream Netclaw integration, both of which need actions outside this repository.

Implemented (SPEC.POWERSHELL.md §16 phases 1–14) — done

  • 1. Public-API surfaceShellParserOptions base; BashParserOptions reparented; PwshParserOptions + PwshParser; additive VerbChain.CanonicalVerb / IsDynamic; Clause.IsBashCWrappedIsCommandStringWrapped. PublicApiSnapshotTests, corpus DTOs, and the bash corpus JSON key all updated.
  • 2. Verb & binding tablesPwshApprovedVerbs, PwshAliases (complete default set), PwshVerbs, PwshBindingTables, PwshPerVerbRules.
  • 3. PwshLexer — quoting, backtick escape, $var / $env: / ${name}, parameters, stream redirects, statement separators, comments, opaque regions, --%; OpaqueRegionScanner backtick mode.
  • 4–10. PwshCommandParser — pipeline / statement splitting, verb-chain extraction, the §6.5 binding model, PwshResolver (§8), per-verb path rules (§7), Set-Location propagation (§9), pwsh -Command / -EncodedCommand recursion (§10), anomaly safe-fail and the 64 KiB cap (§11).
  • 11. Multi-shell corpus runner + PII audit — directory-routed by Corpus/<shell>/.
  • 12. PowerShell corpus — 211 entries under Corpus/powershell/, every §13 category minimum exceeded.
  • 13. pwsh validation gate + tools/PwshCorpusToolPwshOracleTests enforces the §13 oracle matrix + the PwshAliases completeness [Fact]; the tool is registered in TOOLING.md.
  • 14. SPEC.md edits + CI + version bumpSPEC.md §1 / §2 / §3 / §6.4 / §15 updated; VersionPrefix0.2.0, VersionSuffixalpha; CI verifies pwsh and runs both corpora on Linux + Windows; RELEASE_NOTES.md v0.2.0 section; CLI + Web samples gain a shell selector; README.md updated.

15. Release 0.2.0 (alpha → beta → stable) — SPEC.PWSH §15 / §17

  • Tag 0.2.0-alpha; publish_nuget.yml produced ShellSyntaxTree.0.2.0-alpha.nupkg and it is live on nuget.org (released 2026-05-20).
  • 0.2.0-beta so Netclaw validates the parser + the breaking rename
  • Promote to stable 0.2.0 after Netclaw validation

16. Netclaw v0.2.0 integration — SPEC.PWSH §17 #9

  • Netclaw consumes the v0.2.0 package; absorbs the Clause rename (separate repository — cannot be done here)
  • ≥1 Netclaw integration test exercises a real PowerShell corpus entry through the live matcher and gets the expected gate decision

NEXT (0.1.x / 0.2.x — additive)

  • Seed corpus entries from sanitized real-world dogfood logs (SPEC §14 workflow) — both shells.
  • Expand verb / cmdlet / alias tables as the corpus surfaces real commands.
  • Document the "consumer's algorithm" — given a ParsedCommand, how a security gate walks it (a docs/CONSUMER_GUIDE.md or SPEC.md appendix).
  • Performance sanity check (~1 ms typical) with a tiny BenchmarkDotNet harness — only if anything in the daemon hot path complains.

LATER (post-0.2.0)

  • v0.2.x candidates from SPEC.POWERSHELL.md §18 — lossless redirect-stream identity (grow the RedirectDirection enum), per-element comma-array path extraction (-Path a,b,c).
  • PowerShell script-level constructs — control flow, function / class / enum definitions, param() / begin / process / end blocks, .ps1 file parsing (SPEC.POWERSHELL.md §18).
  • Extract a shared lexer/parser core now that two parsers exist — the seam can be designed from real duplication (SPEC.POWERSHELL.md §18); the path-normalization helpers duplicated between BashResolver and PwshResolver are the first candidate.
  • Windows cmd parser.
  • Source-mapping (line/column on AST nodes) — only if an IDE consumer asks.
  • Heredoc body extraction, process substitution, bash function definitions — only if a real consumer need surfaces.

Parked

(empty; move items here when scope changes rather than deleting them)