diff --git a/scratchpad/dev.md b/scratchpad/dev.md
index 2c57c9b329f..2d398ceb3cb 100644
--- a/scratchpad/dev.md
+++ b/scratchpad/dev.md
@@ -1,7 +1,7 @@
 # Developer Instructions
 
-**Version**: 2.9
-**Last Updated**: 2026-02-24
+**Version**: 3.0
+**Last Updated**: 2026-02-25
 **Purpose**: Consolidated development guidelines for GitHub Agentic Workflows
 
 This document consolidates specifications from the scratchpad directory into unified developer instructions. It provides architecture patterns, security guidelines, code organization rules, and testing practices.
@@ -456,6 +456,42 @@ func validateIssueConfig(cfg CreateIssueConfig) error {
 }
 ```
 
+### YAML Parser Compatibility
+
+GitHub Agentic Workflows uses **`goccy/go-yaml` v1.18.0** (YAML 1.2 compliant parser). This affects validation behavior and tool integration.
+
+**YAML 1.1 vs 1.2 boolean parsing**:
+
+YAML 1.1 parsers (including Python's PyYAML and many older tools) treat certain plain scalars as booleans:
+
+| Keyword | YAML 1.1 Value | YAML 1.2 Value (gh-aw) |
+|---------|----------------|------------------------|
+| `on`, `yes`, `y`, `ON`, `Yes` | `true` (boolean) | string `"on"`, `"yes"`, etc. |
+| `off`, `no`, `n`, `OFF`, `No` | `false` (boolean) | string `"off"`, `"no"`, etc. |
+
+Only `true` and `false` are boolean literals in YAML 1.2. GitHub Actions also uses YAML 1.2, so gh-aw's parser choice ensures full compatibility.
+
+**Impact on the `on:` trigger key**: Python's `yaml.safe_load` parses the workflow trigger key `on:` as the boolean `True`, producing false positives when validating gh-aw workflows. The workflow is valid; the Python tool is applying the wrong spec version.
+
+**Correct local validation**:
+```bash
+# ✅ Use gh-aw's built-in compiler (YAML 1.2 compliant)
+gh aw compile workflow.md
+```
+
+**Avoid**:
+```bash
+# ❌ Reports false positives: the on: key becomes boolean True
+python -c "import yaml; yaml.safe_load(open('workflow.md'))"
+```
+
+**For tool developers** integrating with gh-aw, use YAML 1.2-compliant parsers:
+- Go: `github.com/goccy/go-yaml` (used by gh-aw)
+- Python: `ruamel.yaml` (not PyYAML)
+- JavaScript: `yaml` package v2+
+
+See `scratchpad/yaml-version-gotchas.md` for the full keyword reference and migration guidance.
+
 ---
 
 ## Safe Outputs System
@@ -2008,6 +2044,7 @@ These files are loaded automatically by compatible AI tools (e.g., GitHub Copilo
 - [Activation Output Transformations](./activation-output-transformations.md) - Compiler expression transformation details
 - [HTML Entity Mention Bypass Fix](./html-entity-mention-bypass-fix.md) - Security fix: entity-encoded @mention bypass
 - [Template Syntax Sanitization](./template-syntax-sanitization.md) - T24: template delimiter neutralization
+- [YAML Version Gotchas](./yaml-version-gotchas.md) - YAML 1.1 vs 1.2 parser compatibility: `on:` key behavior, false positive prevention
 
 ### External References
 
@@ -2019,6 +2056,7 @@ These files are loaded automatically by compatible AI tools (e.g., GitHub Copilo
 ---
 
 **Document History**:
+- v3.0 (2026-02-25): Added YAML Parser Compatibility section (YAML 1.1 vs 1.2 boolean parsing, `on:` trigger key false positive, YAML 1.2 parser recommendations); added yaml-version-gotchas.md to Related Documentation; fixed 17 non-standard closing code fences in yaml-version-gotchas.md
 - v2.9 (2026-02-24): Added Engine Interface Architecture (ISP 7-interface design, BaseEngine, EngineRegistry), JavaScript Content Sanitization Pipeline with HTML entity bypass fix (T24 template delimiter neutralization), and Activation Output Transformations compiler behavior; added 4 new Related Documentation links
 - v2.8 (2026-02-23): Documented PR #17769 features: unassign-from-user safe output, blocked deny-list for assign/unassign, standardized error code registry, templatable integer fields, safe outputs prompt template system, XPIA defense policy, MCP template expression escaping, status-comment decoupling, sandbox.agent migration, agent instruction files in .github/agents/
 - v2.6 (2026-02-20): Fixed 8 tone issues across 4 spec files, documented post-processing extraction pattern and CLI flag propagation rule from PR #17316, analyzed 61 files
diff --git a/scratchpad/yaml-version-gotchas.md b/scratchpad/yaml-version-gotchas.md
index baa0113ada7..2520cda232e 100644
--- a/scratchpad/yaml-version-gotchas.md
+++ b/scratchpad/yaml-version-gotchas.md
@@ -45,7 +45,7 @@ result = yaml.safe_load(content)
 print(result)
 # Output: {True: {'issues': {'types': ['opened']}}}
 #          ^^^^ The key is boolean True, not string "on"!
-```text
+```
 
 This creates a **false positive** when validating workflows with Python-based tools, making it appear that the YAML is invalid when it's actually correct.
 
@@ -77,7 +77,7 @@ on:
     // Output: map[on:map[issues:map[types:[opened]]]]
     //             ^^^ The key is string "on" ✓
 }
-```text
+```
 
 ## How gh-aw Handles This
 
@@ -104,7 +104,7 @@ YES: # → true
 Yes: # → true
 ON: # → true
 On: # → true
-```text
+```
 
 ### YAML 1.1 Boolean Keywords (Parsed as `false`)
 
@@ -117,7 +117,7 @@ NO: # → false
 No: # → false
 OFF: # → false
 Off: # → false
-```text
+```
 
 ### YAML 1.2 Behavior
 
@@ -128,7 +128,7 @@ true: # → true
 false: # → false
 True: # → true (case-insensitive in some parsers)
 False: # → false (case-insensitive in some parsers)
-```text
+```
 
 ## Code Examples
 
@@ -143,7 +143,7 @@ on:
 permissions:
   issues: write
 ---
-```text
+```
 
 **YAML 1.1 Parser (Python):**
 ```python
@@ -152,7 +152,7 @@ content = open('workflow.md').read().split('---')[1]
 data = yaml.safe_load(content)
 print(type(list(data.keys())[0]))  # <class 'bool'>
 print(list(data.keys())[0])  # True
-```text
+```
 
 **YAML 1.2 Parser (gh-aw / goccy/go-yaml):**
 ```go
@@ -160,7 +160,7 @@ var data map[string]interface{}
 yaml.Unmarshal([]byte(content), &data)
 fmt.Printf("%T\n", "on") // string
 fmt.Printf("%v\n", data["on"]) // map[issues:...]
-```text
+```
 
 ### Example 2: Configuration Value
 
@@ -170,7 +170,7 @@ settings:
   enabled: yes
   disabled: no
   mode: on
-```text
+```
 
 **YAML 1.1 Parser Output:**
 ```python
@@ -181,7 +181,7 @@ settings:
         'mode': True  # Boolean (the string "on" became True!)
     }
 }
-```text
+```
 
 **YAML 1.2 Parser Output:**
 ```go
@@ -192,7 +192,7 @@ map[string]interface{}{
         "mode": "on", // String
     },
 }
-```text
+```
 
 ### Example 3: Issue Labels
 
@@ -202,7 +202,7 @@ labels:
   - bug
   - on hold # Might be interpreted as "on: hold" with boolean key
   - off topic # Might be interpreted as "off: topic" with boolean key
-```text
+```
 
 **Safe Approach:**
 ```yaml
@@ -210,7 +210,7 @@ labels:
   - bug
   - "on hold" # Quote to force string interpretation
   - "off topic" # Quote to force string interpretation
-```text
+```
 
 ## Impact on Validation
 
@@ -223,7 +223,7 @@ Many developers use Python-based YAML validation tools during local development.
 ```bash
 $ python -c "import yaml; yaml.safe_load(open('workflow.md'))"
 # Error: Invalid structure - key is boolean True instead of string "on"
-```text
+```
 
 **This is NOT a real error!** The workflow is valid and will work correctly with gh-aw.
 
@@ -239,7 +239,7 @@ $ gh aw compile workflow.md
 
 # Or use a YAML 1.2 validator
 $ yamllint --version # Check if it supports YAML 1.2
-```text
+```
 
 ## Recommendations
 
@@ -311,7 +311,7 @@ jobs:
 
       # Validate all workflows
       - run: gh aw compile
-```text
+```
 
 ## Workarounds
 
@@ -326,7 +326,7 @@ If you need to use YAML 1.1 tools (like Python's `yaml.safe_load`) for some reas
   issues:
     types: [opened]
 ---
-```text
+```
 
 **Option 2: Use Alternative Trigger Names**
 ```yaml
@@ -334,7 +334,7 @@ If you need to use YAML 1.1 tools (like Python's `yaml.safe_load`) for some reas
 # Not applicable - "on" is required by GitHub Actions
 # This workaround doesn't actually work for workflows
 ---
-```yaml
+```
 
 **Recommendation:** Don't use YAML 1.1 tools for gh-aw workflows. Use gh-aw's compiler instead.
 
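
Reviewer note: the keyword tables in this patch can be turned into a quick pre-check for anyone who must keep a YAML 1.1 tool in their pipeline. The sketch below is illustrative only. `yaml11_bool_keys` and the two keyword sets are hypothetical names introduced for this note, not part of gh-aw, PyYAML, or `goccy/go-yaml`. The sets follow the YAML 1.1 boolean keyword list, and individual parsers may implement only a subset. The scan is line-based and only looks at unindented, unquoted keys, so it approximates what a real parser would report rather than parsing YAML.

```python
import re
from typing import List, Tuple

# YAML 1.1 boolean keywords that YAML 1.2 treats as plain strings.
# true/false are omitted because they stay booleans in both versions.
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "off", "Off", "OFF"}

# Matches an unindented, unquoted mapping key such as `on:` or `Off: 1`.
_KEY_RE = re.compile(r"^([A-Za-z]+)\s*:(\s|$)")


def yaml11_bool_keys(frontmatter: str) -> List[Tuple[int, str, bool]]:
    """Return (line_number, key, yaml11_value) for each top-level plain key
    that a YAML 1.1 parser would read as a boolean instead of a string."""
    hits = []
    for lineno, line in enumerate(frontmatter.splitlines(), start=1):
        match = _KEY_RE.match(line)
        if not match:
            continue  # indented, quoted, or not a mapping key
        key = match.group(1)
        if key in YAML11_TRUE:
            hits.append((lineno, key, True))
        elif key in YAML11_FALSE:
            hits.append((lineno, key, False))
    return hits


sample = "on:\n  issues:\n    types: [opened]\npermissions:\n  issues: write\n"
print(yaml11_bool_keys(sample))  # [(1, 'on', True)]
```

A hit from this check marks a key that YAML 1.1 tooling will misreport, not an error in the workflow. Validation itself should still go through `gh aw compile`, as the patch recommends.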