fix: update highlights, injections, and reformat grammar #23

Open
hkimura-intersys wants to merge 3 commits into intersystems:main from hkimura-intersys:highlights

@hkimura-intersys
Collaborator

Overview

This PR refactors the three ObjectScript grammars (core, expr, udl), standardizes highlight and injection queries, and substantially expands grammar test coverage.

Key Changes

  1. Updated highlighting to use faster, cleaner query patterns and a more maintainable structure.
  2. Normalized capture names to Zed-compatible conventions so editor themes can resolve captures consistently.
  3. Updated XDATA injections to support YAML (text/yaml and application/yaml) in addition to the existing MIME types.
  4. Restructured keyword handling by removing keyword fields from keyword node definitions and attaching keyword fields at usage sites (statements/structures/commands).
  5. Fixed expression-grammar global reference parsing so values like ^|"SAMPLES","IRISLIB"|patient(1) parse correctly.
  6. Completed a broad grammar refactor across core, expr, and udl to improve consistency and long-term maintainability while preserving language-server/highlighting utility.
  7. Added 437 unit tests for expression grammar coverage.

Highlights by Area

Highlight Queries

  • Updated:

    • expr/queries/highlights.scm
    • core/queries/highlights.scm
    • udl/queries/highlights.scm
    • bindings/python/tree_sitter_objectscript_core/queries/highlights.scm
  • Result:

    • Cleaner capture mapping
    • Better compatibility with editor theming systems (including Zed capture expectations)

Injection Queries

  • Updated:
    • core/queries/injections.scm
    • udl/queries/injections.scm
  • Result:
    • Better language injection behavior for method/query/trigger/XDATA content
    • Added YAML MIME-type handling for XDATA
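
The MIME-type dispatch described above can be modelled in a few lines of JavaScript. This is only a sketch of the lookup logic, not the contents of injections.scm: the function name `injectionLanguageForMime` and the table `MIME_TO_LANGUAGE` are hypothetical, and the JSON and XML entries are assumed stand-ins for the "existing" MIME types, which the PR description does not list. Only the two YAML entries come from this PR.

```javascript
// Hypothetical lookup table from an XDATA MIME type to an injected language.
// Only the two YAML rows are taken from the PR description; the JSON and XML
// rows are assumptions standing in for the unnamed existing MIME types.
const MIME_TO_LANGUAGE = new Map([
  ['text/yaml', 'yaml'],
  ['application/yaml', 'yaml'],
  ['application/json', 'json'], // assumption
  ['text/xml', 'xml'],          // assumption
]);

// Returns the language to inject for a MIME type, or null when the type is
// not recognized, in which case no injection should happen.
function injectionLanguageForMime(mimeType) {
  const normalized = String(mimeType)
    .trim()
    .replace(/^"|"$/g, '') // tolerate a quoted value such as "text/yaml"
    .toLowerCase();
  return MIME_TO_LANGUAGE.get(normalized) ?? null;
}

console.log(injectionLanguageForMime('text/yaml'));
console.log(injectionLanguageForMime('application/yaml'));
console.log(injectionLanguageForMime('text/plain'));
```

Both YAML spellings resolve to the same language, and an unknown type yields null rather than a guess, so unrecognized XDATA content is left alone.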

Grammar Restructure

  • Updated:
    • core/grammar.js
    • expr/grammar.js
    • udl/grammar.js
    • udl/keywords.js
  • Result:
    • Keyword field placement moved to structural usage points
    • Cleaner AST shape for commands/structures
    • Better separation between token definitions and structural intent
    • Fixes for expression edge cases (including namespace-qualified global references)
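
Moving the keyword field to the usage site can be sketched in grammar.js style. This is a minimal illustration under stated assumptions, not the PR's actual rules: the rule names `keyword_set`, `command_set`, and `set_arguments` are hypothetical, the field name `keyword` is assumed, and `seq`, `field`, `sym`, and `pattern` are small stand-ins for the tree-sitter DSL (in a real grammar.js the DSL functions are globals supplied by the tree-sitter CLI, and `$.name` plays the role of `sym`) so that the snippet runs under plain Node.

```javascript
// Stand-ins for the tree-sitter grammar DSL. They build plain objects shaped
// like the rule nodes found in a generated grammar.json.
const seq = (...members) => ({ type: 'SEQ', members });
const field = (name, content) => ({ type: 'FIELD', name, content });
const sym = (name) => ({ type: 'SYMBOL', name });
const pattern = (value) => ({ type: 'PATTERN', value });

// Before (hypothetical): the keyword rule itself carries the field.
const before = {
  keyword_set: field('keyword', pattern('[sS]([eE][tT])?')),
  command_set: seq(sym('keyword_set'), sym('set_arguments')),
};

// After (hypothetical): the keyword rule is a bare token definition and the
// field is attached where the command uses it.
const after = {
  keyword_set: pattern('[sS]([eE][tT])?'),
  command_set: seq(field('keyword', sym('keyword_set')), sym('set_arguments')),
};

// Collects the field names declared directly inside a rule body.
function fieldNames(rule) {
  if (rule.type === 'FIELD') return [rule.name, ...fieldNames(rule.content)];
  if (rule.type === 'SEQ') return rule.members.flatMap(fieldNames);
  return [];
}

console.log(fieldNames(before.command_set)); // no fields on the command rule
console.log(fieldNames(after.command_set));  // the command rule now owns 'keyword'
```

With the field declared in the command rule, a query can address the keyword through the command node, while the keyword rule stays a plain token definition.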

Test Coverage Expansion

  • Added new expression corpus files under expr/test/corpus/ for broad builtin/system/macro/global/extrinsic coverage.
  • Updated many core/test/corpus/* and udl/test/corpus/* trees to align with grammar/query refactors and new parsing behavior.

Added Documentation

| File | Purpose |
| --- | --- |
| docs/c4.md | C4 model architecture diagrams showing system context, container relationships, and component structure with Mermaid diagrams |
| docs/arc42.md | Arc42-style architecture documentation covering goals, constraints, building blocks, runtime behavior, and design decisions |
| docs/design-doc.md | Google-style design document explaining the problem, goals, proposal, alternatives considered, and tradeoffs |
| docs/code-health-report.md | Code quality analysis identifying technical debt, improvement opportunities, and positive patterns |
| docs/usage-examples.md | Practical examples for parsing, writing queries, language injection, and editor integration |
| docs/data-structures/*.md | Detailed documentation for key grammar rules (expression, statement, class_definition, method_definition) |

Why This Documentation Helps

  1. For Contributors:

    • Explains the three-layer grammar architecture (expr, core, udl) and why it exists
    • Documents design decisions like keyword field placement and external scanner usage
    • Identifies areas for improvement and future work
    • Provides evidence-based references to specific code locations
  2. For Users/Integrators:

    • Shows how to build, test, and use the parser
    • Provides ready-to-use highlight and injection query examples
    • Documents editor integration steps for Zed, Neovim, and Emacs
    • Includes code examples for Rust, Python, Node.js, and Go bindings
  3. For Maintainers:

    • Code health report identifies technical debt and prioritizes fixes
    • Architecture docs serve as onboarding material for new maintainers
    • Data structure docs explain invariants and update paths for key rules

TESTING

Overview: To test these changes, I ran all the unit tests, exercised every injection, and checked highlighting in the tree-sitter playground.

Unit Tests

Verified that all tests pass in all three grammars:
image
image
image

Tested Injections:

image image image image image image image

Tested highlighting updates:

Foreignkey:
image

Index:
image

Method:
image

Parameter:
image

Projection:
image

Property:
image

Query:
image

Relationship:
image

Trigger:
image

    - Updated the highlight queries to be much faster and to follow better practices. Together with the grammar restructure, this should make future updates significantly easier.
    - Renamed captures to follow the Zed standard so that themes can pick them up.
    - Updated the XDATA injections to allow YAML.
    - Removed all fields and added a keyword field for every ObjectScript structure and command name.
    - Fixed global variables in the expression grammar so that input such as ^|"SAMPLES","IRISLIB"|patient(1), which previously produced an error, now parses.
    - Completed a refactor of all three grammars to follow best practices and reduce the parser.c file size, while still exposing the information needed by the language server and for good highlighting.
    - Added 437 unit tests to the expr grammar.