fix(upsert): canonical JSON Schema for handler tool registrations #235

Merged: chubes4 merged 1 commit into main from fix-upsert-canonical-schemas on May 9, 2026

chubes4 commented May 9, 2026

Why

Follow-up to #232 (chat tools canonical schemas).

Data Machine 0.106.1 PR #1900 ("Require canonical AI tool schemas") removes the auto-normalization in RequestBuilder::normalizeToolParameters, which previously converted legacy property-keyed parameter shapes into canonical JSON Schema before sending them to providers. From 0.106.1 onward, whatever a tool registration emits is sent to OpenAI verbatim as the tool schema, so handler-stage tools must already be canonical.

PR #232 converted the chat tools (static schemas in inc/Api/Chat/Tools/). This PR completes the work for the handler-stage upsert_event tool registered via the dm_handlers filter, whose schema is assembled at runtime by merging output from four provider methods.

⚠ Backwards-compatibility warning

This PR is NOT backwards-compatible with DM 0.103.x at the provider return-shape level. EventSchemaProvider::getCoreToolParameters(), EventSchemaProvider::getSchemaToolParameters(), EventSchemaProvider::getAllToolParameters(), and VenueParameterProvider::getToolParameters() now return canonical fragments ({ properties, required? }) instead of flat property bags ({ name => def }).
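
To make the shape change concrete, here is a minimal sketch in Python (the plugin itself is PHP, and the PR's actual code is not reproduced here). The function name `to_canonical_fragment` and the sample field definitions are hypothetical; only the two shapes, a flat property bag versus `{ properties, required? }`, come from the description above.

```python
from typing import Any


def to_canonical_fragment(flat_bag: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Convert a flat property bag ({name: definition}) into a canonical fragment."""
    properties: dict[str, dict[str, Any]] = {}
    required: list[str] = []
    for name, definition in flat_bag.items():
        # Copy so the caller's definition is not mutated.
        cleaned = dict(definition)
        # Legacy shape: per-property boolean. Canonical shape: top-level list.
        if cleaned.pop("required", False) is True:
            required.append(name)
        properties[name] = cleaned
    fragment: dict[str, Any] = {"properties": properties}
    if required:  # omit the key entirely when nothing is required
        fragment["required"] = required
    return fragment


# Hypothetical sample data, for illustration only.
legacy = {
    "title": {"type": "string", "description": "Event title", "required": True},
    "occurrenceDates": {
        "type": "array",
        "items": {"type": "string"},
        "description": "Additional dates",
        "required": False,
    },
}
fragment = to_canonical_fragment(legacy)
```

With this input, `fragment["required"]` is `["title"]` and neither property definition carries a `required` key any more. A caller written against the old flat shape would now have to read `fragment["properties"]` instead of iterating the return value directly, which is the breaking change flagged above.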

The only consumer of these methods inside this plugin is EventUpsertFilters::getDynamicEventTool(), which is updated in the same commit. There are no tests pinning the old shape (verified via grep -rn ... tests/).

Deploy this PR no later than the DM 0.106.1 upgrade so that the runtime schema shape matches what the provider expects. Canonical schemas are accepted by both 0.103.x (where the soon-to-be-removed normalizer passes them through unchanged) and 0.106.1+, so this PR can ship cleanly ahead of or alongside the DM bump, but not after it.

Caller audit

Ran:
```
grep -rnE 'getCoreToolParameters|getSchemaToolParameters|getTaxonomyToolParameters|VenueParameterProvider::getToolParameters' inc/ tests/
grep -rnE 'registerAITools|register_tool|registerTool' inc/Steps/
```

| Symbol | Caller (this repo) | Caller (other repos) | Action |
| --- | --- | --- | --- |
| `EventSchemaProvider::getCoreToolParameters` | `EventUpsertFilters` | none | converted (this PR) |
| `EventSchemaProvider::getSchemaToolParameters` | `EventUpsertFilters` | none | converted (this PR) |
| `EventSchemaProvider::getAllToolParameters` | none in inc/ or tests/ | unknown | converted (this PR) |
| `VenueParameterProvider::getToolParameters` | `EventUpsertFilters` | none | converted (this PR) |
| `TaxonomyHandler::getTaxonomyToolParameters` | `EventUpsertFilters` | `data-machine` core: `Steps/Publish/Handlers/WordPress/WordPress.php` | NOT touched (see below) |
| Handler tool registrations under `inc/Steps/` | only `EventUpsertFilters` (one site) | n/a | covered |

Cross-repo: `TaxonomyHandler` lives in data-machine core

`TaxonomyHandler::getTaxonomyToolParameters()` is owned by the data-machine plugin (`inc/Core/WordPress/TaxonomyHandler.php`), not this repo. It is also called from data-machine's own `WordPress.php` publish handler, which itself emits a legacy property-keyed schema with per-property `required => true|false` booleans. I did not modify those files per the orchestrator constraint to not touch other repos.

To stay decoupled, `EventUpsertFilters::composeCanonicalParameters()` accepts both shapes:

  • canonical fragment (`{ properties: {...}, required?: [...] }`) — used by my providers
  • flat property bag (`{ name => def, ... }`) — used by data-machine's `TaxonomyHandler`

The flat shape is treated as a degenerate fragment with no required fields. As long as data-machine's `TaxonomyHandler` continues to emit valid property definitions (each with `type`, `description`, and `items` for arrays — confirmed by reading the source), this PR works whether or not data-machine core is ever updated to canonical fragments.
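
The dual-shape handling can be sketched as follows. This is a Python illustration under stated assumptions, not the PHP helper from this PR: the function names, the detection heuristic (a dict-valued `properties` key marks a canonical fragment), and the sample fragments are all hypothetical.

```python
from typing import Any


def as_fragment(shape: dict[str, Any]) -> dict[str, Any]:
    """Normalize either accepted shape to {properties, required}."""
    # Assumed heuristic: a canonical fragment has a dict-valued "properties" key.
    if isinstance(shape.get("properties"), dict):
        return {
            "properties": shape["properties"],
            "required": list(shape.get("required", [])),
        }
    # Flat property bag: a degenerate fragment with no required fields.
    return {"properties": shape, "required": []}


def compose_canonical_parameters(*shapes: dict[str, Any]) -> dict[str, Any]:
    """Merge fragments and flat bags into one canonical object schema."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for shape in shapes:
        fragment = as_fragment(shape)
        properties.update(fragment["properties"])
        for name in fragment["required"]:
            if name not in required:  # keep order, drop duplicates
                required.append(name)
    schema: dict[str, Any] = {"type": "object", "properties": properties}
    # Only names that survived the merge may be listed as required.
    required = [name for name in required if name in properties]
    if required:
        schema["required"] = required
    return schema


# Hypothetical inputs: two canonical fragments and one flat bag.
core = {
    "properties": {"title": {"type": "string"}, "startTime": {"type": "string"}},
    "required": ["title", "startTime"],
}
venue = {"properties": {"venueName": {"type": "string"}}}
taxonomy = {"category": {"type": "array", "items": {"type": "string"}}}

schema = compose_canonical_parameters(core, venue, taxonomy)
```

One limitation of this heuristic is that a flat bag containing a parameter literally named `properties` would be misread as a canonical fragment; a real implementation may need a stricter test.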

🚩 Blocker for orchestrator review (out of scope here)

`data-machine` core's WordPress publish handler (`Steps/Publish/Handlers/WordPress/WordPress.php`, around lines 47–80) still emits a legacy property-keyed schema with per-property `required` booleans. That handler will fail OpenAI validation on DM 0.106.1 the same way the upsert_event handler would have failed without this PR. It needs a sibling fix in the data-machine repo before 0.106.1 can deploy cleanly to any site that uses the WordPress publish handler. This PR does not address that; it is surfaced here for tracking.
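
For tracking purposes, the legacy markers described above could be detected with a small check like the one below. It is a hypothetical Python sketch: it looks only for the markers named in this PR (no top-level `type: object`, no `properties` object, per-property boolean `required`), it is not a JSON Schema validator, and it does not reproduce OpenAI's own validation. The sample schemas are invented for illustration.

```python
from typing import Any


def legacy_schema_problems(schema: dict[str, Any]) -> list[str]:
    """List the legacy-shape markers found in a tool parameter schema."""
    problems: list[str] = []
    if schema.get("type") != "object":
        problems.append('missing top-level "type": "object"')
    properties = schema.get("properties")
    if not isinstance(properties, dict):
        problems.append('missing top-level "properties" object')
        # Legacy property-keyed schemas put the definitions at the top level.
        properties = {k: v for k, v in schema.items() if isinstance(v, dict)}
    for name, definition in properties.items():
        if isinstance(definition, dict) and isinstance(definition.get("required"), bool):
            problems.append(f'property "{name}" has a per-property boolean "required"')
    return problems


# Hypothetical schemas: one legacy property-keyed, one canonical.
legacy = {
    "title": {"type": "string", "required": True},
    "content": {"type": "string", "required": False},
}
canonical = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "content": {"type": "string"}},
    "required": ["title"],
}

legacy_report = legacy_schema_problems(legacy)
canonical_report = legacy_schema_problems(canonical)
```

Here `legacy_report` contains four entries (two structural, two per-property) and `canonical_report` is empty.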

Files changed

  • `inc/Core/DynamicToolParametersTrait.php` — `filterByEngineData()` now operates on canonical fragments and prunes filtered keys from both `properties` and `required`.
  • `inc/Core/EventSchemaProvider.php` — `fieldsToToolParameters()` emits canonical fragments. Per-property `required => true` booleans are aggregated into a top-level `required[]` array. `required` key omitted when empty. Per-property `required` keys removed.
  • `inc/Core/VenueParameterProvider.php` — `TOOL_PARAMETERS` constant cleaned up (removed redundant `required => false`). `getToolParameters()` and `getAllParameters()` return canonical fragments.
  • `inc/Steps/Upsert/Events/EventUpsertFilters.php` — `getDynamicEventTool()` composes fragments via new `composeCanonicalParameters()` private helper. Output is a canonical `{ type: object, properties, required? }` schema.
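
The pruning behaviour listed for `filterByEngineData()` can be illustrated with a short Python sketch. The rule for when a key is filtered is an assumption here (a key is dropped when engine data already holds a non-empty value for it); the source only states that filtered keys are removed from both `properties` and `required`. The function name and sample values are hypothetical.

```python
from typing import Any

# Values treated as "not supplied" (an assumption of this sketch).
_EMPTY = (None, "", [], {})


def filter_by_engine_data(
    fragment: dict[str, Any], engine_data: dict[str, Any]
) -> dict[str, Any]:
    """Drop parameters already supplied by engine data from a canonical fragment."""
    properties = fragment.get("properties", {})
    supplied = {name for name in properties if engine_data.get(name) not in _EMPTY}
    result: dict[str, Any] = {
        "properties": {n: d for n, d in properties.items() if n not in supplied}
    }
    # Prune the same keys from "required" so it never names a missing property.
    required = [n for n in fragment.get("required", []) if n not in supplied]
    if required:
        result["required"] = required
    return result


# Hypothetical fragment and engine data.
core = {
    "properties": {
        "title": {"type": "string"},
        "startTime": {"type": "string"},
        "description": {"type": "string"},
    },
    "required": ["title", "startTime", "description"],
}
filtered = filter_by_engine_data(core, {"startTime": "2026-05-09T20:00"})
```

After filtering, `startTime` is gone from `filtered["properties"]` and `filtered["required"]` is `["title", "description"]`. Pruning both places matters because a `required` entry that names a property absent from `properties` would leave the schema inconsistent.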

Test plan

  • PHP syntax: `php -l` clean on all 4 changed files.
  • homeboy lint: `homeboy lint --path . --changed-since main` reports zero findings on any of the 4 edited files (only pre-existing baseline noise in unrelated `UniversalWebScraper.php` etc., consistent with PR #232).
  • Functional smoke test — ran a script that loads the providers, calls each method, and verifies:
    • `getCoreToolParameters()` returns `{ properties: 7 keys, required: ['title','startTime','description'] }`
    • `occurrenceDates` carries `items: { type: 'string' }`
    • Engine-data filter on `startTime` removes it from both `properties` and `required` (final `required: ['title','description']`)
    • `getToolParameters()` from `VenueParameterProvider` returns a fragment with 11 properties and no `required` key (correct — venue presence enforced upstream).
  • End-to-end composition test — simulated the full `EventUpsertFilters` path with a flat-shape `TaxonomyHandler` mock. Result: 33 merged properties, correct `required: ['title','startTime','description']`, all array types carry `items`, no spurious required entries from venue/taxonomy fragments. Final shape is canonical `{ type: object, properties, required }`.
  • Live tool call — exercise the upsert_event handler in a flow run on DM 0.103.14 (current production) to confirm canonical schema passes the (still-active) normalizer; then again post-DM-0.106.1 to confirm passthrough acceptance.
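
The structural checks in the smoke and composition tests above (array types carry `items`, no spurious `required` entries) could be expressed roughly as below. This is a hypothetical Python sketch, not the script that was actually run; the function name and the sample schema are invented for illustration.

```python
from typing import Any


def check_tool_schema(schema: dict[str, Any]) -> dict[str, list[str]]:
    """Report array properties without 'items' and required names without a property."""
    properties = schema.get("properties", {})
    arrays_missing_items = [
        name
        for name, definition in properties.items()
        if definition.get("type") == "array"
        and not isinstance(definition.get("items"), dict)
    ]
    dangling_required = [
        name for name in schema.get("required", []) if name not in properties
    ]
    return {
        "arrays_missing_items": arrays_missing_items,
        "dangling_required": dangling_required,
    }


# Hypothetical composed schema with one deliberate defect of each kind.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "occurrenceDates": {"type": "array", "items": {"type": "string"}},
        "tags": {"type": "array"},
    },
    "required": ["title", "startTime"],
}
report = check_tool_schema(schema)
```

For this input the report flags `tags` (array without `items`) and `startTime` (required but not defined); a schema that passes both checks yields two empty lists.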

Constraints honored

  • Branched off `main`, no commits to `main`.
  • No `CHANGELOG.md` or version-constant edits — homeboy owns those.
  • No deploy, no release.
  • No edits in `wp-content/plugins/` on the VPS.
  • No additional minions spawned.
  • No edits in other repos (`data-machine`, etc.). Cross-repo concern surfaced as a blocker note above.

Follow-up to PR #232. Converts the upsert_event handler tool
registration to emit a canonical JSON Schema (type: object,
properties, required) instead of a flat property bag.

Required for DM 0.106.1 compatibility. PR #1900 in data-machine
core makes RequestBuilder::normalizeToolParameters a passthrough,
so handler tool definitions must already be canonical when registered.

NOT backwards-compatible with DM 0.103.x at the provider return shape
level: EventSchemaProvider and VenueParameterProvider now return
canonical fragments ({properties, required?}) instead of flat
property bags. The handler tool wire-up (EventUpsertFilters) is the
only consumer of these providers in this plugin. No tests pin the
old shape.

TaxonomyHandler::getTaxonomyToolParameters() in data-machine core
still returns a flat property bag. EventUpsertFilters' composer
absorbs that shape transparently as a degenerate fragment with no
required fields, so this PR works regardless of when DM core
converts TaxonomyHandler to canonical fragments.

Changes:
- DynamicToolParametersTrait::filterByEngineData() now operates on
  canonical fragments and prunes filtered keys from both 'properties'
  and 'required'.
- EventSchemaProvider::fieldsToToolParameters() emits canonical
  fragments. Per-property 'required' booleans are aggregated into a
  top-level required[] array. 'required' key omitted when empty.
- VenueParameterProvider returns a canonical fragment with no
  required fields (venue presence is enforced upstream).
- EventUpsertFilters::getDynamicEventTool() composes fragments via
  new composeCanonicalParameters() helper. Tolerates both canonical
  fragment shape and legacy flat property bag from TaxonomyHandler.
- All array-typed properties carry an 'items' schema (already in
  source declarations; preserved through fieldsToToolParameters).

homeboy-ci Bot commented May 9, 2026

Homeboy Results — data-machine-events

Audit

audit — passed

  • test_coverage — 4 finding(s)
  • parallel-implementation — 2 finding(s)
  • dead_code — 1 finding(s)
  • Total: 7 finding(s)

Deep dive: homeboy audit data-machine-events --changed-since c629133

Tooling versions
  • Homeboy CLI: homeboy 0.163.1+a300b0b
  • Extension: wordpress from https://github.com/Extra-Chill/homeboy-extensions
  • Extension revision: 56b5c09
  • Action: Extra-Chill/homeboy-action@v2
