
feat(avro-to-json): add wireFormat configuration parameter #357

Merged
WilliamBerryiii merged 3 commits into main from feat/337-avro-wire-config
Apr 7, 2026
Conversation

@katriendg
Collaborator

Description

Added an optional wireFormat configuration parameter to the avro-to-json WASM operator, enabling deterministic Avro parsing when the wire format of incoming messages is known. The parameter accepts three values — auto, confluent, and raw — and defaults to auto, which preserves the existing try-raw-then-retry behavior for full backward compatibility.

Rust Wasm Operator

The core change introduced a WireFormat enum and a WIRE_FORMAT OnceLock static in lib.rs. The existing parse_with_schema function was refactored: a new parse_with_schema_inner function accepts the wire format explicitly and dispatches to the correct parsing path. The original parse_with_schema now delegates to the inner function using the configured format, keeping the public API unchanged.
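
The delegation described above can be sketched roughly as follows. This is a minimal illustration, not the code in lib.rs: `decode_raw` is a hypothetical stand-in for the schema-driven Avro decoder (it decodes a single zigzag varint `long`), the `&str` schema parameter is a placeholder, and the Auto retry is assumed to strip the 5-byte prefix after a failed raw attempt.

```rust
use std::sync::OnceLock;

/// The three wire formats named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireFormat {
    Auto,
    Confluent,
    Raw,
}

/// Set once during initialization, read on every message.
static WIRE_FORMAT: OnceLock<WireFormat> = OnceLock::new();

/// Hypothetical stand-in for the real decoder: reads one zigzag varint
/// `long` and requires it to consume the entire payload.
fn decode_raw(data: &[u8]) -> Result<i64, String> {
    let mut value: u64 = 0;
    for (i, &byte) in data.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            if i + 1 != data.len() {
                return Err(format!("{} trailing bytes", data.len() - i - 1));
            }
            // Zigzag decoding maps the unsigned value back to a signed one.
            return Ok((value >> 1) as i64 ^ -((value & 1) as i64));
        }
    }
    Err("truncated or oversized varint".to_string())
}

/// Drops the 5-byte prefix, or fails if the payload is too short.
fn strip_prefix(data: &[u8]) -> Result<&[u8], String> {
    data.get(5..)
        .ok_or_else(|| format!("payload of {} bytes has no 5-byte prefix", data.len()))
}

/// Takes the wire format explicitly, so each path can be tested directly.
pub fn parse_with_schema_inner(
    data: &[u8],
    _schema: &str, // placeholder: the real function uses a parsed schema
    wire_format: WireFormat,
) -> Result<i64, String> {
    match wire_format {
        WireFormat::Raw => decode_raw(data),
        WireFormat::Confluent => decode_raw(strip_prefix(data)?),
        // Assumed retry order: raw first, then the prefixed form.
        WireFormat::Auto => match decode_raw(data) {
            Ok(value) => Ok(value),
            Err(_) => decode_raw(strip_prefix(data)?),
        },
    }
}

/// Unchanged public entry point: delegates using the configured format.
pub fn parse_with_schema(data: &[u8], schema: &str) -> Result<i64, String> {
    let format = WIRE_FORMAT.get().copied().unwrap_or(WireFormat::Auto);
    parse_with_schema_inner(data, schema, format)
}
```

Passing the format as an argument rather than reading the static inside the parser is what lets unit tests exercise all three variants in one process, since a `OnceLock` can only be set once.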

Configuration parsing in avro_init reads the wireFormat property with case-insensitive matching, logs a warning for unknown values (defaulting to Auto), and logs the selected format at Info level for observability.
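
For reviewers, the matching rules can be pictured with a sketch like the one below. The function name `wire_format_from_config` and the warning text are hypothetical, and the warning is returned as a value instead of being logged so the example does not assume any particular logging API.

```rust
/// The three wire formats named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireFormat {
    Auto,
    Confluent,
    Raw,
}

/// Hypothetical helper: maps the optional `wireFormat` property to a variant.
/// Returns the selected format plus a warning message for unknown values.
pub fn wire_format_from_config(value: Option<&str>) -> (WireFormat, Option<String>) {
    match value {
        // A missing property keeps the backward-compatible default.
        None => (WireFormat::Auto, None),
        Some(raw) => match raw.to_ascii_lowercase().as_str() {
            "auto" => (WireFormat::Auto, None),
            "confluent" => (WireFormat::Confluent, None),
            "raw" => (WireFormat::Raw, None),
            // Unknown values fall back to Auto and surface a warning.
            _ => (
                WireFormat::Auto,
                Some(format!("unknown wireFormat '{}', defaulting to auto", raw)),
            ),
        },
    }
}
```

With this shape, `Some("Confluent")` and `Some("CONFLUENT")` both select `WireFormat::Confluent`, while a typo such as `Some("confluentt")` yields `WireFormat::Auto` together with a warning string.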

Seven unit tests cover all three variants — raw success/failure, confluent success/failure, auto success on both raw and confluent data, and a short-payload edge case for confluent mode.

Graph YAML and Blueprint

The graph-avro-to-json.yaml gained a wireFormat parameter declaration in moduleConfigurations. The blueprint dataflow-graphs-avro-json.tfvars.example added a wireFormat key-value entry with inline documentation of all three options, demonstrating the Confluent use case.

Version

Package version bumped from 1.0.0 to 1.1.0 in both Cargo.toml and Cargo.lock.

Related Issue

Fixes #337

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Blueprint modification or addition
  • Component modification or addition
  • Documentation update
  • CI/CD pipeline change
  • Other (please describe):

Implementation Details

  • Introduced WireFormat enum (Auto, Confluent, Raw) with OnceLock-based static initialization
  • Extracted parse_with_schema_inner(data, schema, wire_format) from parse_with_schema for testable, format-specific dispatch
  • avro_init parses the wireFormat configuration property with case-insensitive matching and defaults to Auto for unknown or missing values
  • Error messages are mode-specific for production debugging (e.g., Confluent mode reports missing 5-byte prefix with payload size)
  • Graph YAML declares the new parameter for operator discovery
  • Blueprint tfvars example documents all three options with the Confluent use case
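
As an illustration of the mode-specific error reporting, a Confluent-framed message starts with a magic byte of 0 followed by a 4-byte big-endian schema ID, then the Avro body. The sketch below shows one way to split that prefix and report the payload size on failure; the function name, the magic-byte check, and the exact error wording are assumptions, not taken from the PR.

```rust
/// Hypothetical helper: splits the 5-byte Confluent prefix from the Avro body.
/// Returns the schema ID and the remaining bytes.
pub fn split_confluent_header(data: &[u8]) -> Result<(u32, &[u8]), String> {
    const HEADER_LEN: usize = 5;
    if data.len() < HEADER_LEN {
        // Include the payload size so the failure is diagnosable from logs.
        return Err(format!(
            "confluent mode: payload of {} bytes is shorter than the {}-byte prefix",
            data.len(),
            HEADER_LEN
        ));
    }
    if data[0] != 0 {
        return Err(format!(
            "confluent mode: expected magic byte 0x00, found {:#04x}",
            data[0]
        ));
    }
    // Bytes 1..5 carry the schema ID in big-endian order.
    let schema_id = u32::from_be_bytes([data[1], data[2], data[3], data[4]]);
    Ok((schema_id, &data[HEADER_LEN..]))
}
```

A 2-byte payload produces an error mentioning "2 bytes", which is the kind of detail that helps distinguish a truncated message from a producer that is sending raw Avro.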

Testing Performed

  • Terraform plan/apply
  • Blueprint deployment test
  • Unit tests
  • Integration tests
  • Bug fix includes regression test (see Test Policy)
  • Manual validation
  • Other:

Validation Steps

  1. Review the WireFormat enum and parse_with_schema_inner dispatch logic in lib.rs
  2. Run cargo test in src/500-application/512-avro-to-json/operators/avro-to-json/ — all 7 new wireFormat tests should pass
  3. Verify the graph YAML declares wireFormat in moduleConfigurations
  4. Verify the blueprint tfvars example includes the wireFormat key-value entry
  5. Confirm version bump to 1.1.0 in Cargo.toml
  6. Run a blueprint deployment with the updated graph and validate correct parsing behavior for each wire format option (requires appropriate test messages for raw and Confluent formats)

Checklist

  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have run terraform fmt on all Terraform code
  • I have run terraform validate on all Terraform code
  • I have run az bicep format on all Bicep code
  • I have run az bicep build to validate all Bicep code
  • I have checked for any sensitive data/tokens that should not be committed
  • Lint checks pass (run applicable linters for changed file types)

Security Review

  • No credentials, secrets, or tokens are hardcoded or logged
  • RBAC and identity changes follow least-privilege principles
  • No new network exposure or public endpoints introduced without justification
  • Dependency additions or updates have been reviewed for known vulnerabilities
  • Container image changes use pinned digests or SHA references

Additional Notes

This is a non-breaking, additive change. The default auto behavior matches the current implementation exactly. Existing deployments require no configuration changes.

Add optional wireFormat parameter with values auto, confluent, raw
to enable deterministic Avro parsing. Extract parse_with_schema_inner
for testability. Update graph YAML, blueprint tfvars example, and
README documentation. Bump version to 1.1.0.
@katriendg katriendg requested a review from a team as a code owner April 7, 2026 14:28
@github-actions

github-actions bot commented Apr 7, 2026

📚 Documentation Health Report

Generated on: 2026-04-07 14:31:54 UTC

📈 Documentation Statistics

| Category | File Count |
| --- | --- |
| Main Documentation | 235 |
| Infrastructure Components | 191 |
| Blueprints | 39 |
| Learning Platform | 89 |
| GitHub Resources | 44 |
| AI Assistant Guides (Copilot) | 17 |
| Total | 615 |

🏗️ Three-Tree Architecture Status

  • ✅ Bicep Documentation Tree: Auto-generated navigation
  • ✅ Terraform Documentation Tree: Auto-generated navigation
  • ✅ README Documentation Tree: Manual README organization

🔍 Quality Metrics

  • Frontmatter Validation: success
  • Sidebar Generation: success
  • Link Validation: success
  • Build Test: skipped

This report is automatically generated by the Documentation Automation workflow.

- Add commit 9d49582 to .gitleaks.toml commit allowlist
- SECRETLINT_FAILED=true shell variable flagged as generic-api-key

🔒 - Generated by Copilot

@WilliamBerryiii
Member

Filed #359 to track the missing avro_init configuration parsing tests.

@WilliamBerryiii WilliamBerryiii merged commit e5d1833 into main Apr 7, 2026
38 checks passed
@WilliamBerryiii WilliamBerryiii deleted the feat/337-avro-wire-config branch April 7, 2026 16:59

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(avro-to-json): add wireFormat configuration parameter for deterministic Avro parsing

4 participants