
codegen: emit write_to fields in field-number order, not by kind #78

Merged
iainmcgin merged 3 commits into main from fix/75-field-number-order
Apr 28, 2026

@iainmcgin
Collaborator

Closes #75.

Problem

write_to emitted all singular fields before all repeated fields (and those before oneofs, then maps) — fields were grouped by kind rather than serialized in field-number order. For a message like:

message Changeset {
  bytes tree_id = 1;
  ...
  repeated bytes parents = 6;
  repeated FileChange changes = 7;
  optional bytes signature = 8;
}

buffa emitted tags 1,2,3,4,5,8,6,7; prost / protoc-C++ emit 1,2,3,4,5,6,7,8. Decoders accept any order so round-trip tests pass, but anything content-addressing the serialized bytes (e.g. blake3(encode(msg))) diverges.
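
To make the divergence concrete, here is a minimal sketch that hand-encodes a few length-delimited fields in two different orders. It does not use buffa or prost; `put_varint`, `encode`, and `tag_sequence` are hypothetical helpers written only for this illustration, and the payloads are placeholders.

```rust
/// Append `v` as a protobuf base-128 varint.
fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        buf.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Encode (field number, payload) pairs as length-delimited fields
/// (wire type 2), in exactly the order given.
fn encode(fields: &[(u32, &str)]) -> Vec<u8> {
    let mut buf = Vec::new();
    for &(number, payload) in fields {
        put_varint(&mut buf, (u64::from(number) << 3) | 2);
        put_varint(&mut buf, payload.len() as u64);
        buf.extend_from_slice(payload.as_bytes());
    }
    buf
}

/// Read one varint starting at `*pos`, advancing `*pos` past it.
fn get_varint(buf: &[u8], pos: &mut usize) -> u64 {
    let mut v = 0u64;
    let mut shift = 0u32;
    loop {
        let b = buf[*pos];
        *pos += 1;
        v |= u64::from(b & 0x7f) << shift;
        if b & 0x80 == 0 {
            return v;
        }
        shift += 7;
    }
}

/// Return the field numbers in wire order. Assumes every field is
/// length-delimited, which holds for the output of `encode` above.
fn tag_sequence(buf: &[u8]) -> Vec<u32> {
    let mut pos = 0;
    let mut numbers = Vec::new();
    while pos < buf.len() {
        let key = get_varint(buf, &mut pos);
        numbers.push((key >> 3) as u32);
        let len = get_varint(buf, &mut pos) as usize;
        pos += len; // skip the payload
    }
    numbers
}
```

Encoding fields 1, 8, 6, 7 and fields 1, 6, 7, 8 with the same payloads yields buffers of equal length that carry the same field set, yet the byte sequences differ, so any hash over the encoded bytes differs as well.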

Fix

Replaced the four-bucket ClassifiedFields struct with a FieldKind enum and a single classify_fields_ordered() that returns fields sorted by number (oneof groups positioned at their lowest member). generate_message_impl and build_view_encode_methods now iterate that list once, building compute_size / write_to / merge_field / clear token streams in lockstep.
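
The ordering rule can be sketched roughly as follows. This is an illustration only, not the code in impl_message.rs: `Shape`, `Field`, and the variant payloads of `FieldKind` are simplified, hypothetical stand-ins for the real descriptor types, and the function returns plain `(number, kind)` pairs.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of how a field is declared.
#[derive(Clone, Copy)]
enum Shape {
    Singular,
    Repeated,
    Map,
    /// Member of the oneof with the given index.
    OneofMember(usize),
}

/// Hypothetical flattened field descriptor.
struct Field {
    name: &'static str,
    number: u32,
    shape: Shape,
}

/// One entry in the ordered list (stand-in for the PR's FieldKind).
#[derive(Debug, PartialEq)]
enum FieldKind {
    Singular(&'static str),
    Repeated(&'static str),
    Map(&'static str),
    Oneof { index: usize, members: Vec<&'static str> },
}

/// Return entries sorted by field number. Each oneof group appears once,
/// positioned at the lowest field number among its members.
fn classify_fields_ordered(fields: &[Field]) -> Vec<(u32, FieldKind)> {
    let mut ordered = Vec::new();
    // oneof index -> (lowest member number, member names in declaration order)
    let mut oneofs: BTreeMap<usize, (u32, Vec<&'static str>)> = BTreeMap::new();
    for f in fields {
        match f.shape {
            Shape::Singular => ordered.push((f.number, FieldKind::Singular(f.name))),
            Shape::Repeated => ordered.push((f.number, FieldKind::Repeated(f.name))),
            Shape::Map => ordered.push((f.number, FieldKind::Map(f.name))),
            Shape::OneofMember(index) => {
                let group = oneofs.entry(index).or_insert((f.number, Vec::new()));
                group.0 = group.0.min(f.number);
                group.1.push(f.name);
            }
        }
    }
    for (index, (lowest, members)) in oneofs {
        ordered.push((lowest, FieldKind::Oneof { index, members }));
    }
    // Field numbers are unique within a message, so this order is total.
    ordered.sort_by_key(|(number, _)| *number);
    ordered
}
```

A caller would then walk the returned list once and append to each generated method body per entry, instead of walking four per-kind buckets in sequence.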

The lockstep matters: post-#22, SizeCache allocates slots via reserve()/set() during compute_size and reads them via consume_next() during write_to in traversal order. Building both from the same loop body keeps them aligned by construction.
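
A toy model shows why the two traversals have to agree. The `SizeCache` below is a hypothetical reduction that keeps only the reserve()/set()/consume_next() shape described above; `Sub`, the `order` parameter, and the one-byte tag and length encoding are assumptions made to keep the sketch short, not buffa's actual types.

```rust
/// Toy cache: slots are filled in compute_size order and read back
/// strictly in sequence. Not buffa's real SizeCache.
#[derive(Default)]
struct SizeCache {
    slots: Vec<u32>,
    cursor: usize,
}

impl SizeCache {
    fn reserve(&mut self) -> usize {
        self.slots.push(0);
        self.slots.len() - 1
    }
    fn set(&mut self, slot: usize, size: u32) {
        self.slots[slot] = size;
    }
    fn consume_next(&mut self) -> u32 {
        let size = self.slots[self.cursor];
        self.cursor += 1;
        size
    }
}

/// Hypothetical length-delimited field with a small number and payload.
struct Sub {
    number: u32,
    payload: Vec<u8>,
}

/// Visit fields in `order`, caching each payload length.
fn compute_size(subs: &[Sub], order: &[usize], cache: &mut SizeCache) -> u32 {
    let mut total = 0;
    for &i in order {
        let slot = cache.reserve();
        let len = subs[i].payload.len() as u32;
        cache.set(slot, len);
        total += 1 + 1 + len; // one tag byte + one length byte + payload
    }
    total
}

/// Visit fields in `order`, taking each length prefix from the cache.
fn write_to(subs: &[Sub], order: &[usize], cache: &mut SizeCache, out: &mut Vec<u8>) {
    for &i in order {
        let len = cache.consume_next();
        out.push(((subs[i].number << 3) | 2) as u8);
        out.push(len as u8);
        out.extend_from_slice(&subs[i].payload);
    }
}
```

When both functions receive the same `order`, every length prefix matches its payload. If `write_to` visits the fields in a different order than `compute_size` did, the prefixes are attached to the wrong payloads and the output is malformed, which is the failure that generating both bodies from one loop rules out.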

Net change in impl_message.rs is −67 lines (excluding new tests).

Tests

  • buffa-test/protos/edge_cases.proto: new FieldOrdering (interleaved scalar/repeated/map/sub-message/oneof) and FieldOrderingShuffled (same fields, declared in reverse number order).
  • edge_cases.rs: tag-sequence assertions for owned and view encode, owned↔view byte equality, golden bytes for the deterministic subset, plus a round-trip.
  • impl_message.rs: unit tests for classify_fields_ordered covering cross-kind ordering and oneof-at-min-number placement.
  • Conformance suite (std / no_std / via-view) unchanged.

Regenerated

buffa-descriptor/src/generated/ reordered (790 lines moved, no logic change). buffa-types/src/generated/ and the logging example were unaffected — none of the WKTs interleave field kinds.

Replace the four per-kind buckets in classify_fields with a single
field-number-sorted pass that dispatches to the per-kind stmt builders
inline. compute_size and write_to are built in the same loop body, so the
SizeCache reserve/consume_next traversal order stays aligned by
construction. Oneof groups are positioned at their lowest member field
number. Applied to both the owned and view encode paths.

Fixes byte-level divergence from prost / protoc-C++ for messages that mix
a high-numbered singular field with a lower-numbered repeated/map/oneof.

Closes #75
@github-actions

github-actions Bot commented Apr 28, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Route the sort key through validated_field_number() so a hand-crafted
descriptor with a missing or out-of-range field.number surfaces as a
CodeGenError instead of silently sorting to position 0. The
oneof_index unwrap is provably safe (guarded by is_real_oneof_member)
and now documents that with .expect().
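
As a rough illustration of that validation, the sketch below rejects a missing or out-of-range number before it can be used as a sort key. The signature of `validated_field_number`, the `CodeGenError` variants, and `sort_keys` are all assumptions made for this example; only the valid range (1 through 2^29 - 1) comes from the protobuf specification.

```rust
/// Hypothetical error type; the real CodeGenError variants may differ.
#[derive(Debug, PartialEq)]
enum CodeGenError {
    MissingFieldNumber { field: String },
    FieldNumberOutOfRange { field: String, number: i32 },
}

/// Largest field number protobuf allows (2^29 - 1).
const MAX_FIELD_NUMBER: i32 = 536_870_911;

/// Assumed signature: descriptors carry `number` as an optional int32.
fn validated_field_number(field: &str, number: Option<i32>) -> Result<u32, CodeGenError> {
    match number {
        None => Err(CodeGenError::MissingFieldNumber {
            field: field.to_string(),
        }),
        Some(n) if (1..=MAX_FIELD_NUMBER).contains(&n) => Ok(n as u32),
        Some(n) => Err(CodeGenError::FieldNumberOutOfRange {
            field: field.to_string(),
            number: n,
        }),
    }
}

/// Build sorted keys, failing on the first invalid field rather than
/// letting a defaulted 0 sort to the front.
fn sort_keys(fields: &[(&str, Option<i32>)]) -> Result<Vec<u32>, CodeGenError> {
    let mut keys = fields
        .iter()
        .map(|&(name, number)| validated_field_number(name, number))
        .collect::<Result<Vec<_>, _>>()?;
    keys.sort_unstable();
    Ok(keys)
}
```

By contrast, a key computed with something like `number.unwrap_or_default()` would silently place a malformed field first in the output.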
@iainmcgin iainmcgin marked this pull request as ready for review April 28, 2026 19:47
@iainmcgin iainmcgin enabled auto-merge (squash) April 28, 2026 20:16
@iainmcgin iainmcgin merged commit 8d51329 into main Apr 28, 2026
7 checks passed
@iainmcgin iainmcgin deleted the fix/75-field-number-order branch April 28, 2026 20:24
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 28, 2026
Development

Successfully merging this pull request may close these issues.

codegen: write_to emits singular fields before repeated fields, not in field-number order
