Namespace ancillary codegen output into parallel kind-trees (view::, oneofs::, ext::) #54

Closed
iainmcgin wants to merge 4 commits into main from prototype/codegen-namespacing

Conversation

@iainmcgin
Collaborator

@iainmcgin iainmcgin commented Apr 17, 2026

Summary

Ancillary types emitted by buffa-codegen — borrowed views, oneof enums, oneof view enums, extension consts, and the per-file registration fn — previously lived in the same Rust namespace as user-declared proto types and could collide with them. For example, a message FooView in a .proto would clash with the generated view type of sibling message Foo; message DataType { oneof data_type {...} } would put a struct and an enum of the same name side by side.

This PR moves every ancillary generated type into a dedicated kind-namespaced parallel tree (view::, oneofs::, ext::, view::oneofs::), structurally deconflicting the codegen output from proto-derived names. The design rule (now documented in DESIGN.md) is:

Ancillary kinds are outermost; modifiers (only view today) wrap them; inside every tree the proto message path is mirrored.

Layout changes

| Before | After |
|---|---|
| pkg::FooView | pkg::view::FooView |
| pkg::foo::BarView (nested msg view) | pkg::view::foo::BarView |
| pkg::foo::KindOneof (oneof enum) | pkg::oneofs::foo::Kind |
| pkg::foo::KindOneofView (oneof view enum) | pkg::view::oneofs::foo::Kind |
| pkg::FOO (extension const) | pkg::ext::FOO |
| pkg::register_types | pkg::ext::register_types |
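To make the "After" column concrete, here is a hand-written mock of the module shape it describes. This is a sketch, not actual buffa-codegen output: the message `Foo`, its nested `Bar`, the oneof `kind`, the variant names, the type of `FOO`, and the `register_types` signature are all assumptions chosen for illustration.

```rust
// Hand-written mock of the "After" layout; not real buffa-codegen output.
pub mod pkg {
    /// Owned message: stays at the package root.
    #[derive(Debug, PartialEq)]
    pub struct Foo {
        pub kind: Option<oneofs::foo::Kind>,
    }

    /// Proto-derived module holding types nested in `Foo`.
    pub mod foo {
        #[derive(Debug, PartialEq)]
        pub struct Bar {
            pub name: String,
        }
    }

    /// Owned oneof enums, mirroring the proto message path.
    pub mod oneofs {
        pub mod foo {
            #[derive(Debug, PartialEq)]
            pub enum Kind {
                Name(String),
                Id(i32),
            }
        }
    }

    /// Borrowed views, mirroring the proto message path.
    pub mod view {
        #[derive(Debug, PartialEq)]
        pub struct FooView<'a> {
            // Resolves to `pkg::view::oneofs`, the view-modified oneof tree.
            pub kind: Option<oneofs::foo::Kind<'a>>,
        }

        pub mod foo {
            #[derive(Debug, PartialEq)]
            pub struct BarView<'a> {
                pub name: &'a str,
            }
        }

        pub mod oneofs {
            pub mod foo {
                #[derive(Debug, PartialEq)]
                pub enum Kind<'a> {
                    Name(&'a str),
                    Id(i32),
                }
            }
        }
    }

    /// Extension consts and the registration fn (stand-in signatures).
    pub mod ext {
        pub const FOO: u32 = 1000;

        pub fn register_types(registry: &mut Vec<&'static str>) {
            registry.push("pkg.Foo");
        }
    }
}
```

The same relative path `oneofs::foo::Kind` names the owned enum when written inside `pkg` and the view enum when written inside `pkg::view`, which is the mirroring the design rule describes.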

Suffix policy

  • Message view structs keep the View suffix (pkg::view::FooView). This is the one documented exception to the "tree disambiguates, so drop the suffix" rule. Rationale: users commonly import a message and its view into the same scope (use pkg::{Foo, view::FooView};), and a scope-level rename would add friction at every call site.
  • Oneof enums and oneof view enums both drop their suffixes — the parallel-tree position is the disambiguator, and neither name is commonly co-imported with an unrelated type called Kind.

Co-import ergonomics

use pkg::{Foo, view::FooView};
fn process(m: &Foo, v: FooView<'_>) { /* ... */ }
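The snippet above covers the message/view pair. For contrast, here is a sketch of what co-importing the two suffix-less oneof enums looks like: both are named `Kind`, so one of them needs an `as` alias. The module bodies are hand-written stand-ins for generated code, and `KindView`, `as_view`, and `kind_as_view` are hypothetical names used only in this example.

```rust
// Hand-written stand-in for the generated layout; field and variant names are assumptions.
mod pkg {
    pub struct Foo {
        pub name: String,
    }
    pub mod view {
        pub struct FooView<'a> {
            pub name: &'a str,
        }
        pub mod oneofs {
            pub mod foo {
                pub enum Kind<'a> {
                    Name(&'a str),
                }
            }
        }
    }
    pub mod oneofs {
        pub mod foo {
            pub enum Kind {
                Name(String),
            }
        }
    }
}

// Message and view share a scope with no rename, thanks to the retained suffix.
use pkg::{view::FooView, Foo};
// Both oneof enums are called `Kind`; importing both requires a local alias.
use pkg::oneofs::foo::Kind;
use pkg::view::oneofs::foo::Kind as KindView;

/// Borrow an owned message as its view.
fn as_view(m: &Foo) -> FooView<'_> {
    FooView { name: m.name.as_str() }
}

/// Borrow an owned oneof value as its view counterpart.
fn kind_as_view(k: &Kind) -> KindView<'_> {
    match k {
        Kind::Name(s) => KindView::Name(s.as_str()),
    }
}
```

This is the trade-off the suffix policy accepts: the alias is only needed in the rarer case where owned and view oneof enums meet in one scope.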

Codegen internals

  • GeneratedFile gains kind: GeneratedFileKind and package: String fields. GeneratedFileKind has five variants: Owned, View, Ext, Oneofs, ViewOneofs. Each proto produces five sibling output files (.rs, .__view.rs, .__ext.rs, .__oneofs.rs, .__view_oneofs.rs); empty bodies when the kind has no content for that proto.
  • generate_module_tree accepts &[ModuleTreeEntry<'_>] and coalesces per-package streams across multiple .proto files so packages that span files (e.g. google.rpc) merge their view:: / ext:: / oneofs:: / view::oneofs:: blocks into single wrappers per kind.
  • CodeGenError::OneofNameConflict and the reserved-name machinery around oneof-vs-nested-type collisions are deleted — the collisions they guarded against are now structurally impossible.
  • Checked-in generated code (WKTs + bootstrap descriptor types) fully regenerated. Bootstrap uses generate_views=false, so its .__view*.rs siblings are empty stubs; the file count is kept consistent for the module tree stitcher.
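As a rough illustration of the five-sibling invariant, the sketch below models the kind enum and the file suffixes listed above. The variant names and suffixes come from this description; `ALL`, `file_suffix`, `wrapper_path`, and `sibling_file_names` are hypothetical helpers and are not claimed to exist in buffa-codegen.

```rust
/// The five kinds named in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GeneratedFileKind {
    Owned,
    View,
    Ext,
    Oneofs,
    ViewOneofs,
}

impl GeneratedFileKind {
    /// Hypothetical: every kind, in the order the description lists the files.
    pub const ALL: [GeneratedFileKind; 5] = [
        GeneratedFileKind::Owned,
        GeneratedFileKind::View,
        GeneratedFileKind::Ext,
        GeneratedFileKind::Oneofs,
        GeneratedFileKind::ViewOneofs,
    ];

    /// Hypothetical: suffix appended to the proto file stem for this kind.
    pub fn file_suffix(self) -> &'static str {
        match self {
            GeneratedFileKind::Owned => ".rs",
            GeneratedFileKind::View => ".__view.rs",
            GeneratedFileKind::Ext => ".__ext.rs",
            GeneratedFileKind::Oneofs => ".__oneofs.rs",
            GeneratedFileKind::ViewOneofs => ".__view_oneofs.rs",
        }
    }

    /// Hypothetical: module wrappers (relative to the package) for this kind.
    pub fn wrapper_path(self) -> &'static [&'static str] {
        match self {
            GeneratedFileKind::Owned => &[],
            GeneratedFileKind::View => &["view"],
            GeneratedFileKind::Ext => &["ext"],
            GeneratedFileKind::Oneofs => &["oneofs"],
            GeneratedFileKind::ViewOneofs => &["view", "oneofs"],
        }
    }
}

/// Every proto yields all five names, even when a kind has no content.
pub fn sibling_file_names(stem: &str) -> Vec<String> {
    GeneratedFileKind::ALL
        .iter()
        .map(|kind| format!("{stem}{}", kind.file_suffix()))
        .collect()
}
```

Always emitting all five names, including empty stubs, is what lets a module-tree stitcher include files by pattern without first checking which kinds a given proto actually uses.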

Deconfliction wins

Patterns that were previously rejected at codegen time (or required OneofNameConflict errors with proto renames) now compile as-is:

  • message DataType { oneof data_type {...} }
  • message Foo { nested message Kind {...} oneof kind {...} }
  • message M { nested message FooView {...} oneof foo {...} }
  • Sibling oneofs my_field and my_field_view
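A sketch of why the first two patterns stop colliding: the struct and the same-named enum end up in different modules. This is a hand-written mock under the layout described in this PR, not generated output; the fields and variants are assumptions.

```rust
pub mod pkg {
    // message DataType { oneof data_type {...} }
    #[derive(Debug, PartialEq)]
    pub struct DataType {
        pub data_type: Option<oneofs::data_type::DataType>,
    }

    // message Foo { message Kind {...} oneof kind {...} }
    #[derive(Debug, PartialEq)]
    pub struct Foo {
        pub kind: Option<oneofs::foo::Kind>,
    }

    /// Proto-derived module: the nested message `Kind` lives here.
    pub mod foo {
        #[derive(Debug, PartialEq)]
        pub struct Kind {
            pub code: i32,
        }
    }

    /// Ancillary tree: oneof enums live here, under the mirrored message path.
    pub mod oneofs {
        pub mod data_type {
            #[derive(Debug, PartialEq)]
            pub enum DataType {
                Text(String),
                Number(i64),
            }
        }
        pub mod foo {
            #[derive(Debug, PartialEq)]
            pub enum Kind {
                // A oneof variant may even carry the nested message of the same name.
                Nested(super::super::foo::Kind),
                Label(String),
            }
        }
    }
}
```

`pkg::foo::Kind` and `pkg::oneofs::foo::Kind` are distinct items, so no suffix or proto rename is needed for the two to coexist.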

DESIGN.md

Adds a "Generated code layout — parallel trees per kind" section covering the formula, the five-sibling invariant, the path-table worked example for my.pkg.Foo, the View-suffix exception with rationale, and the three structural benefits (collision-free / predictable / feature-gatable per kind).

Migration for downstream consumers

// Before
use pkg::{Foo, FooView, FOO, register_types};
use pkg::foo::{KindOneof, InnerMessage};

// After
use pkg::Foo;
use pkg::view::FooView;
use pkg::ext::{FOO, register_types};
use pkg::oneofs::foo::Kind;
use pkg::foo::InnerMessage;

Test plan

  • cargo check --workspace clean
  • task test — 1448 tests pass (18 suites), 0 failed
  • task test-codegen — snapshot tests pass
  • task lint: cargo fmt --check and cargo clippy --workspace --all-targets -- -D warnings both clean
  • task lint-md — markdownlint clean
  • task conformance — 6 CONFORMANCE SUITE PASSED lines, baseline numbers match exactly (std 5549 / no_std 5529 / view 2797 binary+JSON; 883/883/0 text)
  • task stress-googleapis — 235 files compile cleanly (including multi-file google.rpc)
  • Three previously inverted tests/naming.rs collision tests are further updated to assert coexistence; OneofNameConflict test references are removed along with the variant

Diff breakdown by category

Total: 113 files, +6339 / -3562.

| Category | Files | +lines | -lines | Notes |
|---|---|---|---|---|
| Core codegen source (buffa-codegen/src/*.rs, non-test) | 7 | 744 | 320 | Path resolution + module-tree stitching for the full kind matrix, including oneofs and view-oneofs wrappers. Net +424. |
| Codegen tests (snapshot + integration) | 8 | 423 | 261 | Snapshot updates + coexist regression tests for the deconfliction patterns. Net +162. |
| buffa-build API adaptation | 1 | 103 | 42 | Plumbing the five-kind ModuleTreeEntry through the build-script path. Net +61. |
| buffa-test consumer test suite | 14 | 289 | 205 | Mechanical import-path migration across the end-to-end suite. Net +84. |
| buffa-types / buffa-descriptor hand-maintained | 5 | 96 | 37 | lib.rs / mod.rs hand-stitch the pub mod view { pub mod oneofs {} }, pub mod oneofs, pub mod ext wrappers. Net +59. |
| Conformance runner + examples (addressbook, envelope, logging) | 7 | 230 | 33 | Path reference updates + regenerated logging example. Net +197. |
| protoc-gen-buffa-packaging | 1 | 44 | 13 | Emits five ModuleTreeEntrys per proto. Net +31. |
| Googleapis stress tooling (gen_lib_rs.py) | 1 | 74 | 25 | Groups the two new sibling kinds and emits the oneofs / view_oneofs wrappers. Net +49. |
| DESIGN.md | 1 | 45 | 0 | New "Generated code layout" section. Net +45. |
| Subtotal, human-authored | 45 | 2048 | 936 | Net +1112 |
| Regenerated WKT types (buffa-types/src/generated/) | 35 | 3169 | 2588 | 7 protos × 5 siblings, with inlined view/oneof code relocated to sibling files. Net +581 (mostly because the sibling-file scheme adds some boilerplate per kind). |
| Regenerated bootstrap descriptor types (buffa-descriptor/src/generated/) | 12 | 38 | 4 | Regenerated mod.rs + sibling stubs. Net +34. |
| Regenerated examples/logging output | 21 | 1084 | 34 | task gen-logging-example output after the layout move. Net +1050. |
| Subtotal, regenerated | 68 | 4291 | 2626 | Net +1665 |

Human-authored change: +2048 / -936 lines (32% / 26% of the total). The rest is codegen-regenerated output, where the file count expands because every proto now emits five siblings while the per-file content shrinks correspondingly.

Breaking changes

API-breaking for downstream consumers of generated code. Pre-1.0 this is expected; the migration is mechanical (adjust import paths per the table above).

@iainmcgin iainmcgin force-pushed the prototype/codegen-namespacing branch from ae6dd3b to e46e6b9 Compare April 18, 2026 00:15
@iainmcgin iainmcgin force-pushed the prototype/codegen-namespacing branch from e46e6b9 to 8319105 Compare April 22, 2026 20:19
@iainmcgin iainmcgin changed the title Namespace ancillary codegen output into view:: and ext:: sub-modules Namespace ancillary codegen output into parallel kind-trees (view::, oneofs::, ext::) Apr 22, 2026
Moves generated types that previously collided with user-declared proto
names into dedicated sub-modules, structurally deconflicting the codegen
output from proto-derived names.

Layout changes:

- Views: `pkg::FooView` → `pkg::view::FooView` (top-level), and
  `pkg::foo::BarView` → `pkg::view::foo::BarView` (nested). Oneof view
  enums follow the same tree: `pkg::view::foo::KindView`. The `View`
  suffix is retained so that owned and view types can be imported into
  the same scope without renaming.

- Extensions: `pkg::FOO` → `pkg::ext::FOO`, and per-file
  `pkg::register_types` → `pkg::ext::register_types`.

- Oneof enums: `foo::KindOneof` → `foo::Kind`. The suffix is no longer
  needed for disambiguation; the owner message's sub-module provides the
  namespace.

Oneof enums themselves remain in the owner message's sub-module
(`pkg::foo::Kind`). The view-tree move makes several previously-illegal
proto patterns legal — two regression tests in `naming.rs` capture this.

Codegen internals:

- `GeneratedFile` gains `kind: GeneratedFileKind` + `package: String`
  fields. Each generated proto now produces three sibling output files
  (`.rs`, `.__view.rs`, `.__ext.rs`); `generate_module_tree` coalesces
  them per package so multiple `.proto` files that share a Rust module
  (e.g. `google.rpc`) merge their `view::` / `ext::` blocks cleanly.

- `generate_module_tree` signature widens from `&[(&str, &str)]` to
  `&[ModuleTreeEntry<'_>]` to carry the kind.

- `MessageScope::deeper()` is used for depth-aware path resolution in the
  parallel view tree.

Checked-in generated code regenerated (WKTs + bootstrap descriptor types).
Conformance suite, googleapis stress test, and the existing test suite
(1442 tests) all pass.
Post-rebase onto main (which now includes #39): convert the
prelude_shadow include to include_generated! and update the oneof
enum path to oneofs::picker::option::Value (no Oneof suffix under
namespacing).
@iainmcgin iainmcgin force-pushed the prototype/codegen-namespacing branch from 8319105 to e62b8d0 Compare April 23, 2026 17:27
@iainmcgin iainmcgin marked this pull request as ready for review April 23, 2026 17:27
@iainmcgin iainmcgin requested a review from rpb-ant April 23, 2026 17:31
The benchmarks/buffa and benchmarks/gen-datasets crates use hand-written
include!() blocks (not the auto-generated _include.rs), so they were
silently broken under namespacing — generated owned-struct code
references super::oneofs::... which only resolves when the four
.__view/.__oneofs/.__view_oneofs/.__ext files are all included. CI did
not catch this because these crates are outside the workspace.

- Add a local include_generated! macro to both crates (mirrors the
  buffa-test pattern).
- Update view-type and oneof-enum imports in benches/protobuf.rs and
  gen-datasets/main.rs to the namespaced paths.
@iainmcgin iainmcgin force-pushed the prototype/codegen-namespacing branch from 9ae726f to 0482699 Compare April 23, 2026 18:13
@iainmcgin iainmcgin enabled auto-merge (squash) April 23, 2026 18:36
@rpb-ant rpb-ant self-assigned this Apr 23, 2026
@iainmcgin
Collaborator Author

[claude code] Addressed the kind-tree collision feedback: all three ancillary modules now live under a single pub mod __buffa { view / oneofs / ext } wrapper at each package depth, so message View { message Inner } and package foo.view no longer collide (regression tests added for both). The __buffa prefix is validated against message names, enum names, and package segments — CodeGenError::ReservedName rather than rustc E0428.

Kept the existing tree shape under the wrapper (__buffa::view::oneofs:: rather than reorganizing to __buffa::oneof::view::) to minimize the delta. No conditional re-exports for now — paths are pkg::__buffa::view::FooView etc.; happy to revisit ergonomics in a follow-up.
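A minimal hand-written sketch of the wrapped layout for the `message View { message Inner }` case, to show why the proto-derived `view` module and the ancillary view tree no longer compete for one name. It is not generated output: the fields are assumptions, and the names `ViewView` and `InnerView` simply follow the "keep the View suffix" rule.

```rust
pub mod pkg {
    /// Owned message literally named `View`.
    #[derive(Debug, PartialEq)]
    pub struct View {
        pub inner: Option<view::Inner>,
    }

    /// Proto-derived module for types nested in message `View`.
    pub mod view {
        #[derive(Debug, PartialEq)]
        pub struct Inner {
            pub name: String,
        }
    }

    /// The single reserved wrapper; every ancillary tree sits below it.
    pub mod __buffa {
        /// Ancillary view tree, now at `pkg::__buffa::view`, not `pkg::view`.
        pub mod view {
            #[derive(Debug, PartialEq)]
            pub struct ViewView<'a> {
                pub inner: Option<view::InnerView<'a>>,
            }

            /// Mirrors the message path `View`, hence a `view` inside `view`.
            #[allow(clippy::module_inception)]
            pub mod view {
                #[derive(Debug, PartialEq)]
                pub struct InnerView<'a> {
                    pub name: &'a str,
                }
            }
        }
    }
}
```

Without the wrapper, the ancillary `pub mod view` and the proto-derived `pub mod view` would both be declared directly in `pkg`, which is the duplicate-definition error (E0428) the wrapper is meant to rule out.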

Also fixed a latent issue found during this: benchmarks/{buffa,gen-datasets} use hand-written include! blocks (not _include.rs) so they were silently broken — now use the four-file pattern.

One known follow-up: buffa-build::generate_include_file and buffa-codegen::generate_module_tree now hold near-identical __buffa { view { oneofs } / ext / oneofs } stitcher logic — worth extracting a shared helper to prevent drift.

Addresses the collision regression rpb identified: the unguarded
pub mod view {} / pub mod ext {} / pub mod oneofs {} wrappers at every
package depth could collide with proto-derived module names — e.g. a
user message named View with a nested type, or package foo alongside
package foo.view. Both produced E0428 with no detection.

All ancillary kind trees now live under a single pub mod __buffa { ... }
parent at each package level. The __buffa prefix is the only reserved
name codegen claims, and check_module_name_conflicts now rejects any
proto message whose snake_case starts with __buffa (new ReservedName
error variant), so the wrapper can never collide.
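A minimal sketch of the kind of check described above, assuming a simplified snake_case conversion. Only the `__buffa` prefix and the `ReservedName` variant name come from this commit; `to_snake_case`, `check_reserved`, and the error payload are hypothetical and do not reproduce the real `check_module_name_conflicts`.

```rust
/// Stand-in error type; only the variant name is taken from the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum CodeGenError {
    ReservedName(String),
}

const RESERVED_PREFIX: &str = "__buffa";

/// Simplified snake_case: insert `_` before an uppercase letter that follows
/// a lowercase letter or digit, then lowercase it. Real codegen rules may differ.
fn to_snake_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    let mut prev_lower_or_digit = false;
    for c in name.chars() {
        if c.is_ascii_uppercase() {
            if prev_lower_or_digit {
                out.push('_');
            }
            out.push(c.to_ascii_lowercase());
            prev_lower_or_digit = false;
        } else {
            out.push(c);
            prev_lower_or_digit = c.is_ascii_lowercase() || c.is_ascii_digit();
        }
    }
    out
}

/// Reject any message whose module name would start with the reserved prefix.
pub fn check_reserved(message_name: &str) -> Result<(), CodeGenError> {
    if to_snake_case(message_name).starts_with(RESERVED_PREFIX) {
        Err(CodeGenError::ReservedName(message_name.to_string()))
    } else {
        Ok(())
    }
}
```

Failing here yields a codegen-time error that names the offending proto item, instead of a duplicate-definition error from rustc pointing into generated code.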

Path layout (minimal-delta — tree shape under __buffa unchanged):
  pkg::Foo                                   (owned, unchanged)
  pkg::__buffa::view::FooView<'a>
  pkg::__buffa::view::oneofs::foo::Kind<'a>
  pkg::__buffa::oneofs::foo::Kind
  pkg::__buffa::ext::CONST / register_types()

Codegen changes:
- view.rs: initial nesting 1→2; view_oneofs (n-2) supers;
  owned_oneofs from view (n-1) supers; rewrite_to_view_path strips two
  super:: hops same-package, injects __buffa::view:: cross-package.
- message.rs: oneofs_path_prefix emits __buffa::oneofs::; oneof-enum
  nesting bump n+1→n+2.
- lib.rs: file-level extension nesting 1→2; register_types refs climb
  super::super::; emit_register_fn=false now also routes file-level
  extensions into __ext.rs (fixes a latent wrong-nesting path);
  generate_module_tree wraps the three kind mods in pub mod __buffa.
- buffa-build generate_include_file: same wrapping.

Consumer updates: include_generated! macro shape in buffa-test,
conformance, benchmarks; explicit includes in buffa-types; sed-based
::view::/::oneofs::/::ext:: → ::__buffa::… path rewrite across tests,
examples, benches; clippy::module_inception added to allow lists for
the message-View-in-view-tree case.

Regression tests:
- naming.rs: message View { Inner } generates cleanly (rpb repro 1);
  message named __buffa rejected with ReservedName.
- buffa-build: package foo + package foo.view emits one
  pub mod view at foo:: depth (rpb repro 2).
- prelude_shadow.proto: end-to-end message View { Inner } compile.

Regenerated WKT, bootstrap, and logging-example checked-in code.
@iainmcgin iainmcgin force-pushed the prototype/codegen-namespacing branch from ae7e632 to c571068 Compare April 23, 2026 21:33
@rpb-ant
Contributor

rpb-ant commented Apr 23, 2026

The __buffa wrapper closes the collision hole, nice. Three things I think are still worth doing:

  1. register_types is still per-file. Moving it under __buffa::ext:: doesn't change that 7 WKT protos would emit 7 fns into one module; the emit_register_fn=false hack in gen_wkt_types.rs is still load-bearing. If whichever stitcher emits the __buffa wrapper also emitted one merged register_types per package, that goes away.

  2. The pub mod __buffa { ... } shape gets rendered in three places (generate_module_tree, buffa-build's generate_include_file, the packaging plugin). Probably want one of those canonical and the others delegating, otherwise adding a kind means touching all three plus the hand-rolled copies in conformance/buffa-test/buffa-types/benchmarks.

  3. Related to 2: a buffa::include_proto!("pkg") macro would let the hand-rolled consumers stop copy-pasting the wrapper. #62 ("codegen: namespace ancillary types under buffa_:: sentinel + per-package stitcher") has codegen emit a <pkg>.mod.rs per package and the macro just includes that, pretty cheap.

(I'd thought the file-level enum reservation was a gap too but check_siblings already covers it, my bad.)

Also we should just pick __buffa vs buffa_ and oneofs vs oneof and converge, the two PRs are basically a rename apart at this point.

iainmcgin added a commit that referenced this pull request Apr 23, 2026
…age stitcher (#62)

## Summary

Ancillary types emitted by `buffa-codegen` — view structs, oneof enums,
oneof view enums, extension consts, and `register_types` — currently
share the package-level Rust namespace with user proto types.
Suffix-based disambiguation (`FooView`, `KindOneof`) handles most cases
but still rejects pathological inputs (`message FooView` next to
`message Foo`; nested type literally named `{X}Oneof` next to `oneof
{x}`) with a codegen error.

This PR moves all ancillary kinds under a single reserved sentinel
module `__buffa::` per package, making collisions structurally
impossible without name mangling, and has codegen emit a per-package
`<pkg>.mod.rs` stitcher so consumers integrate via one
`buffa::include_proto!("pkg")` call instead of hand-authoring the module
wrapper.

Supersedes #54 and #37. Closes #32.

## Layout

| Item | Before | After |
|---|---|---|
| Owned message | `pkg::Foo` | `pkg::Foo` (unchanged) |
| View struct | `pkg::FooView` | `pkg::__buffa::view::FooView` |
| Nested view | `pkg::foo::BarView` | `pkg::__buffa::view::foo::BarView` |
| Oneof enum | `pkg::foo::KindOneof` | `pkg::__buffa::oneof::foo::Kind` |
| View-of-oneof | `pkg::foo::KindOneofView` | `pkg::__buffa::view::oneof::foo::Kind` |
| Extension const | `pkg::FOO` | `pkg::__buffa::ext::FOO` |
| Registration fn | `pkg::register_types` (per file) | `pkg::__buffa::register_types` (per package) |

The sentinel `__buffa` is the **only** name codegen reserves in user
namespace. It aligns with the existing `__buffa_` reserved-field-name
prefix (`__buffa_cached_size`, `__buffa_unknown_fields`), so the rule is
uniformly "anything starting `__buffa` is reserved." `validate_file`
errors with `CodeGenError::ReservedModuleName` if any proto package
segment, message, or file-level enum would produce a `__buffa`-named
module/type. Adding future ancillary kinds nests them under the existing
sentinel; no new reserved words.

**No convenience re-exports.** Short-path aliases (`pub use
__buffa::view::*`) are intentionally not emitted — they would work in
the common case but silently change meaning when a user proto shadows
them. The canonical `__buffa::` path is unconditional.

## Per-package `.mod.rs` stitcher

Codegen now emits, in addition to the five per-proto content files
(`<stem>.rs`, `<stem>.__view.rs`, `<stem>.__oneof.rs`,
`<stem>.__view_oneof.rs`, `<stem>.__ext.rs`), one `<dotted.pkg>.mod.rs`
per package containing the `pub mod __buffa { … }` wrapper and a single
merged `register_types` body. Consumers reference only the stitcher:

```rust
pub mod my_pkg {
    buffa::include_proto!("my.pkg");  // → include!(OUT_DIR/my.pkg.mod.rs)
}
```

Per-proto content files remain 1:1 with input protos (split-invocation
safe for build systems that compile package subsets independently). The
stitcher is the only per-package artifact and has the same caveat as any
generated include file.

This collapses the wrapper-authoring previously duplicated across
`buffa-build`, `buffa-types`, `buffa-test`, `conformance`, examples,
benchmarks, `gen_lib_rs.py`, and `protoc-gen-buffa-packaging` to one
place in codegen (the canonical `ALLOW_LINTS` list lives there too).
`buffa-build::generate_include_file` now delegates to
`buffa_codegen::generate_module_tree`.

## `register_types` per-package

Because the stitcher is per-package, `register_types` is naturally one
fn per package. The `emit_register_fn=false` workaround in
`gen_wkt_types.rs` (seven WKT files × one fn each colliding in
`google.protobuf`) is no longer needed.
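A rough sketch of the merging idea, assuming each proto file contributes a list of type names to register: grouping by package gives one registration body per package instead of one fn per file. `FileRegistrations` and `merge_by_package` are hypothetical names for illustration, not buffa-codegen APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-file input: the package and the types that file registers.
pub struct FileRegistrations<'a> {
    pub package: &'a str,
    pub type_names: Vec<&'a str>,
}

/// Merge per-file registrations into one list per package.
/// A BTreeMap keeps the package order deterministic across runs.
pub fn merge_by_package<'a>(
    files: &[FileRegistrations<'a>],
) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut merged: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for file in files {
        merged
            .entry(file.package)
            .or_default()
            .extend(file.type_names.iter().copied());
    }
    merged
}
```

With several files sharing `google.protobuf`, the map has a single entry for that package, so only one `register_types` body needs to be rendered for it.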

## Deleted

- `CodeGenError::OneofNameConflict` and `::ViewNameConflict` —
collisions are structurally impossible.
- `proto_path_to_rust_module` is `#[deprecated]` (consumers should use
the stitcher path); follow-up will remove it.

## Test plan

- [x] `task lint` — fmt + clippy clean (verified against rustc 1.95)
- [x] `task lint-md` — markdownlint clean
- [x] `task test` — 1462 tests pass
- [x] `task verify` — 6× `CONFORMANCE SUITE PASSED`, baseline numbers
exact
- [x] `task stress-googleapis` — 47 protos compile clean
- [x] examples (addressbook/envelope/logging), benchmarks — compile
clean
- [x] regression tests for `message View { message Inner {} }` (codegen
unit + `prelude_shadow.proto` end-to-end) and `package foo.view;`
- [x] reservation tests for `package foo.__buffa;`, `message __Buffa
{}`, and `enum __buffa {}`

## Breaking changes

API-breaking for downstream consumers of generated code (import paths
for views, oneofs, extensions, `register_types`). Migration is
mechanical per the layout table above. Pre-1.0.

---------

Co-authored-by: Iain McGinniss <309153+iainmcgin@users.noreply.github.com>
@iainmcgin
Collaborator Author

[claude code] Superseded by #62 — same goal, cleaner architecture (per-package .mod.rs stitcher + include_proto! macro, single canonical ALLOW_LINTS, merged per-package register_types). The reserved-name validator and collision regression tests from this branch were ported into #62.

@iainmcgin iainmcgin closed this Apr 23, 2026
auto-merge was automatically disabled April 23, 2026 22:39

Pull request was closed

@github-actions github-actions Bot locked and limited conversation to collaborators Apr 23, 2026
2 participants