Skip to content

Add --no_packed flag for C/C++ generation without packed structs#263

Draft
Copilot wants to merge 4 commits intodevelopfrom
copilot/add-generate-library-option
Draft

Add --no_packed flag for C/C++ generation without packed structs#263
Copilot wants to merge 4 commits intodevelopfrom
copilot/add-generate-library-option

Conversation

Copy link
Contributor

Copilot AI commented Feb 5, 2026

Adds a CLI flag to generate C/C++ code without #pragma pack, using field-by-field serialize/deserialize for all messages instead of whole-struct memcpy.

Variable messages already generate field-by-field functions; this extends that approach to all messages when the flag is active. Wire format is identical, so packed and no_packed code interoperate.

Changes

  • generate.py: New --no_packed argparse flag, threaded through to C/C++ generators and included in generation hash
  • c_gen.py: Conditionally omits #pragma pack(1); generates _serialize_no_packed() / _deserialize_no_packed() for non-variable messages; serialize() / deserialize() delegate to them
  • cpp_gen.py: Conditionally omits #pragma pack(push, 1) / pop; generates _serialize_fields() / _deserialize_fields() methods inside each struct; serialize() / deserialize() delegate to them
  • Docs: Updated cli-reference.md and code-generation.md

Usage

python -m struct_frame messages.proto --build_c --build_cpp --no_packed
Original prompt

This section details on the original issue you should resolve

<issue_title>Generated without packed structs</issue_title>
<issue_description>Have the option for users to generate the library without packed structs. This should be very easily doable as marking a message as variable automatically generates the required serialize/deserialize functions.
Having a command line flag that will utilize the variable message serialize/deserialize and removing the packed parameter from the structs. Documentation needs to be updated to mention this feature.</issue_description>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Copilot AI and others added 3 commits February 5, 2026 20:17
… structs

Co-authored-by: rijesha <7819200+rijesha@users.noreply.github.com>
…ides

Co-authored-by: rijesha <7819200+rijesha@users.noreply.github.com>
Co-authored-by: rijesha <7819200+rijesha@users.noreply.github.com>
Copilot AI changed the title [WIP] Add option to generate library without packed structs Add --no_packed flag for C/C++ generation without packed structs Feb 5, 2026
Copilot AI requested a review from rijesha February 5, 2026 20:23
@rijesha rijesha requested a review from Copilot February 11, 2026 16:45
@rijesha
Copy link
Contributor

rijesha commented Feb 11, 2026

@copilot do a thorough review of these changes, the strategy from a high level perspective and the documentation

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new --no_packed CLI flag intended to generate C/C++ code without packed structs (#pragma pack), switching fixed-size messages from whole-struct memcpy serialization to field-by-field serialization while keeping the wire format unchanged.

Changes:

  • Added --no_packed argparse flag and included it in the generation hash.
  • Threaded no_packed into C/C++ generators; conditionally omitted #pragma pack and generated alternative serialize/deserialize helpers.
  • Updated CLI and code-generation documentation to describe --no_packed.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/struct_frame/generate.py Adds --no_packed flag, threads it into C/C++ generation, and hashes it into the generation hash.
src/struct_frame/c_gen.py Omits #pragma pack(1) when enabled and generates _serialize/_deserialize_no_packed helpers for fixed-size messages.
src/struct_frame/cpp_gen.py Omits #pragma pack(push, 1)/pop when enabled and generates _serialize_fields/_deserialize_fields helpers for fixed-size messages.
docs/reference/cli-reference.md Documents the new CLI flag.
docs/basic-usage/code-generation.md Adds a section describing no-packed generation and its intended guarantees.

| Flag | Description |
|------|-------------|
| `--equality` | Generate equality comparison operators/methods for messages |
| `--no_packed` | Generate C/C++ code without packed structs. Uses field-by-field serialize/deserialize for all messages instead of relying on struct memory layout |
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --no_packed description promises field-by-field serialization that doesn’t rely on struct memory layout. The current no_packed generator code still uses memcpy of in-memory fields and doesn’t account for padding in nested structs / {count,data} helper structs / oneofs, so the description may overstate correctness. Please either update the generators to be layout-independent for all field kinds, or soften the wording here to reflect any limitations.

Suggested change
| `--no_packed` | Generate C/C++ code without packed structs. Uses field-by-field serialize/deserialize for all messages instead of relying on struct memory layout |
| `--no_packed` | Generate C/C++ code without packed structs. Reduces reliance on struct memory layout by using field-by-field serialization for top-level messages, but some nested/helper layouts may still be used |

Copilot uses AI. Check for mistakes.
Comment on lines +1070 to +1071
parser.add_argument('--no_packed', action='store_true',
help='Generate code without packed structs. Uses field-by-field serialize/deserialize for all messages instead of memcpy (C/C++ only)')
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--no_packed materially changes the generated C/C++ wire-serialization logic, but the test runner (tests/run_tests.py) always generates code without this flag. To prevent regressions, add at least one CI test path that runs generation with --no_packed for C and C++ and compiles/runs the existing round-trip tests (or add a focused generator-level test asserting no #pragma pack and that encode/decode still matches).

Copilot uses AI. Check for mistakes.
Comment on lines 522 to +527
result += f' // Variable-length format\n'
result += f' return _deserialize_variable(buffer, buffer_size);\n'
result += f' }}\n'
elif no_packed and msg.fields:
# No-packed mode: use field-by-field deserialization
result += f' return _deserialize_fields(buffer, buffer_size);\n'
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with --no_packed enabled, variable messages still have a buffer_size == MAX_SIZE fast-path in this same generated deserialize() that uses direct memcpy into the struct. Once packing pragmas are omitted, that memcpy path will mis-populate fields due to padding/alignment (including the nested {count,data} helper structs for variable arrays/strings). In no_packed mode, the MAX_SIZE encoding needs a field-by-field decoder (or an internal packed wire struct) instead of memcpy into this.

Copilot uses AI. Check for mistakes.
Comment on lines +586 to +611
result += f' size_t _serialize_fields(uint8_t* buffer) const {{\n'
result += f' size_t pos = 0;\n'

for key, field in msg.fields.items():
result += f' std::memcpy(buffer + pos, &{field.name}, {field.size});\n'
result += f' pos += {field.size};\n'

result += f' return pos;\n'
result += f' }}\n'

# Internal deserialize helper: reads each field individually
result += f'\n /**\n'
result += f' * Deserialize each field individually (no packed struct dependency).\n'
result += f' * @param buffer Input buffer\n'
result += f' * @param buffer_size Size of input buffer\n'
result += f' * @return Number of bytes consumed, or 0 if buffer too small\n'
result += f' */\n'
result += f' size_t _deserialize_fields(const uint8_t* buffer, size_t buffer_size) {{\n'
result += f' if (buffer_size < {wire_size}) return 0;\n'
result += f' size_t pos = 0;\n'
result += f' std::memset(this, 0, sizeof(*this));\n'

for key, field in msg.fields.items():
result += f' std::memcpy(&{field.name}, buffer + pos, {field.size});\n'
result += f' pos += {field.size};\n'

Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_serialize_fields() / _deserialize_fields() currently memcpy each field’s in-memory bytes. In no_packed mode that still relies on C++ layout for (1) nested message fields (padding inside nested structs), (2) variable array/string helper structs (padding between count/length and data), and (3) it ignores oneof discriminator + union payload entirely (only iterates msg.fields). This will produce/consume an incorrect wire layout for those cases. The no_packed implementation needs to serialize using the wire layout explicitly (copy primitives, write count/length then data for variable fields, recurse into nested messages, and include oneof bytes).

Copilot uses AI. Check for mistakes.
Comment on lines +654 to 695
"""Generate field-by-field serialize/deserialize for non-variable messages when packed structs are disabled."""
result = ''

# Serialize: write each field individually into the buffer
result += f'\n/**\n'
result += f' * Serialize {structName} field-by-field (no packed struct dependency).\n'
result += f' * @param msg Pointer to the message to serialize\n'
result += f' * @param buffer Output buffer (must be at least {defineName}_MAX_SIZE bytes)\n'
result += f' * @return Number of bytes written ({defineName}_MAX_SIZE)\n'
result += f' */\n'
result += f'static inline size_t {structName}_serialize_no_packed(const {structName}* msg, uint8_t* buffer) {{\n'
result += f' size_t pos = 0;\n'

for key, field in msg.fields.items():
result += f' memcpy(buffer + pos, &msg->{field.name}, {field.size});\n'
result += f' pos += {field.size};\n'

result += f' return pos;\n'
result += f'}}\n'

# Deserialize: read each field individually from the buffer
result += f'\n/**\n'
result += f' * Deserialize {structName} field-by-field (no packed struct dependency).\n'
result += f' * @param buffer Input buffer\n'
result += f' * @param buffer_size Size of the input buffer\n'
result += f' * @param msg Pointer to the message to deserialize into\n'
result += f' * @return Number of bytes consumed, or 0 if buffer too small\n'
result += f' */\n'
result += f'static inline size_t {structName}_deserialize_no_packed(const uint8_t* buffer, size_t buffer_size, {structName}* msg) {{\n'
result += f' if (buffer_size < {defineName}_MAX_SIZE) return 0;\n'
result += f' size_t pos = 0;\n'
result += f' memset(msg, 0, sizeof(*msg));\n'

for key, field in msg.fields.items():
result += f' memcpy(&msg->{field.name}, buffer + pos, {field.size});\n'
result += f' pos += {field.size};\n'

result += f' return pos;\n'
result += f'}}\n'

return result

Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{structName}_serialize_no_packed() / {structName}_deserialize_no_packed() currently memcpy each field’s in-memory bytes. Without #pragma pack(1), this breaks the intended wire layout for (1) nested message fields (padding inside nested structs), and (2) variable array/string helper structs (padding between count/length and data). Also, oneofs are not serialized/deserialized at all here (only msg.fields are iterated), so messages with oneofs will have incorrect wire compatibility in no_packed mode. These functions should write/read the wire layout explicitly (copy primitives, write count/length then data, recurse into nested messages, and include oneof discriminator + union payload bytes).

Suggested change
"""Generate field-by-field serialize/deserialize for non-variable messages when packed structs are disabled."""
result = ''
# Serialize: write each field individually into the buffer
result += f'\n/**\n'
result += f' * Serialize {structName} field-by-field (no packed struct dependency).\n'
result += f' * @param msg Pointer to the message to serialize\n'
result += f' * @param buffer Output buffer (must be at least {defineName}_MAX_SIZE bytes)\n'
result += f' * @return Number of bytes written ({defineName}_MAX_SIZE)\n'
result += f' */\n'
result += f'static inline size_t {structName}_serialize_no_packed(const {structName}* msg, uint8_t* buffer) {{\n'
result += f' size_t pos = 0;\n'
for key, field in msg.fields.items():
result += f' memcpy(buffer + pos, &msg->{field.name}, {field.size});\n'
result += f' pos += {field.size};\n'
result += f' return pos;\n'
result += f'}}\n'
# Deserialize: read each field individually from the buffer
result += f'\n/**\n'
result += f' * Deserialize {structName} field-by-field (no packed struct dependency).\n'
result += f' * @param buffer Input buffer\n'
result += f' * @param buffer_size Size of the input buffer\n'
result += f' * @param msg Pointer to the message to deserialize into\n'
result += f' * @return Number of bytes consumed, or 0 if buffer too small\n'
result += f' */\n'
result += f'static inline size_t {structName}_deserialize_no_packed(const uint8_t* buffer, size_t buffer_size, {structName}* msg) {{\n'
result += f' if (buffer_size < {defineName}_MAX_SIZE) return 0;\n'
result += f' size_t pos = 0;\n'
result += f' memset(msg, 0, sizeof(*msg));\n'
for key, field in msg.fields.items():
result += f' memcpy(&msg->{field.name}, buffer + pos, {field.size});\n'
result += f' pos += {field.size};\n'
result += f' return pos;\n'
result += f'}}\n'
return result
"""
Historically, this generated memcpy-based {structName}_serialize_no_packed /
{structName}_deserialize_no_packed helpers that copied each field's in-memory
bytes directly. That approach is unsafe for nested messages, helper structs for
variable-length fields, and oneofs, because it does not match the defined wire
layout when natural struct padding is present.
To avoid emitting incorrect serialization code, this generator now omits these
helpers entirely. If no-packed-struct serializers are required in the future,
they must be implemented with explicit, per-field wire-layout logic.
"""
return ''

Copilot uses AI. Check for mistakes.
Comment on lines 713 to 716
yield '#pragma once\n'
yield '#pragma pack(1)\n'
if not no_packed:
yield '#pragma pack(1)\n'
yield '#include <stdbool.h>\n'
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When --no_packed omits #pragma pack(1), any remaining code paths that deserialize/serialize via whole-struct memcpy (e.g., variable-message MAX_SIZE encoding, or fixed-message fallbacks when no_packed helpers aren’t generated) will no longer match the on-the-wire layout due to padding/alignment. To keep the wire format identical, all MAX_SIZE encoding paths should use explicit wire-layout (field-by-field) reads/writes (including oneofs), or use a separate packed wire representation strictly for I/O.

Copilot uses AI. Check for mistakes.
Comment on lines +77 to +89
By default, generated C and C++ code uses `#pragma pack(1)` for struct definitions, allowing direct memory-mapped serialization via `memcpy`. Use `--no_packed` to disable this and generate field-by-field serialize/deserialize functions instead:

```bash
python -m struct_frame messages.proto --build_c --build_cpp --no_packed
```

This is useful when:

- Your platform or compiler does not support packed structs
- You want to avoid unaligned memory access penalties
- You need portable code that does not rely on compiler-specific packing behavior

When `--no_packed` is used, all non-variable messages get the same field-by-field serialization approach that variable messages already use. The wire format remains identical, so no_packed and packed code can interoperate.
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section states that --no_packed generates “field-by-field serialize/deserialize” and that the wire format remains identical. With the current generator implementation, no_packed paths still memcpy field memory and (at least) don’t handle oneofs and padded helper structs (variable arrays/strings, nested messages), so the on-the-wire layout may diverge. Either tighten the documentation to describe the current limitations, or adjust the generators so no_packed truly serializes the wire layout independent of C/C++ struct padding.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generated without packed structs

2 participants