This repository was archived by the owner on Jan 21, 2026. It is now read-only.

[Refactor] Provide a general zero-copy serial util#140

Merged
0oshowero0 merged 3 commits into dev from han/refactor_serial on Dec 17, 2025

Conversation

@0oshowero0 0oshowero0 commented Dec 17, 2025

Summary by CodeRabbit

  • New Features

    • Public serialization and deserialization API functions now available for data handling operations.
    • Zero-copy serialization mode enabled to improve performance when supported by system configuration.
    • Full support for serializing nested tensor data structures.
  • Refactor

    • Consolidated internal serialization logic to improve code maintainability and consistency across modules.

CC: @zhaohaidao

Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
Copilot AI review requested due to automatic review settings December 17, 2025 07:04
coderabbitai bot commented Dec 17, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This pull request consolidates zero-copy serialization logic from zmq_utils.py into a centralized, reusable module in serial_utils.py. The changes introduce a public serialization API with optional nested tensor handling via a feature flag, and refactor zmq_utils.py to delegate to these shared utilities, eliminating code duplication.
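
To make the two-mode idea concrete, here is a minimal sketch of a serialize/deserialize pair that either packs everything into one buffer or keeps large buffers out-of-band. It uses the standard library's `pickle` protocol 5 as a stand-in technique; it is not the PR's implementation, which according to the summary is built on MsgpackEncoder/MsgpackDecoder and RPC pickler utilities. The names `serialize_sketch`, `deserialize_sketch`, and the `zero_copy` parameter are hypothetical.

```python
import pickle
from typing import Any, Union

Frame = Union[bytes, bytearray, memoryview]


def serialize_sketch(obj: Any, zero_copy: bool) -> list:
    """Return a list of frames: [header] or [header, buffer_0, buffer_1, ...]."""
    if not zero_copy:
        # Copy mode: buffers are packed in-band into a single bytes object.
        return [pickle.dumps(obj, protocol=5)]
    buffers: list = []
    # list.append returns None, which tells pickle to keep each
    # PickleBuffer out-of-band instead of copying it into the header.
    header = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
    # raw() exposes each buffer as a flat memoryview over the original memory.
    return [header, *(buf.raw() for buf in buffers)]


def deserialize_sketch(frames: Union[Frame, list]) -> Any:
    """Accept a single frame or a list of frames, covering both modes."""
    if isinstance(frames, (bytes, bytearray, memoryview)):
        frames = [frames]
    header, *buffers = frames
    return pickle.loads(header, buffers=buffers)


payload = bytearray(b"abcd")
message = {"step": 3, "data": pickle.PickleBuffer(payload)}

frames = serialize_sketch(message, zero_copy=True)
restored = deserialize_sketch(frames)
```

In the zero-copy branch the second frame is a view over `payload` rather than a copy, so a transport that can send several frames per message can forward it without an extra allocation.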

Changes

| Cohort / File(s) | Summary |
|---|---|
| Serialization API and Zero-Copy Machinery: transfer_queue/utils/serial_utils.py | Introduces public serialization() and deserialization() functions; adds TQ_ZERO_COPY_SERIALIZATION feature flag; implements MsgpackEncoder and MsgpackDecoder for tensor/numpy handling; adds internal tensor processing (_process_tensor) and module-level logger; imports RPC pickler utilities for zero-copy path. |
| Refactored ZMQ Serialization: transfer_queue/utils/zmq_utils.py | Imports serialization and deserialization from serial_utils; removes local zero-copy machinery (MsgpackEncoder, MsgpackDecoder, _internal_rpc_pickler, HAS_RPC_PICKLER, TQ_ZERO_COPY_SERIALIZATION); updates ZMQMessage.serialize() and ZMQMessage.deserialize() to delegate to shared utilities. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Key areas requiring attention:
    • Tensor extraction and encoding logic in _process_tensor and MsgpackEncoder/MsgpackDecoder implementations
    • Feature flag (TQ_ZERO_COPY_SERIALIZATION) initialization and conditional behavior paths
    • Error handling and input validation in deserialization, especially for nested tensor reconstruction
    • Verification that delegated serialization/deserialization in zmq_utils.py maintains behavioral equivalence with the original implementation
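
The extraction and reconstruction concerns listed above can be illustrated with a small sketch: walk a nested structure, swap each raw buffer for a placeholder index, and validate those indices when rebuilding. Plain `bytes` stand in for tensors and `json` stands in for msgpack here; `extract_buffers`, `restore_buffers`, and the `__buf__` placeholder key are hypothetical and do not reflect what _process_tensor actually does.

```python
import json
from typing import Any


def extract_buffers(obj: Any, buffers: list) -> Any:
    """Replace every bytes-like leaf with a placeholder and collect the buffer."""
    if isinstance(obj, (bytes, bytearray, memoryview)):
        buffers.append(obj)  # stored by reference, no copy
        return {"__buf__": len(buffers) - 1}
    if isinstance(obj, dict):
        return {key: extract_buffers(value, buffers) for key, value in obj.items()}
    if isinstance(obj, list):
        return [extract_buffers(item, buffers) for item in obj]
    return obj


def restore_buffers(obj: Any, buffers: list) -> Any:
    """Inverse of extract_buffers, with validation of placeholder indices."""
    if isinstance(obj, dict):
        # Caveat of this sketch: a user dict whose only key is "__buf__"
        # would be mistaken for a placeholder.
        if set(obj) == {"__buf__"}:
            index = obj["__buf__"]
            if not isinstance(index, int) or not 0 <= index < len(buffers):
                raise ValueError(f"invalid buffer index: {index!r}")
            return buffers[index]
        return {key: restore_buffers(value, buffers) for key, value in obj.items()}
    if isinstance(obj, list):
        return [restore_buffers(item, buffers) for item in obj]
    return obj


nested = {"batch": [b"\x01\x02", {"inner": b"\x03"}], "step": 7}
buffers: list = []
header = json.dumps(extract_buffers(nested, buffers)).encode()
rebuilt = restore_buffers(json.loads(header), buffers)
```

Rejecting out-of-range indices during reconstruction is one way to turn a malformed message into a clear error instead of a confusing failure further downstream.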

Poem

🐰 Hops through the code with glee,
Zero-copy logic now set free,
Tensors bundled, bytes aligned,
Shared utilities—oh, how refined!
Less duplication, more to share,
A cleaner path through the data air!



@0oshowero0 (Member, Author) commented

@coderabbitai review

coderabbitai bot commented Dec 17, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copilot AI left a comment

Pull request overview

This PR refactors serialization and deserialization logic from zmq_utils.py into a general-purpose utility in serial_utils.py, making these functions reusable across the codebase for any object type (not just ZMQMessage).

Key Changes:

  • Extracted zero-copy serialization logic into standalone serialization() and deserialization() functions in serial_utils.py
  • Moved related imports and constants (TQ_ZERO_COPY_SERIALIZATION, encoder/decoder instances) to serial_utils.py
  • Simplified ZMQMessage.serialize() and ZMQMessage.deserialize() to use the new general utilities
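
The delegation pattern described in the last bullet can be sketched as follows. The two module-level functions are simple `pickle`-based stand-ins for the shared utilities, and `MessageSketch` with its fields is a hypothetical message type; the real ZMQMessage and the real serial_utils.py functions may look quite different.

```python
import pickle
from dataclasses import asdict, dataclass, field
from typing import Any


def serialization(obj: Any) -> list:
    """Stand-in for the shared utility (assumed to return a list of frames)."""
    return [pickle.dumps(obj, protocol=5)]


def deserialization(frames: list) -> Any:
    """Stand-in for the shared utility's inverse."""
    return pickle.loads(frames[0])


@dataclass
class MessageSketch:
    """Hypothetical message type; fields are illustrative only."""

    request_type: str
    body: dict = field(default_factory=dict)

    def serialize(self) -> list:
        # The message no longer owns any encoding logic; it only delegates.
        return serialization(asdict(self))

    @classmethod
    def deserialize(cls, frames: list) -> "MessageSketch":
        fields = deserialization(frames)
        if not isinstance(fields, dict):
            raise TypeError("expected a dict of message fields")
        return cls(**fields)


msg = MessageSketch("PUT", {"key": "value"})
roundtrip = MessageSketch.deserialize(msg.serialize())
```

Keeping the message class this thin means any other object type can reuse the same two functions without going through a message wrapper.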

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| transfer_queue/utils/zmq_utils.py | Removed serialization/deserialization implementation and imports; now delegates to serial_utils.py functions |
| transfer_queue/utils/serial_utils.py | Added general-purpose serialization() and deserialization() functions with zero-copy optimization support |


def deserialization(obj: list[bytestr] | bytestr) -> TensorDict:
    """Deserialize an object from serialized data."""
Copilot AI Dec 17, 2025

The documentation format is inconsistent between serialization and deserialization functions. The serialization function has a detailed docstring explaining the return format for both modes, but the deserialization function only has a brief one-line docstring. Consider adding similar documentation explaining the expected input format and return value for consistency.
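
One possible shape for such a docstring is sketched below. The wording is an assumption based only on the signature quoted above (a single buffer or a list of buffers in, a deserialized object out); the function name `deserialization_sketch`, the `Frame` alias, and the stub body are hypothetical, and the exact per-mode formats should be taken from the real serialization() docstring.

```python
from typing import Any, Union

Frame = Union[bytes, bytearray, memoryview]


def deserialization_sketch(obj: Union[Frame, list]) -> Any:
    """Deserialize an object from serialized data.

    Args:
        obj: Serialized data, either a single bytes-like object or a list
            of bytes-like objects (for example a header followed by
            out-of-band buffers). The accepted form should mirror what the
            matching serialization call returned.

    Returns:
        The reconstructed object.

    Raises:
        NotImplementedError: Always, because this is a documentation-only
            stub and performs no decoding.
    """
    raise NotImplementedError("documentation sketch only")
```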

Signed-off-by: 0oshowero0 <o0shower0o@outlook.com>
@0oshowero0 0oshowero0 changed the title from "[Refactor] Provide a general serial util" to "[Refactor] Provide a general zero-copy serial util" on Dec 17, 2025
@0oshowero0 0oshowero0 merged commit 57508a9 into dev Dec 17, 2025
3 checks passed
@0oshowero0 0oshowero0 deleted the han/refactor_serial branch December 17, 2025 09:33