Skip to content

[wip] experiment with reversing alignment and size validation order#3133

Open
jswrenn wants to merge 1 commit intoGcf66958135e905f7cd7d2fac87d9f881e5e5185ffrom
G74bf0db1b7c30669171b5985ce24849a264746e7
Open

[wip] experiment with reversing alignment and size validation order#3133
jswrenn wants to merge 1 commit intoGcf66958135e905f7cd7d2fac87d9f881e5e5185ffrom
G74bf0db1b7c30669171b5985ce24849a264746e7

Conversation

@jswrenn
Copy link
Collaborator

@jswrenn jswrenn commented Mar 17, 2026


Latest Update: v6 — Compare vs v5

📚 Full Patch History

Links show the diff between the row version and the column version.

Version v5 v4 v3 v2 v1 Base
v6 vs v5 vs v4 vs v3 vs v2 vs v1 vs Base
v5 vs v4 vs v3 vs v2 vs v1 vs Base
v4 vs v3 vs v2 vs v1 vs Base
v3 vs v2 vs v1 vs Base
v2 vs v1 vs Base
v1 vs Base
⬇️ Download this PR

Branch

git fetch origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7 && git checkout -b pr-G74bf0db1b7c30669171b5985ce24849a264746e7 FETCH_HEAD

Checkout

git fetch origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7 && git checkout FETCH_HEAD

Cherry Pick

git fetch origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7 && git cherry-pick FETCH_HEAD

Pull

git pull origin refs/heads/G74bf0db1b7c30669171b5985ce24849a264746e7

Stacked PRs enabled by GHerrit.

Comment on lines +2 to +4
Instructions: 2200
Total Cycles: 783
Total uOps: 2500
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

Comment on lines +2 to +4
Instructions: 2100
Total Cycles: 673
Total uOps: 2200
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on optimizing the ref_from_bytes functions by reordering validation steps and tweaking assembly implementations. The goal is to enhance performance, particularly in scenarios involving dynamic padding and size calculations. The changes are primarily experimental and include updates to benchmark files to reflect performance analysis.

Highlights

  • Reordering Validation: This pull request experiments with reversing the order of alignment and size validation in the ref_from_bytes functions to potentially improve performance.
  • Assembly Optimization: Assembly code in several benchmark files was modified to optimize the ref_from_bytes functions, focusing on dynamic padding and size calculations.
  • Performance Benchmarking: The changes include updates to .mca files, reflecting performance measurements and resource pressure analysis after the assembly code modifications.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Iterations: 100
Instructions: 2000
Total Cycles: 682
Total Cycles: 683
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extremely marginal regression

Comment on lines +2 to +4
Instructions: 1300
Total Cycles: 405
Total uOps: 1300
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression?

Total Cycles: 338
Total uOps: 1100
Instructions: 1000
Total Cycles: 404
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression?

Comment on lines +2 to +4
Instructions: 2000
Total Cycles: 604
Total uOps: 2000
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

Comment on lines +2 to +4
Instructions: 2500
Total Cycles: 1008
Total uOps: 2800
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

Instructions: 2000
Total Cycles: 639
Total uOps: 2100
Total Cycles: 641
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marginal regression?

Comment on lines +2 to +4
Instructions: 2800
Total Cycles: 910
Total uOps: 3100
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

Comment on lines +2 to +4
Instructions: 1300
Total Cycles: 407
Total uOps: 1400
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

Iterations: 100
Instructions: 2300
Total Cycles: 791
Total Cycles: 797
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marginal regression?

Comment on lines +2 to +4
Instructions: 2500
Total Cycles: 1105
Total uOps: 2700
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regression

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request experiments with reordering the alignment and size validation checks in validate_cast_and_convert_metadata. While the logical correctness is maintained, the performance impact is mixed. Some benchmarks show significant improvements (e.g., ref_from_bytes_dynamic_size), while others show significant regressions (e.g., ref_from_suffix_with_elems_dynamic_padding). The original order performs the cheaper alignment check before the more expensive size validation, which is a good heuristic for failing fast. The new order might allow for better code generation in some cases by grouping checks differently, but it seems to hurt performance in others. Given the mixed results, it would be beneficial to understand the trade-offs better. Perhaps there's a way to get the benefits without the regressions.

Comment on lines +706 to +734
// Alignment checks go in their own block to avoid introducing variables
// into the top-level scope.
{
// We check alignment for `addr` (for prefix casts) or `addr +
// bytes_len` (for suffix casts). For a prefix cast, the correctness
// of this check is trivial - `addr` is the address the object will
// live at.
//
// For a suffix cast, we know that all valid sizes for the type are
// a multiple of the alignment (and by safety precondition, we know
// `DstLayout` may only describe valid Rust types). Thus, a
// validly-sized instance which lives at a validly-aligned address
// must also end at a validly-aligned address. Thus, if the end
// address for a suffix cast (`addr + bytes_len`) is not aligned,
// then no valid start address will be aligned either.
let offset = match cast_type {
CastType::Prefix => 0,
CastType::Suffix => bytes_len,
};

// Addition is guaranteed not to overflow because `offset <=
// bytes_len`, and `addr + bytes_len <= usize::MAX` is a
// precondition of this method. Modulus is guaranteed not to divide
// by 0 because `align` is non-zero.
#[allow(clippy::arithmetic_side_effects)]
if (addr + offset) % self.align.get() != 0 {
return Err(MetadataCastError::Alignment);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Moving the alignment check after the size validation has mixed performance results across the benchmarks. While some cases like ref_from_bytes_dynamic_size improve, others like ref_from_suffix_with_elems_dynamic_padding regress significantly.

The original order follows the fail-fast principle by performing the cheaper alignment check first. This reordering seems to be beneficial only in certain scenarios. It might be worth investigating what distinguishes the improved cases from the regressed ones to see if a more nuanced approach is possible, perhaps conditionally ordering the checks based on properties of the DstLayout.

@jswrenn jswrenn force-pushed the G74bf0db1b7c30669171b5985ce24849a264746e7 branch from ec58f2e to 799dad9 Compare March 17, 2026 19:37
@jswrenn jswrenn force-pushed the Gcf66958135e905f7cd7d2fac87d9f881e5e5185f branch 2 times, most recently from 67f28a5 to a08a5c9 Compare March 20, 2026 17:47
@jswrenn jswrenn force-pushed the G74bf0db1b7c30669171b5985ce24849a264746e7 branch from 799dad9 to 8bd6bea Compare March 20, 2026 17:47
@jswrenn jswrenn force-pushed the Gcf66958135e905f7cd7d2fac87d9f881e5e5185f branch from a08a5c9 to c30e455 Compare March 20, 2026 18:00
@jswrenn jswrenn force-pushed the G74bf0db1b7c30669171b5985ce24849a264746e7 branch from 8bd6bea to 0de1acc Compare March 20, 2026 18:00
gherrit-pr-id: G74bf0db1b7c30669171b5985ce24849a264746e7
@jswrenn jswrenn force-pushed the Gcf66958135e905f7cd7d2fac87d9f881e5e5185f branch from c30e455 to ac240ed Compare March 20, 2026 19:41
@jswrenn jswrenn force-pushed the G74bf0db1b7c30669171b5985ce24849a264746e7 branch from 0de1acc to f03564b Compare March 20, 2026 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant