Contributor

@ahamlat ahamlat commented Sep 7, 2025

PR description

The goal of this PR is to improve the performance of a ModExp worst-case scenario from the EEST tests. The input for that specific case was added to the ModExp benchmarks. The last column in the table below shows the improvement from the PR.

| Test case | current with JAVA default (MGps) | current with native (MGps) | if mod length > 512 then native: default with above modif (MGps) | if base length > 256 then native: default with above modif (MGps) | if base even execute native: default with above modif (MGps) |
|---|---|---|---|---|---|
| eip_example1 | 238.43 | 129.87 | 202.52 | 194.10 | 237.27 |
| eip_example2 | 2,687.97 | 150.83 | 4,826.09 | 4,514.50 | 149.11 |
| even-modulus-1 | 12,778.72 | 14,077.14 | 12,962.96 | 12,386.83 | 15,047.74 |
| even-modulus-2 | 1,025.67 | 713.77 | 1,011.18 | 1,010.85 | 1,023.59 |
| even-modulus-3 | 17,546.45 | 11,865.04 | 18,628.98 | 17,232.66 | 17,983.49 |
| nagydani-1-square | 164.39 | 155.68 | 160.05 | 148.30 | 140.83 |
| nagydani-1-qube | 170.68 | 147.01 | 150.24 | 150.19 | 155.30 |
| nagydani-1-pow0x10001 | 415.44 | 342.48 | 395.56 | 398.68 | 405.97 |
| nagydani-2-square | 87.87 | 30.01 | 83.66 | 81.83 | 85.09 |
| nagydani-2-qube | 81.50 | 56.58 | 76.44 | 78.31 | 79.27 |
| nagydani-2-pow0x10001 | 672.43 | 443.04 | 658.79 | 487.41 | 647.94 |
| nagydani-3-square | 156.45 | 74.76 | 148.02 | 74.93 | 150.50 |
| nagydani-3-qube | 140.61 | 68.76 | 133.53 | 68.84 | 135.47 |
| nagydani-3-pow0x10001 | 955.40 | 493.55 | 933.16 | 494.02 | 939.51 |
| nagydani-4-square | 208.11 | 76.54 | 76.58 | 75.91 | 208.88 |
| nagydani-4-qube | 190.20 | 69.63 | 69.79 | 69.55 | 187.51 |
| nagydani-4-pow0x10001 | 1,122.03 | 511.29 | 510.82 | 483.21 | 1,117.35 |
| nagydani-5-square | 275.29 | 78.29 | 78.34 | 78.24 | 276.52 |
| nagydani-5-qube | 232.15 | 71.16 | 71.34 | 71.18 | 232.19 |
| nagydani-5-pow0x10001 | 1,217.32 | 518.12 | 519.19 | 513.90 | 1,217.39 |
| marius-1-even | 1,084.99 | 778.87 | 1,086.64 | 1,085.26 | 1,104.15 |
| guido-1-even | 1,519.89 | 1,075.06 | 1,545.78 | 1,543.83 | 1,530.40 |
| guido-2-even | 1,010.26 | 722.22 | 1,012.78 | 1,004.88 | 1,029.98 |
| guido-3-even | 150.15 | 150.22 | 149.83 | 150.46 | 149.93 |
| guido-4-even | 142,978.03 | 1,118.91 | 151,736.86 | 145,204.76 | 149,519.22 |
| guido-4-even-nethermind | 33,842.48 | 911.66 | 28,484.92 | 23,480.66 | 27,076.90 |
| marcin-1-base-heavy | 126.94 | 65.11 | 120.61 | 118.61 | 123.22 |
| marcin-1-exp-heavy | 47,677.50 | 897.33 | 48,221.57 | 50,060.68 | 44,903.76 |
| marcin-modexp-215gas-exp-heavy | 137,141.73 | 1,115.10 | 131,989.33 | 133,333.15 | 139,217.80 |
| marcin-1-balanced | 376.71 | 189.85 | 370.86 | 375.09 | 374.33 |
| marcin-2-base-heavy | 177.43 | 70.26 | 152.77 | 70.68 | 178.25 |
| marcin-2-exp-heavy | 47,479.81 | 523.35 | 46,576.22 | 46,297.84 | 45,976.37 |
| marcin-2-balanced | 629.70 | 305.59 | 628.48 | 632.85 | 633.45 |
| marcin-3-base-heavy | 241.71 | 112.25 | 236.21 | 242.00 | 114.59 |
| marcin-3-exp-heavy | 13,393.96 | 201.33 | 12,948.60 | 12,559.72 | 12,260.94 |
| marcin-3-balanced | 259.94 | 116.81 | 258.89 | 262.26 | 118.99 |
| pawel-1-exp-heavy | 1,526.16 | 927.92 | 1,526.56 | 1,551.62 | 1,519.60 |
| pawel-2-exp-heavy | 541.55 | 342.70 | 541.97 | 546.45 | 541.98 |
| pawel-3-exp-heavy | 299.25 | 174.69 | 303.25 | 302.29 | 301.71 |
| pawel-4-exp-heavy | 204.18 | 110.80 | 203.41 | 204.79 | 203.02 |
| eest-guido-3-even | 42.76 | 149.50 | 150.02 | 150.06 | 149.80 |

Fixed Issue(s)

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests
  • hive tests: Engine or other RPCs modified?

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
@ahamlat ahamlat requested review from lu-pinto and siladu and removed request for lu-pinto September 7, 2025 18:44
@ahamlat ahamlat marked this pull request as ready for review September 8, 2025 06:16
Contributor

@siladu siladu left a comment

From your table it's not clear to me what the actual improvement is.

For each test case, I am not sure whether the current behaviour will use native or Java, following @lu-pinto's original optimisation.

Also, why are the mod and base lengths significant for this PR?

  final int modulusLength = clampedToInt(length_of_MODULUS);
- if ((extractLastByte(input, baseOffset, baseLength) & 1) != 1
-     && (extractLastByte(input, modulusOffset, modulusLength) & 1) != 1) {
+ if ((extractLastByte(input, baseOffset, baseLength) & 1) != 1) {
Contributor

@siladu siladu Sep 9, 2025

To double check my understanding: this PR changes the code from:
"If the least significant byte of both the base and the modulus is even, then use native"
to
"If the least significant byte of the base is even, then use native"

...so we're using native in more cases than we were before?

How does that relate to the underlying algorithm? ...I guess I'm wondering why the extra restriction for even modulus was originally added cc @lu-pinto ?
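The change in routing described above can be sketched as two predicates. This is a minimal illustration, not the Besu code: `lastByte` is a hypothetical stand-in for `extractLastByte` (whose implementation is not shown in this thread), the big-endian field layout and the zero fallback for empty or out-of-range fields are assumptions, and any other conditions surrounding the check in the real precompile are omitted.

```java
final class ModExpRoutingSketch {
  // Hypothetical stand-in for extractLastByte: returns the lowest-order byte of a
  // big-endian field, or 0 when the field is empty or ends past the input (assumption).
  static int lastByte(byte[] input, int offset, int length) {
    if (length == 0) {
      return 0;
    }
    long index = (long) offset + length - 1;
    if (index >= input.length) {
      return 0;
    }
    return input[(int) index] & 0xFF;
  }

  // A number is even exactly when the lowest bit of its last byte is 0.
  static boolean isEven(byte[] input, int offset, int length) {
    return (lastByte(input, offset, length) & 1) != 1;
  }

  // Before the PR: native only when both base and modulus are even.
  static boolean useNativeBefore(
      byte[] input, int baseOffset, int baseLength, int modulusOffset, int modulusLength) {
    return isEven(input, baseOffset, baseLength) && isEven(input, modulusOffset, modulusLength);
  }

  // After the PR: native whenever the base is even, regardless of the modulus.
  static boolean useNativeAfter(byte[] input, int baseOffset, int baseLength) {
    return isEven(input, baseOffset, baseLength);
  }

  public static void main(String[] args) {
    // One-byte fields: base = 4 (even), exponent = 3, modulus = 7 (odd).
    byte[] input = {4, 3, 7};
    System.out.println("before: " + useNativeBefore(input, 0, 1, 2, 1));
    System.out.println("after:  " + useNativeAfter(input, 0, 1));
  }
}
```

An even base with an odd modulus is the kind of input that the old condition kept on the Java path and the new condition sends to native.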

Contributor Author

@ahamlat ahamlat Sep 9, 2025

Your understanding is correct. Let’s confirm with @lu-pinto, but from what I gathered, he seems to agree with this implementation.

Member

@lu-pinto lu-pinto Sep 9, 2025

This was to avoid going through the other heavy branch full of BigInteger operations that generate confetti objects: https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/math/BigInteger.java#L3003-L3004

An odd modulus makes the code go through the Montgomery algorithm, so I thought it would make a difference. Apparently this case invalidates that assumption somehow. I haven't studied it as thoroughly, but this change seems to normalize results, which is good for the Ethereum case.
There could be a better routing strategy, or we could just code up a Java implementation that gets rid of all the new objects. This would be my preferred approach because JNA has a cost.
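For context on the two branches mentioned above: the even-modulus path in the linked `BigInteger` code appears to split the modulus into an odd part and a power of two, exponentiate against each, and recombine with the Chinese Remainder Theorem. The sketch below illustrates that idea only. It is not the JDK implementation: it reuses `BigInteger.modPow` for the sub-steps, the class and method names are made up for illustration, and it assumes a positive modulus and a non-negative exponent.

```java
import java.math.BigInteger;

final class EvenModulusSketch {
  // Computes base^exp mod m (m > 0, exp >= 0) by splitting m = m1 * 2^p with m1 odd.
  static BigInteger modPowSplit(BigInteger base, BigInteger exp, BigInteger m) {
    int p = m.getLowestSetBit();                   // number of trailing zero bits in m
    BigInteger m1 = m.shiftRight(p);               // odd part of the modulus
    BigInteger m2 = BigInteger.ONE.shiftLeft(p);   // power-of-two part of the modulus

    BigInteger a1 = base.modPow(exp, m1);          // result modulo the odd part
    BigInteger a2 = base.modPow(exp, m2);          // result modulo the power of two

    // m1 and m2 are coprime, so both inverses exist.
    BigInteger y1 = m2.modInverse(m1);
    BigInteger y2 = m1.modInverse(m2);

    // Chinese Remainder Theorem: the unique value mod m that is a1 mod m1 and a2 mod m2.
    return a1.multiply(m2).multiply(y1)
        .add(a2.multiply(m1).multiply(y2))
        .mod(m);
  }

  public static void main(String[] args) {
    BigInteger base = BigInteger.valueOf(7);
    BigInteger exp = BigInteger.valueOf(13);
    BigInteger m = BigInteger.valueOf(12);         // even modulus: 12 = 3 * 2^2
    System.out.println(modPowSplit(base, exp, m).equals(base.modPow(exp, m)));
  }
}
```

Each intermediate value here is a fresh `BigInteger`, which gives a feel for the allocation pressure that the comment above refers to.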

Contributor Author

ahamlat commented Sep 9, 2025

> From your table it's not clear to me what the actual improvement is.

It is in the description, but I agree that it is not very clear: "The last column in the table below shows the improvement from the PR".

> Also, why are the mod and base lengths significant for this PR?

I was looking for heuristics based on the input of the newly found use case. The benchmarks showed that the native implementation is better for that specific use case, so I tested three strategies. The last one is the best because it keeps better Mgas/s on the other use cases.

Member

@lu-pinto lu-pinto left a comment

I would say let's ship this and iterate afterwards.

Contributor

siladu commented Sep 9, 2025

> The last column in the table below shows the improvement from the PR

I saw that, but what's not clear is what the "before" value is, since both native and java values are listed rather than the one that the benchmark actually uses based on the input...
e.g. for nagydani-1-square, was the "default" benchmark using Java or native before? The PR has worse performance than both variants for this case.

Contributor

@siladu siladu left a comment

Confirmed with @ahamlat that "current with JAVA default (MGps)" column is comparable with the last column "if base even execute native" since both are switching between Java and native based on input.
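Given that pairing of columns, the effect on a test case can be read as the ratio of the last column to the first. The sketch below does that arithmetic for three rows copied from the table in the PR description; the class and method names are made up for illustration, and a ratio above 1 simply means the last column reports a higher MGps figure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SpeedupSketch {
  // Ratio of the "after" throughput to the "before" throughput.
  static double ratio(double before, double after) {
    return after / before;
  }

  public static void main(String[] args) {
    // Values are {first column, last column} in MGps, copied from the PR table.
    Map<String, double[]> rows = new LinkedHashMap<>();
    rows.put("eest-guido-3-even", new double[] {42.76, 149.80});
    rows.put("nagydani-1-square", new double[] {164.39, 140.83});
    rows.put("guido-3-even", new double[] {150.15, 149.93});

    for (Map.Entry<String, double[]> row : rows.entrySet()) {
      double[] v = row.getValue();
      System.out.printf("%-20s %.2fx%n", row.getKey(), ratio(v[0], v[1]));
    }
  }
}
```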

Contributor Author

ahamlat commented Sep 9, 2025

I think you can forget the columns "if mod length > 512 then native" and "if base length > 256 then native". These are attempts that are not implemented in this PR because they have some regressions.

> The PR has worse performance than both variants for this case.

There are only two implementations of ModExp: Java and native. We use heuristics to decide which one to execute. Currently in Besu, there is no way to force all ModExp executions to Java, so even in the first column some executions use the native implementation because of the existing heuristics.
With this PR (last column), we redirect more executions to native. Java is still the default, but the existing heuristics are modified so that the last (new worst) case executes in native.
So the PR executes the last use case with native, the same as the "current with native (MGps)" column.

@ahamlat ahamlat merged commit 6344f17 into hyperledger:main Sep 9, 2025
46 checks passed
georgereuben pushed a commit to georgereuben/besu that referenced this pull request Sep 16, 2025
* Improve EEST mod_vul_guido_3_even use case

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: georgereuben <reubengeorge101@gmail.com>
jflo pushed a commit to jflo/besu that referenced this pull request Oct 13, 2025
* Improve EEST mod_vul_guido_3_even use case

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: jflo <justin+github@florentine.us>