Speed up Convert.ToBase64String() by SergeiPavlov · Pull Request #69884 · dotnet/runtime

SergeiPavlov · 2022-05-26T23:50:20Z

This is roughly a 30% performance improvement:

|                                      Method |     Mean | Error |      Min |      Max |   Gen 0 | Allocated |
|-------------------------------------------- |---------:|------:|---------:|---------:|--------:|----------:|
|                  Old_Convert_ToBase64String | 540.4 us |    NA | 540.4 us | 540.4 us |  9.0000 |      1 MB |
| Old_Convert_ToBase64String_InsertLineBreaks | 634.3 us |    NA | 634.3 us | 634.3 us | 10.0000 |      1 MB |
|                  New_Convert_ToBase64String | 396.4 us |    NA | 396.4 us | 396.4 us |  9.0000 |      1 MB |
| New_Convert_ToBase64String_InsertLineBreaks | 425.9 us |    NA | 425.9 us | 425.9 us |  9.0000 |      1 MB |

Benchmarked on 1000 random byte arrays with lengths from 0 to 999.

Tricks used:

- Use a precomputed table of Base64 char pairs (it takes 16 KiB of RAM and some warmup time to initialize once and load into the CPU cache). This halves the number of memory operations. Use an int to process two char values at once.
- Avoid reading each byte from the inData array twice.
- Faster pointer arithmetic: *ptr++ instead of ptrBase[index]; ++index;. This saves an add instruction.
- Remove the offset parameter from ConvertToBase64Array(). inData is now array base + offset.
- Optimize the most critical loop: check only one condition inside it. insertLineBreaks only affects how many loop iterations run before the next, more complex condition check.
- Take BitConverter.IsLittleEndian into account so the code works correctly on big-endian platforms.
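
For readers who want to see the pair-table idea in isolation, here is a minimal sketch. It is not the code from this PR: the names `Base64PairSketch`, `Alphabet`, `Pairs` and `Encode` are made up for illustration, it uses safe array indexing instead of pointers, and it unpacks each table entry into two chars rather than storing the packed int in one write. It also ignores line breaks and endianness.

```csharp
using System;

// Hypothetical illustration of the "table of Base64 char pairs" trick.
static class Base64PairSketch
{
    private const string Alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // One entry per 12-bit value: 4096 entries x 4 bytes = 16 KiB.
    // Low 16 bits hold the first char, high 16 bits hold the second char.
    private static readonly uint[] Pairs = BuildPairs();

    private static uint[] BuildPairs()
    {
        var table = new uint[4096];
        for (int i = 0; i < table.Length; i++)
        {
            table[i] = (uint)Alphabet[i >> 6] | ((uint)Alphabet[i & 63] << 16);
        }
        return table;
    }

    public static string Encode(byte[] data)
    {
        var output = new char[(data.Length + 2) / 3 * 4];
        int i = 0, o = 0;

        // Main loop: each input byte is read once, and every 3 bytes
        // need only two table lookups instead of four.
        for (; i + 3 <= data.Length; i += 3)
        {
            int a = data[i], b = data[i + 1], c = data[i + 2];
            uint p1 = Pairs[(a << 4) | (b >> 4)];
            uint p2 = Pairs[((b & 0x0F) << 8) | c];
            output[o++] = (char)(p1 & 0xFFFF);
            output[o++] = (char)(p1 >> 16);
            output[o++] = (char)(p2 & 0xFFFF);
            output[o++] = (char)(p2 >> 16);
        }

        // Tail: one or two leftover bytes, padded with '='.
        int remaining = data.Length - i;
        if (remaining == 1)
        {
            int a = data[i];
            output[o++] = Alphabet[a >> 2];
            output[o++] = Alphabet[(a & 3) << 4];
            output[o++] = '=';
            output[o++] = '=';
        }
        else if (remaining == 2)
        {
            int a = data[i], b = data[i + 1];
            uint p1 = Pairs[(a << 4) | (b >> 4)];
            output[o++] = (char)(p1 & 0xFFFF);
            output[o++] = (char)(p1 >> 16);
            output[o++] = Alphabet[(b & 0x0F) << 2];
            output[o++] = '=';
        }

        return new string(output);
    }
}
```

The index expression `(a << 4) | (b >> 4)` is the same one that appears in the diff discussed later in this thread. In the sketch the result should match `Convert.ToBase64String(data)` for any input without line breaks.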

ghost · 2022-05-26T23:50:26Z

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

ghost · 2022-05-29T06:58:42Z

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

Author:	SergeiPavlov
Assignees:	-
Labels:	`area-System.Text.Encoding`, `community-contribution`
Milestone:	-

Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com>

deeprobin · 2022-05-30T04:53:01Z

-                        j += 4;
+                        a = *inData++;
+                        b = *inData++;
+                        *outPairs++ = base64Pairs[(a << 4) | (b >> 4)];


@tannergooding
Why is the codegen so different from LLVM's? https://godbolt.org/z/T8eqE9chv

LLVM

    shr sil, 4
    shl dil, 4
    or dil, sil
    movzx eax, dil
    ret

.NET JIT

    movzx rax, dil
    shl eax, 4
    movzx rdi, sil
    sar edi, 4
    or eax, edi
    ret

@deeprobin feel free to file an issue. Related one is #13816 (you can use PR that closed it as a foundation for your PR if you want to contribute 🙂)

EgorBo · 2022-05-30T13:08:38Z

There was a PR to use SSE for this API - dotnet/coreclr#21833

stephentoub · 2022-05-30T13:54:44Z

> There was a PR to use SSE for this API - dotnet/coreclr#21833

Why didn't we finish it?

EgorBo · 2022-05-30T15:42:26Z

> > There was a PR to use SSE for this API - dotnet/coreclr#21833
>
> Why didn't we finish it?

It got lost during the coreclr -> runtime migration 🙂 I can port it to cross-platform intrinsics after (if) this PR lands, to avoid conflicts.

stephentoub · 2022-05-31T14:17:48Z

> It got lost during the coreclr -> runtime migration 🙂 I can port it to cross-platform intrinsics after (if) this PR lands, to avoid conflicts.

A quick skim of that PR suggests it doesn't require a 16K lookup table? If that's the case, with appreciation to Sergei, I'd prefer we just start with that PR as something that's faster and smaller.

SergeiPavlov · 2022-05-31T23:05:02Z

> A quick skim of that PR suggests it doesn't require a 16K lookup table? If that's the case, with appreciation to Sergei, I'd prefer we just start with that PR as something that's faster and smaller.

I agree, vector intrinsics are always preferable. This implementation, with its 16K cost, could serve as a fallback for platforms without SSE/AVX-like instructions.

stephentoub · 2022-06-29T16:16:04Z

@EgorBo, are you still going to look at bringing that PR back to life?

EgorBo · 2022-06-29T16:22:24Z

> @EgorBo, are you still going to look at bringing that PR back to life?

Sure, will take a look in an hour

EgorBo · 2022-06-29T23:00:13Z

> > @EgorBo, are you still going to look at bringing that PR back to life?
>
> Sure, will take a look in an hour

So I took a look - there was a small bug in the impl, but the problem is that it won't be trivial to extend it to support Arm - it has too many non-shared intrinsics. I'll try to port it in the coming days.

stephentoub · 2022-07-22T18:34:46Z

@EgorBo, @SergeiPavlov, can this be closed now that @EgorBo's change went in?

SergeiPavlov · 2022-07-22T21:47:09Z

Yes.

The function is optimized by a959c3e

stephentoub · 2022-07-22T21:47:39Z

Thanks, @SergeiPavlov.

Speed up Convert.ToBase64String()

7421802

ghost added the community-contribution label May 26, 2022

Revert filemode of build.cmd

37a740e

jeffhandley reviewed May 29, 2022

Comment thread src/libraries/System.Private.CoreLib/src/System/Convert.cs Outdated

Comment thread src/libraries/System.Private.CoreLib/src/System/Convert.cs Outdated

jeffhandley added the area-System.Text.Encoding label May 29, 2022

SergeiPavlov and others added 2 commits May 29, 2022 00:38

Update src/libraries/System.Private.CoreLib/src/System/Convert.cs

6195b95

Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com>

Fix typo

1582156

deeprobin reviewed May 30, 2022

deeprobin mentioned this pull request May 30, 2022

Codegen optimization of (a << 4) | (b >> 4) #69983

Closed

runfoapp bot mentioned this pull request May 30, 2022

jit.1 work item failing on mono #67888

Closed

SergeiPavlov closed this Jul 22, 2022

ghost locked as resolved and limited conversation to collaborators Aug 22, 2022
