Update EncodingTable to use ConcurrentDictionary for performance #125001
stephentoub merged 4 commits into main
…EncodingTable

Replace the Dictionary<K,V> + ReaderWriterLockSlim pattern with ConcurrentDictionary<K,V> in System.Text.Encoding.CodePages's EncodingTable.cs. ConcurrentDictionary provides a lock-free fast read path via TryGetValue, avoiding the overhead of acquiring a ReaderWriterLockSlim for every cache lookup.

- s_nameToCodePageCache, s_codePageToWebNameCache, s_codePageToEnglishNameCache, s_codePageToItemCache all changed to ConcurrentDictionary.
- GetCodePageItem, GetCodePageFromName, GetNameFromCodePage simplified to use TryGetValue/TryAdd instead of lock-based patterns.
- Added System.Collections.Concurrent project reference for net11.0+ targets.

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
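
The TryGetValue/TryAdd shape mentioned in the commit message could look roughly like the sketch below. It is an illustration only, not the actual EncodingTable code: the class name, `GetOrResolve`, and the `slowLookup` delegate are hypothetical stand-ins for the real table search.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical sketch of a lock-free read-through cache; names are assumptions.
public static class NameToCodePageCacheSketch
{
    private static readonly ConcurrentDictionary<string, int> s_cache = new();

    // slowLookup is an assumed placeholder for the expensive table search.
    public static int GetOrResolve(string name, Func<string, int> slowLookup)
    {
        // Fast path: a cache hit is served without taking a lock.
        if (s_cache.TryGetValue(name, out int codePage))
        {
            return codePage;
        }

        // Slow path: compute the value, then publish it. If another thread
        // added the same key first, TryAdd returns false and the existing
        // entry stays; the lookup is deterministic, so both values agree.
        codePage = slowLookup(name);
        s_cache.TryAdd(name, codePage);
        return codePage;
    }
}
```

The trade-off in this shape is that two threads missing at the same time may both run the slow lookup once; that duplicate work is accepted in exchange for lock-free reads.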
@copilot use the performance benchmark skill to validate this change

Running benchmark via @EgorBot. The benchmark exercises the internal

@EgorBot -linux_amd -osx_arm64

using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

[MemoryDiagnoser]
public class Bench
{
    // Encoding names that exercise EncodingTable.GetCodePageFromName -> cache lookup
    private static readonly string[] EncodingNames = new[]
    {
        "shift_jis", "big5", "gb2312", "euc-kr", "iso-8859-2",
        "windows-1252", "windows-1251", "windows-1250", "windows-874", "ibm437"
    };

    private Encoding[] _cachedEncodings = default!;
    private EncodingProvider _provider = default!;

    [GlobalSetup]
    public void Setup()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        _provider = CodePagesEncodingProvider.Instance;

        // Pre-populate internal caches by resolving each encoding once
        _cachedEncodings = new Encoding[EncodingNames.Length];
        for (int i = 0; i < EncodingNames.Length; i++)
        {
            _cachedEncodings[i] = _provider.GetEncoding(EncodingNames[i])!;
        }
    }

    // Hot path: repeated lookups by name (hits ConcurrentDictionary/Dictionary cache)
    [Benchmark]
    public int GetEncodingByName_Cached()
    {
        int sum = 0;
        for (int i = 0; i < EncodingNames.Length; i++)
        {
            sum += _provider.GetEncoding(EncodingNames[i])!.CodePage;
        }
        return sum;
    }

    // Hot path: access properties that read from name/item caches
    [Benchmark]
    public int AccessEncodingProperties_Cached()
    {
        int sum = 0;
        for (int i = 0; i < _cachedEncodings.Length; i++)
        {
            var enc = _cachedEncodings[i];
            sum += enc.WebName.Length;
            sum += enc.EncodingName.Length;
            sum += enc.WindowsCodePage;
            sum += enc.IsSingleByte ? 1 : 0;
        }
        return sum;
    }

    // Single lookup (hot path, most representative of typical usage)
    [Benchmark]
    public Encoding GetSingleEncoding_Cached()
    {
        return _provider.GetEncoding("shift_jis")!;
    }
}
Tagging subscribers to this area: @dotnet/area-system-globalization
Pull request overview
This PR replaces the locking mechanism in System.Text.Encoding.CodePages's EncodingTable.cs for improved performance. The original code used four Dictionary instances protected by a single ReaderWriterLockSlim, where even reads required acquiring an upgradeable read lock. The replacement uses four ConcurrentDictionary instances, whose TryGetValue is lock-free on the read path (hot path), matching the pattern already established in System.Private.CoreLib's EncodingTable.
Changes:
- Replace Dictionary<> fields + ReaderWriterLockSlim with ConcurrentDictionary<> fields and simplify all three cache-using methods (GetCodePageItem, GetCodePageFromName, GetNameFromCodePage) from nested lock patterns to straightforward TryGetValue/TryAdd calls.
- Add a ProjectReference to System.Collections.Concurrent for NetCoreAppCurrent-compatible targets, which is the correct gating for source-build project references.
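
For contrast, the lock-based shape that the overview describes (a Dictionary guarded by a ReaderWriterLockSlim, with reads entering an upgradeable read lock) might have looked something like the following. This is a reconstruction under assumptions, not the removed code: the class name, `GetOrResolve`, and `slowLookup` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch of the pre-change pattern; names are assumptions.
public static class LockedCacheSketch
{
    private static readonly Dictionary<string, int> s_cache = new();
    private static readonly ReaderWriterLockSlim s_lock = new();

    public static int GetOrResolve(string name, Func<string, int> slowLookup)
    {
        // Every call, including a cache hit, pays for this lock. Only one
        // thread can hold the upgradeable lock at a time.
        s_lock.EnterUpgradeableReadLock();
        try
        {
            if (s_cache.TryGetValue(name, out int codePage))
            {
                return codePage;
            }

            codePage = slowLookup(name);

            // Upgrade to a write lock only to mutate the dictionary.
            s_lock.EnterWriteLock();
            try
            {
                s_cache[name] = codePage;
            }
            finally
            {
                s_lock.ExitWriteLock();
            }

            return codePage;
        }
        finally
        {
            s_lock.ExitUpgradeableReadLock();
        }
    }
}
```

The nested try/finally blocks are the kind of boilerplate that collapses to a TryGetValue followed by a TryAdd once the dictionary itself is thread-safe.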
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| EncodingTable.cs | Replaces Dictionary + ReaderWriterLockSlim with ConcurrentDictionary; removes ~50 lines of boilerplate locking code |
| System.Text.Encoding.CodePages.csproj | Adds System.Collections.Concurrent project reference for NetCoreAppCurrent-compatible builds |
/ba-g the WASM build failure is unrelated to the change and should already be addressed by #125014
Description
Replace Dictionary + ReaderWriterLockSlim with ConcurrentDictionary in System.Text.Encoding.CodePages's EncodingTable.cs, matching the pattern already used in System.Private.CoreLib's EncodingTable. This eliminates lock acquisition on the fast read path: ConcurrentDictionary.TryGetValue is lock-free.

Changes Made
- EncodingTable.cs: Replace 4 Dictionary fields + 1 ReaderWriterLockSlim with 4 ConcurrentDictionary fields using target-typed new() expressions. Simplify GetCodePageItem, GetCodePageFromName, and GetNameFromCodePage from nested EnterUpgradeableReadLock/EnterWriteLock patterns to straightforward TryGetValue/TryAdd. All multi-line if blocks use braces consistently.
- .csproj: Add System.Collections.Concurrent project reference for $(NetCoreAppCurrent) targets.

Testing
- … linux_amd and osx_arm64, exercising hot cache-hit paths through CodePagesEncodingProvider.GetEncoding(name) and encoding property accessors.
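
The benchmark in this thread measures single-threaded cache hits. A rough way to also sanity-check many concurrent readers going through the same public entry point is sketched below. It is not part of the PR; the class name and the `Run` helper are hypothetical, and it only verifies that parallel lookups agree on the code page.

```csharp
using System;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical smoke check, not taken from the PR.
public static class ConcurrentLookupSmokeCheck
{
    // Returns how many parallel lookups yielded an unexpected result.
    public static int Run(int iterations)
    {
        EncodingProvider provider = CodePagesEncodingProvider.Instance;
        int mismatches = 0;

        Parallel.For(0, iterations, _ =>
        {
            // shift_jis maps to code page 932.
            var enc = provider.GetEncoding("shift_jis");
            if (enc is null || enc.CodePage != 932)
            {
                Interlocked.Increment(ref mismatches);
            }
        });

        return mismatches;
    }
}
```

A call such as `ConcurrentLookupSmokeCheck.Run(10_000)` should return zero mismatches. It says nothing about throughput, which is what the BenchmarkDotNet run is for.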