Fix GeneratedRegex fixer to preserve multiline verbatim string patterns #120624
Merged
stephentoub merged 9 commits into main on Oct 25, 2025
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix GeneratedRegex fixer to preserve pattern formatting" to "Fix GeneratedRegex fixer to preserve multiline verbatim string patterns" on Oct 11, 2025
stephentoub
reviewed
Oct 11, 2025
src/libraries/System.Text.RegularExpressions/gen/UpgradeToGeneratedRegexCodeFixer.cs
stephentoub
approved these changes
Oct 11, 2025
Contributor
Pull Request Overview
This PR fixes the GeneratedRegex code fixer to preserve multiline verbatim string patterns instead of converting them to single-line strings with escape sequences. The fix addresses readability issues for regex patterns that use RegexOptions.IgnorePatternWhitespace, particularly in complex scenarios with string concatenation.
- Added logic to detect patterns containing newlines or carriage returns in addition to backslashes
- Enhanced pattern handling to preserve verbatim string syntax for better readability
- Added comprehensive test coverage for both direct multiline patterns and string concatenation scenarios
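The detection step in the first bullet can be sketched roughly as follows. This is a minimal illustration, assuming the check is a single `IndexOfAny` scan over the pattern; the class name `VerbatimCheck` and the field `s_triggerChars` are hypothetical, and this is not the code merged in the PR.

```csharp
using System;

static class VerbatimCheck
{
    // Characters that read poorly in a regular (escaped) string literal.
    // Hypothetical field name, chosen for this sketch only.
    private static readonly char[] s_triggerChars = { '\\', '\n', '\r' };

    // Assumed shape of the helper: true when a verbatim literal (@"...")
    // would preserve the pattern's readability better than an escaped one.
    public static bool ShouldUseVerbatimString(string pattern) =>
        pattern.IndexOfAny(s_triggerChars) >= 0;
}

class Program
{
    static void Main()
    {
        Console.WriteLine(VerbatimCheck.ShouldUseVerbatimString("abc"));   // False
        Console.WriteLine(VerbatimCheck.ShouldUseVerbatimString(@"\d+"));  // True
        Console.WriteLine(VerbatimCheck.ShouldUseVerbatimString("a\n b")); // True
    }
}
```

Checking all three characters in one call keeps the decision in a single place, so a pattern with only newlines is treated the same way as one with only backslashes.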
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/libraries/System.Text.RegularExpressions/gen/UpgradeToGeneratedRegexCodeFixer.cs | Modified the code fixer to detect newlines/carriage returns and preserve verbatim string syntax |
| src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/UpgradeToGeneratedRegexAnalyzerTests.cs | Added two comprehensive tests to verify multiline verbatim string preservation |
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Member
@copilot, please fix the test failures:
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Contributor
Author
... Fixed in commit 1ce8e01. The tests have been updated to use the property syntax ( |
stephentoub
approved these changes
Oct 23, 2025
Fix for GeneratedRegex fixer preserving multiline patterns
This PR addresses the issue where the GeneratedRegex code fixer converts multiline verbatim string literals into single-line strings with escape sequences, losing readability.
Problem
When a regex pattern contains actual newlines (from verbatim string literals or string concatenation), the fixer converts them to escape sequences like \r\n, making patterns with RegexOptions.IgnorePatternWhitespace unreadable.
Solution
- Modified UpgradeToGeneratedRegexCodeFixer.cs to detect newlines in pattern strings
Changes Made
- Added a ShouldUseVerbatimString helper method that checks for backslashes, newlines (\n), or carriage returns (\r) using a single IndexOfAny call
- Updated the GetNode method to use verbatim string syntax when any of these characters are present
- Added two tests: MultilineVerbatimStringPreservedByFixer and MultilineStringConcatenationPreservedByFixer
- Updated the tests to use property syntax ({ get; }) instead of method syntax to match the current codebase
Test Results
✅ All analyzer tests pass (116 tests)
✅ New tests specifically validate multiline pattern preservation
✅ No build warnings or errors
✅ Works for both direct verbatim strings and string concatenation scenarios
Impact
This fix improves readability for regex patterns that use RegexOptions.IgnorePatternWhitespace, particularly for complex patterns in projects like MSBuild that use string fragment composition. The fix preserves the original formatting intent from the source code.
Original prompt
This section details the original issue you should resolve
<issue_title>GeneratedRegex fixer forces pattern onto a single line</issue_title>
<issue_description>
Consider
run the fixer, now I have
The semantics are the same, but the readability is gone. I would expect
I see #69616 which implies that whitespace is preserved but not comments. I don't see whitespace preserved, or at least not in the original form.</issue_description>
Comments on the Issue (you are @copilot in this section)
@ Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions. See info in [area-owners.md](https://github.com/dotnet/runtime/blob/main/docs/area-owners.md) if you want to be subscribed.
Issue Details
area-System.Text.RegularExpressions
Unfortunately, many of the dotnet/msbuild regexes are built up by compounding reused string fragments, which in some cases are themselves compounded from other fragments, and they use IgnorePatternWhitespace. This means that after running the generator, they need to be fixed by hand.</comment_new>
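To make the fragment-composition scenario in the comment above concrete, here is a small hypothetical sketch. The `Fragments` class, the `Name` and `Property` constants, and the pattern itself are invented for illustration and are not taken from dotnet/msbuild. It shows that a pattern composed from multiline verbatim fragments carries real newline characters, which is the case the fixer previously flattened into escape sequences.

```csharp
using System;
using System.Text.RegularExpressions;

static class Fragments
{
    // Hypothetical reusable fragment: an identifier, with an inline comment.
    public const string Name = @"
        [A-Za-z_]\w*      # identifier
    ";

    // Hypothetical composed pattern that embeds the fragment above.
    public const string Property = @"
        \$\(              # opening $(
        " + Name + @"
        \)                # closing )
    ";
}

class Program
{
    static void Main()
    {
        // IgnorePatternWhitespace drops unescaped whitespace and '#' comments,
        // so the layout above exists only for the human reader.
        var regex = new Regex(Fragments.Property, RegexOptions.IgnorePatternWhitespace);
        Console.WriteLine(regex.IsMatch("$(Configuration)")); // True

        // The composed constant contains real newlines, which an escaped
        // single-line literal would have to spell out as \r\n or \n.
        Console.WriteLine(Fragments.Property.IndexOf('\n') >= 0); // True
    }
}
```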
Fixes #79891