-
Notifications
You must be signed in to change notification settings - Fork 215
Support URLencoding during normalization #396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
af17cb7 to
6c12c74
Compare
6f73d37 to
e421bfd
Compare
commons-vfs2/src/main/java/org/apache/commons/vfs2/provider/UriParser.java
Outdated
Show resolved
Hide resolved
| private int cursor = 0; | ||
| private int lastSeparator; | ||
| private int end; | ||
| PathNormalizer(StringBuilder path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use final where you can.
garydgregory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @raboof
A few nits 😉
| boolean reading = true; | ||
| while (reading) { | ||
| reading = readNonSeparator(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simpler to say `while (readNonSeparator()); ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, it seems PMD won't let me
| } | ||
| return false; | ||
| } | ||
| private boolean readDot() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Add a blank line b/w methods.
commons-vfs2/src/main/java/org/apache/commons/vfs2/provider/UriParser.java
Outdated
Show resolved
Hide resolved
e421bfd to
8f33bb3
Compare
Earlier, the path would be URL-decoded after normalization. This meant some opportunities for normalization would be missed when dots or slashes were URL-encoded. One way to solve this would be to change the ordering of operations, but that seems risky as it would change the meaning of some of the abstractions. Because of that I opted for a more conservative approach, where normalization will take into account URL-encoded characters, but otherwise leave the input string intact.
8f33bb3 to
f1611bf
Compare
|
@raboof |
|
Ah, rebase mistake... thx for the fix, not sure what's going on with |
Might need some more reviewing to make sure this logic is correct
444ac4a to
31e0882
Compare
Run mvn before you push to catch build errors 😉 |
31e0882 to
ec42e65
Compare
01b8934 to
c95b48d
Compare
I know ;) - some of the tests are failing or very slow on my machine (even on the main branch), probably because I'm running in a sandbox, so I've been running individual targets - sorry about the noise ;) . Now getting 500's that are not our fault it looks like though, https://www.githubstatus.com/ Do you want me to squash the commits above? |
|
Don't worry about the noise 😉 |
|
Looks like we'll have to wait for GH to fix itself... |
|
Re-running builds now that GH is back up... |
The reverts apache#396 and related changes and implements the same in a simpler way by replacing the encoded characters already in `fixSeparators`. This approach has a slightly higher risk at breaking existing behaviour, but a lower risk of remaining problems in this part of the codebase. All testcases still succeed. This PR is intended to replace apache#543 and apache#555 This reverts commit cb45c94. This reverts commit 5399c76.
The reverts apache#396 and related changes and implements the same in a simpler way by replacing the encoded characters already in `fixSeparators`. This approach has a slightly higher risk at breaking existing behaviour, but a lower risk of remaining problems in this part of the codebase. All testcases still succeed. This PR is intended to replace apache#543 and apache#555. It includes the testcases from apache#543, adapted to the behaviour before apache#396. This reverts commit cb45c94. This reverts commit 5399c76. Co-Authored-By: Anthony Goubard <anthony.goubard@japplis.com>
The reverts apache#396 and related changes and implements the same in a simpler way by replacing the encoded characters already in `fixSeparators`. This approach has a slightly higher risk at breaking existing behaviour, but a lower risk of remaining problems in this part of the codebase. All testcases still succeed. This PR is intended to replace apache#543 and apache#555. It includes the testcases from apache#543, adapted to the behaviour before apache#396. This reverts commit cb45c94. This reverts commit 5399c76. Co-Authored-By: Anthony Goubard <anthony.goubard@japplis.com>
The reverts apache#396 and related changes and implements the same in a simpler way by replacing the encoded characters already in `fixSeparators`. This approach has a slightly higher risk at breaking existing behaviour, but a lower risk of remaining problems in this part of the codebase. All testcases still succeed. This PR is intended to replace apache#543 and apache#555. It includes the testcases from apache#543, adapted to the behaviour before apache#396. This reverts commit cb45c94. This reverts commit 5399c76. Co-Authored-By: Anthony Goubard <anthony.goubard@japplis.com>
* Simplify UriParser The reverts #396 and related changes and implements the same in a simpler way by replacing the encoded characters already in `fixSeparators`. This approach has a slightly higher risk at breaking existing behaviour, but a lower risk of remaining problems in this part of the codebase. All testcases still succeed. This PR is intended to replace #543 and #555. It includes the testcases from #543, adapted to the behaviour before #396. This reverts commit cb45c94. This reverts commit 5399c76. Co-Authored-By: Anthony Goubard <anthony.goubard@japplis.com> * Add benchmark for UriParser --------- Co-authored-by: Anthony Goubard <anthony.goubard@japplis.com>
Earlier, the path would be URL-decoded after normalization. This meant some opportunities for normalization would be missed when dots or slashes were URL-encoded.
One way to solve this would be to change the ordering of operations, but that seems risky as it would change the meaning of some of the abstractions. Because of that I opted for a more conservative approach, where normalization will take into account URL-encoded characters, but otherwise leave the input string intact.