JS: add query js/regex/missing-regexp-anchor #1387
Merged
11 commits
98ae259
JS: refactor `IncompleteHostnameRegExp::regexp` to RegExp.qll
3358e49
JS: refactor the predicate `RegExp::regexp` to three classes.
69db54a
JS: add anchors to js/incomplete-hostname-regexp examples
0fa73b8
JS: add query js/regex/missing-regexp-anchor
3289c62
JS: address minor review comments
7018a38
JS: improve tests and regexp for js/regex/missing-regexp-anchor
1464427
JS: fix comment typo
7b65221
JS: address docstring comments
bf51c54
JS: add `RegExpPatternSource::getAParse` to hide the subclasses
9e0a97e
JS: address qhelp review comments
04868e5
JS: format qhelp examples
77 changes: 77 additions & 0 deletions
javascript/ql/src/Security/CWE-020/MissingRegExpAnchor.qhelp
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| <!DOCTYPE qhelp PUBLIC | ||
| "-//Semmle//qhelp//EN" | ||
| "qhelp.dtd"> | ||
| <qhelp> | ||
|
|
||
| <overview> | ||
| <p> | ||
|
|
||
| Sanitizing untrusted input with regular expressions is a | ||
| common technique. However, it is error-prone to match untrusted input | ||
| against regular expressions without anchors such as <code>^</code> or | ||
| <code>$</code>. Malicious input can bypass such security checks by | ||
| embedding one of the allowed patterns in an unexpected location. | ||
|
|
||
| </p> | ||
|
|
||
| <p> | ||
|
|
||
| Even if the matching is not done in a security-critical | ||
| context, it may still cause undesirable behavior when the regular | ||
| expression accidentally matches. | ||
|
|
||
| </p> | ||
| </overview> | ||
|
|
||
| <recommendation> | ||
| <p> | ||
|
|
||
| Use anchors to ensure that regular expressions match at | ||
| the expected locations. | ||
|
|
||
| </p> | ||
| </recommendation> | ||
|
|
||
| <example> | ||
|
|
||
| <p> | ||
|
|
||
| The following example code checks that a URL redirection | ||
| will reach the <code>example.com</code> domain, or one of its | ||
| subdomains, and not some malicious site. | ||
|
|
||
| </p> | ||
|
|
||
| <sample src="examples/MissingRegExpAnchor_BAD.js"/> | ||
|
|
||
| <p> | ||
|
|
||
| The check with the regular expression match is, however, easy to bypass, for example | ||
| by embedding <code>example.com</code> in the path component: | ||
| <code>http://evil-example.net/example.com</code>, or in the query | ||
| string component: <code>http://evil-example.net/?x=example.com</code>. | ||
|
|
||
| Address these shortcomings by using anchors in the regular expression instead: | ||
|
|
||
| </p> | ||
|
|
||
| <sample src="examples/MissingRegExpAnchor_GOOD.js"/> | ||
|
|
||
| <p> | ||
|
|
||
| A related mistake is to write a regular expression with | ||
| multiple alternatives, but to only include an anchor for one of the | ||
| alternatives. As an example, the regular expression | ||
| <code>/^www\.example\.com|beta\.example\.com/</code> will match the host | ||
| <code>evil.beta.example.com</code> because the regular expression is parsed | ||
| as <code>/(^www\.example\.com)|(beta\.example\.com)/</code>. | ||
|
|
||
| </p> | ||
| </example> | ||
|
|
||
| <references> | ||
| <li>MDN: <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions">Regular Expressions</a></li> | ||
| <li>OWASP: <a href="https://www.owasp.org/index.php/Server_Side_Request_Forgery">SSRF</a></li> | ||
| <li>OWASP: <a href="https://www.owasp.org/index.php/Unvalidated_Redirects_and_Forwards_Cheat_Sheet">Unvalidated Redirects and Forwards Cheat Sheet</a></li> | ||
| </references> | ||
| </qhelp> |
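The precedence mistake described at the end of the qhelp text can be reproduced directly. The following is a minimal sketch, not part of the PR: the host names are illustrative, and the grouped pattern `fullyAnchored` is one possible fix assumed here rather than taken from the query help.

```javascript
// Semi-anchored: `^` binds only to the first alternative, so the
// pattern is parsed as (^www\.example\.com)|(beta\.example\.com).
const semiAnchored = /^www\.example\.com|beta\.example\.com/;

// One possible fix (an assumption of this sketch): group the
// alternatives so both anchors apply to every alternative.
const fullyAnchored = /^(?:www|beta)\.example\.com$/;

// Illustrative host names only.
const hosts = [
  "www.example.com",
  "beta.example.com",
  "evil.beta.example.com",
  "beta.example.com.evil.net",
];

// Collect the verdict of both patterns for each host.
const results = hosts.map((host) => ({
  host,
  semiAnchored: semiAnchored.test(host),
  fullyAnchored: fullyAnchored.test(host),
}));

for (const r of results) {
  console.log(`${r.host}: semi=${r.semiAnchored} full=${r.fullyAnchored}`);
}
```

With the semi-anchored pattern all four hosts are accepted; with the grouped and fully anchored pattern only the first two are.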
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| /** | ||
| * @name Missing regular expression anchor | ||
| * @description Regular expressions without anchors can be vulnerable to bypassing. | ||
| * @kind problem | ||
| * @problem.severity warning | ||
| * @precision medium | ||
| * @id js/regex/missing-regexp-anchor | ||
| * @tags correctness | ||
| * security | ||
| * external/cwe/cwe-20 | ||
| */ | ||
|
|
||
| import javascript | ||
|
|
||
| /** | ||
| * Holds if `src` is a pattern for a collection of alternatives where | ||
| * only the first or last alternative is anchored, indicating a | ||
| * precedence mistake explained by `msg`. | ||
| * | ||
| * The canonical example of such a mistake is: `^a|b|c`, which is | ||
| * parsed as `(^a)|(b)|(c)`. | ||
| */ | ||
| predicate isInterestingSemiAnchoredRegExpString(RegExpPatternSource src, string msg) { | ||
| exists(string str, string maybeGroupedStr, string regex, string anchorPart, string escapedDot | | ||
| // a dot that might be escaped in a regular expression, for example `/\./` or `new RegExp('\\.')` | ||
| escapedDot = "\\\\[.]" and | ||
| // a string that is mostly free from special regular expression symbols | ||
| str = "(?:(?:" + escapedDot + ")|[a-z:/.?_,@0-9 -])+" and | ||
| // the string may be wrapped in parentheses | ||
| maybeGroupedStr = "(?:" + str + "|\\(" + str + "\\))" and | ||
| ( | ||
| // a problematic pattern: `^a|b|...|x` | ||
| regex = "(?i)(\\^" + maybeGroupedStr + ")(?:\\|" + maybeGroupedStr + ")+" | ||
| or | ||
| // a problematic pattern: `a|b|...|x$` | ||
| regex = "(?i)(?:" + maybeGroupedStr + "\\|)+(" + maybeGroupedStr + "\\$)" | ||
| ) and | ||
| anchorPart = src.getPattern().regexpCapture(regex, 1) and | ||
| anchorPart.regexpMatch("(?i).*[a-z].*") and | ||
| msg = "Misleading operator precedence. The subexpression '" + anchorPart + "' is anchored, but the other parts of this regular expression are not" | ||
| ) | ||
| } | ||
|
|
||
| /** | ||
| * Holds if `src` is an unanchored pattern for a URL, indicating a | ||
| * mistake explained by `msg`. | ||
| */ | ||
| predicate isInterestingUnanchoredRegExpString(RegExpPatternSource src, string msg) { | ||
| exists(string pattern | pattern = src.getPattern() | | ||
| // a substring sequence of a protocol and subdomains, perhaps with some regex characters mixed in, followed by a known TLD | ||
| pattern | ||
| .regexpMatch("(?i)[():|?a-z0-9-\\\\./]+[.]" + RegExpPatterns::commonTLD() + | ||
| "([/#?():]\\S*)?") and | ||
| // without any anchors | ||
| pattern.regexpMatch("[^$^]+") and | ||
| // that is not used for capture or replace | ||
| not exists(DataFlow::MethodCallNode mcn, string name | name = mcn.getMethodName() | | ||
| name = "exec" and | ||
| mcn = src.getARegExpObject().getAMethodCall() and | ||
| exists(mcn.getAPropertyRead()) | ||
| or | ||
| exists(DataFlow::Node arg | | ||
| arg = mcn.getArgument(0) and | ||
| ( | ||
| src.getARegExpObject().flowsTo(arg) or | ||
| src.getAParse() = arg | ||
| ) | ||
| | | ||
| name = "replace" | ||
| or | ||
| name = "match" and exists(mcn.getAPropertyRead()) | ||
| ) | ||
| ) and | ||
| msg = "When this is used as a regular expression on a URL, it may match anywhere, and arbitrary hosts may come before or after it." | ||
| ) | ||
| } | ||
|
|
||
| from DataFlow::Node nd, string msg | ||
| where | ||
| isInterestingUnanchoredRegExpString(nd, msg) | ||
| or | ||
| isInterestingSemiAnchoredRegExpString(nd, msg) | ||
| select nd, msg |
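To make the string heuristic in `isInterestingSemiAnchoredRegExpString` easier to follow, here is a rough JavaScript port of it. This is a sketch under stated assumptions, not the query itself: the function name `semiAnchoredPart` is hypothetical, QL's `regexpCapture` is assumed to require a match of the whole string (emulated here with explicit `^...$`), and the `(?i)` inline flag is replaced by JavaScript's `i` flag.

```javascript
// An escaped dot as it appears in pattern source text: a backslash followed by `.`.
const escapedDot = "\\\\[.]";
// A string that is mostly free from special regular expression symbols.
const str = "(?:(?:" + escapedDot + ")|[a-z:/.?_,@0-9 -])+";
// The string may be wrapped in parentheses.
const maybeGroupedStr = "(?:" + str + "|\\(" + str + "\\))";

// `^a|b|...|x`: only the first alternative is anchored.
const anchoredFirst = new RegExp(
  "^(\\^" + maybeGroupedStr + ")(?:\\|" + maybeGroupedStr + ")+$", "i");
// `a|b|...|x$`: only the last alternative is anchored.
const anchoredLast = new RegExp(
  "^(?:" + maybeGroupedStr + "\\|)+(" + maybeGroupedStr + "\\$)$", "i");

// Hypothetical helper: returns the anchored alternative of a
// semi-anchored pattern, or null if the heuristic does not apply.
function semiAnchoredPart(pattern) {
  for (const re of [anchoredFirst, anchoredLast]) {
    const m = re.exec(pattern);
    // As in the query, the anchored part must contain at least one letter.
    if (m !== null && /[a-z]/i.test(m[1])) {
      return m[1];
    }
  }
  return null;
}

console.log(semiAnchoredPart(String.raw`^www\.example\.com|beta\.example\.com`));
console.log(semiAnchoredPart("a|b|c$"));
console.log(semiAnchoredPart("^(a|b|c)$"));
```

The first two calls return the anchored alternative (`^www\.example\.com` and `c$`), which is the text the query interpolates into its message. The last call returns `null` because the whole alternation is grouped inside the anchors.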
7 changes: 7 additions & 0 deletions
javascript/ql/src/Security/CWE-020/examples/MissingRegExpAnchor_BAD.js
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| app.get("/some/path", function(req, res) { | ||
| let url = req.param("url"); | ||
| // BAD: the host of `url` may be controlled by an attacker | ||
| if (url.match(/https?:\/\/www\.example\.com\//)) { | ||
| res.redirect(url); | ||
| } | ||
| }); |
7 changes: 7 additions & 0 deletions
javascript/ql/src/Security/CWE-020/examples/MissingRegExpAnchor_GOOD.js
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| app.get("/some/path", function(req, res) { | ||
| let url = req.param("url"); | ||
| // GOOD: the host of `url` cannot be controlled by an attacker | ||
| if (url.match(/^https?:\/\/www\.example\.com\//)) { | ||
| res.redirect(url); | ||
| } | ||
| }); |
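The difference between the two example files can be checked outside of an Express handler. The sketch below applies both regular expressions from the examples to a few URLs; the URLs themselves are illustrative assumptions, chosen so that the attacker-controlled ones embed the complete allowed prefix.

```javascript
// The regular expressions from MissingRegExpAnchor_BAD.js and
// MissingRegExpAnchor_GOOD.js, respectively.
const unanchored = /https?:\/\/www\.example\.com\//;
const anchored = /^https?:\/\/www\.example\.com\//;

// Illustrative URLs: one legitimate, two that embed the allowed
// prefix in the query string or path of another host.
const candidates = [
  "https://www.example.com/account",
  "http://evil-example.net/?x=http://www.example.com/",
  "http://evil-example.net/http://www.example.com/",
];

// Report which check would allow the redirect for each URL.
const verdicts = candidates.map((url) => ({
  url,
  unanchored: unanchored.test(url),
  anchored: anchored.test(url),
}));

for (const v of verdicts) {
  console.log(`${v.url}: unanchored=${v.unanchored} anchored=${v.anchored}`);
}
```

The unanchored check accepts all three URLs, while the anchored check accepts only the first. A leading `^` is enough in this case because the pattern ends with the `/` that terminates the host, so the host is pinned on both sides.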
`t = t2.continue()` would be slightly more correct, I believe (though I doubt there is any practical difference).
This is an unchanged move from #1211.
The tests in that PR show a semantic difference.
The following `NOT OK` line is not flagged if `t = t2.continue()` is used.
Oh, right, forgot about this one.