Add Java path traversal detection for archive and derived paths#5591
Add Java path traversal detection for archive and derived paths#5591FORIMOC wants to merge 1 commit intoSonarSource:masterfrom
Conversation
SummaryThis PR adds a new path traversal detection rule (S7099) for Java code that targets three common vulnerability patterns:
The implementation ( The PR includes full test coverage for 4 real CVEs (CVE-2022-4494, CVE-2022-39367, CVE-2022-31194, CVE-2022-29253) and rule documentation. What reviewers should knowStarting points for review:
Key design choices to understand:
Potential concerns to verify:
Test validation:
|
There was a problem hiding this comment.
The rule covers the right vulnerability patterns and the CVE-based test cases demonstrate the targeted scenarios well. However, there are several correctness bugs that would cause real issues in production scans: taint state is corrupted by nested method/lambda visits, sanitization calls are not recognized (any method accepting a tainted arg is itself treated as tainted), and two of the three source types have no type checking, making them prone to false positives at scale.
| } | ||
|
|
||
| private void checkMethod(MethodTree tree) { | ||
| taintedIdentifiers.clear(); |
There was a problem hiding this comment.
State corruption with nested methods and lambdas. checkMethod() clears taintedIdentifiers every time the visitor enters a METHOD node. Because the AST is traversed top-down and all subscribed nodes are dispatched via visitNode(), a nested method inside the outer method — an anonymous class method, a lambda body, or a local class — fires checkMethod() mid-traversal of the outer method, wiping the outer taint context. Everything in the outer method that appears after the nested construct is then analysed with an empty taintedIdentifiers.
Example pattern this silently misses:
void unzip(ZipEntry ze, File dir) {
String name = ze.getName(); // tainted
list.forEach(item -> {}); // anonymous method fires checkMethod() → taintedIdentifiers cleared
File f = new File(dir, name); // name is no longer considered tainted — missed!
}The conventional fix is to use a stack: push the current taint set when entering a method and pop it when leaving via leaveNode(Tree.Kind.METHOD). Subscribe to METHOD in leaveNode() as well and restore the previous frame on exit.
- Mark as noise
| if (isArchiveEntryGetName(mit) || isRequestGetParameter(mit)) { | ||
| return true; | ||
| } | ||
| return mit.arguments().stream().anyMatch(this::isTainted); |
There was a problem hiding this comment.
Taint propagates through sanitization methods, causing false positives. mit.arguments().stream().anyMatch(this::isTainted) marks the return value of any method tainted if any of its arguments is tainted. This means a legitimate sanitization or normalization call is treated as tainted output:
String safe = FilenameUtils.getName(zipEntry.getName()); // sanitizes, strips path separators
File f = new File(destDir, safe); // Noncompliant — false positiveThis heuristic will fire for any helper method, logger, or utility that happens to receive a tainted value. It should be removed. Taint should only propagate through explicit data-flow nodes (identifier reads, binary concatenation), not through opaque method calls.
- Mark as noise
| if (!(mit.methodSelect() instanceof MemberSelectExpressionTree mse)) { | ||
| return false; | ||
| } | ||
| return "getParameter".equals(mse.identifier().name()); |
There was a problem hiding this comment.
isRequestGetParameter() checks only the method name, not the receiver type. Any class with a getParameter method — a Servlet filter wrapper, a custom query object, a test double — is treated as a taint source. This produces false positives across large codebases where the name is common.
Use MethodMatchers (already used by other checks in this codebase, e.g. SQLInjectionCheck) to constrain the match to javax.servlet.ServletRequest / jakarta.servlet.ServletRequest subtypes:
private static final MethodMatchers REQUEST_GET_PARAMETER = MethodMatchers.create()
.ofSubTypes("javax.servlet.ServletRequest", "jakarta.servlet.ServletRequest")
.names("getParameter")
.withAnyParameters()
.build();- Mark as noise
|
|
||
| private static boolean isPathLikeName(String name) { | ||
| name = name.toLowerCase(); | ||
| return PATH_LIKE_PARAMETER_NAMES.stream().anyMatch(name::contains); |
There was a problem hiding this comment.
isPathLikeName() uses substring containment against a fixed word list. Several common parameter names match unintentionally:
profile→ contains"file"dirtyFlag,directoryType→ contain"dir"urlValidator,handlerUrl→ contain"url"resourceType,templateProcessor→ contain"resource"/"template"
In any real enterprise codebase, this will generate a significant volume of false positives for parameters that have nothing to do with file paths. Either switch to whole-word matching (compare after splitting on camelCase boundaries) or tighten the list to fewer, more precise tokens like "path", "filename", and "filepath".
- Mark as noise
| reportIssue(mse.identifier(), MESSAGE); | ||
| return; | ||
| } | ||
| if ("getResource".equals(methodName) && !tree.arguments().isEmpty() && isTainted(tree.arguments().get(0))) { |
There was a problem hiding this comment.
getResource is matched by method name only, with no check that the receiver is a ClassLoader. Any object — a Spring ResourceLoader, a custom class — with a getResource method and a tainted argument would be flagged. Use MethodMatchers.create().ofSubTypes("java.lang.ClassLoader").names("getResource")... to restrict the match.
- Mark as noise
| if (expr == null) { | ||
| return false; | ||
| } | ||
| if (isArchiveEntryGetName(expr) || isRequestGetParameter(expr)) { |
There was a problem hiding this comment.
Logic duplication: isArchiveEntryGetName(expr) and isRequestGetParameter(expr) are called at line 128 for any ExpressionTree, and then called again at lines 138–139 inside the expr instanceof MethodInvocationTree branch. Since a MethodInvocationTree passes through the line-128 guard first, the checks at lines 138–139 are unreachable dead code. Remove lines 138–139 and the surrounding if:
if (expr instanceof MethodInvocationTree mit) {
return mit.arguments().stream().anyMatch(this::isTainted);
}(Note: the argument-propagation line itself should also be reconsidered per the separate comment above.)
| if (isArchiveEntryGetName(expr) || isRequestGetParameter(expr)) { | |
| if (isArchiveEntryGetName(expr) || isRequestGetParameter(expr)) { | |
| return true; | |
| } |
- Mark as noise
| @@ -0,0 +1,28 @@ | |||
| { | |||
| "title": "Archive entry names should be validated before being used in file paths", | |||
There was a problem hiding this comment.
The title says "Archive entry names should be validated before being used in file paths" but the rule also covers request parameters (getParameter) and ClassLoader.getResource lookups. A reviewer scanning the rule list would not expect those patterns from this title. Align the title with the actual scope, e.g. "Attacker-controlled path components should be validated before use in file-system or resource operations".
| "title": "Archive entry names should be validated before being used in file paths", | |
| "title": "Attacker-controlled path components should be validated before use in file-system or resource operations", |
- Mark as noise
| FileOutputStream fos = new FileOutputStream(newFile); | ||
| } | ||
|
|
||
| void good(File destDir) throws Exception { |
There was a problem hiding this comment.
The good() method tests the safe case, but there is no // Compliant annotation and the test class does not call verifyNoIssues() on a separate file (or verifyIssues() with only the bad case). Looking at the test in ArchiveEntryPathTraversalCheckTest, all five test methods call verifyIssues() — which asserts that every // Noncompliant marker matches a reported issue AND that no additional issues exist in the file. So the good() path is implicitly validated (an unexpected issue there would fail the test). This is fine, but it relies on verifyIssues() semantics rather than an explicit verifyNoIssues() call. Add a // Compliant comment to make the intent self-documenting.
| void good(File destDir) throws Exception { | |
| void good(File destDir) throws Exception { | |
| String fileName = "safe.txt"; // Compliant | |
| File newFile = new File(destDir, fileName); | |
| FileOutputStream fos = new FileOutputStream(newFile); | |
| } |
- Mark as noise
Add a Java path traversal rule for archive-derived, request-derived, and resource-lookup paths
Summary
This change adds a new Java security rule to improve path traversal detection in three representative situations:
ZipEntry.getName()orJarEntry.getName()ClassLoader.getResource(...)The new rule is implemented as
ArchiveEntryPathTraversalCheckand is registered with rule keyS7099.Motivation
Several real vulnerability patterns share the same core shape: attacker-controlled path components are propagated into file-system paths or resource lookups without sufficient validation.
Representative examples include:
CVE-2022-4494CVE-2022-39367CVE-2022-31194CVE-2022-29253One representative archive-entry case looks like this:
A representative request-derived case looks like this:
A representative resource-lookup case looks like this:
What this rule detects
The rule reports when untrusted path components reach sensitive path construction or resource lookup operations.
It currently models:
ZipEntry.getName()andJarEntry.getName()as archive-derived path sourcesgetParameter(...)as a request-derived path sourcepath,file,filename,dir,template, orresourcenew File(...)mkdir(),mkdirs(),createNewFile()ClassLoader.getResource(...)Validation
The rule has been integrated into the real
sonar-javasource tree and validated with targeted tests.The following representative cases are detected by the new rule:
CVE-2022-4494CVE-2022-39367CVE-2022-31194CVE-2022-29253The rule is also included in the generated rule list after running the normal
GeneratedCheckListbuild step.Scope and design choice
This implementation is intentionally conservative. It is not a full general-purpose taint engine. Instead, it focuses on a compact set of source, propagation, and sink shapes that repeatedly appear in real Java path traversal vulnerabilities.
The goal of this change is to improve coverage for important and recurring path traversal patterns while keeping the implementation understandable and maintainable.