Add Java path traversal detection for archive and derived paths #5591

Open
FORIMOC wants to merge 1 commit into SonarSource:master from FORIMOC:feat/java-path-traversal-rule

@FORIMOC FORIMOC commented Apr 30, 2026

Add a Java path traversal rule for archive-derived, request-derived, and resource-lookup paths

Summary

This change adds a new Java security rule to improve path traversal detection in three representative situations:

  • archive-entry-derived paths, such as ZipEntry.getName() or JarEntry.getName()
  • request-derived paths that are propagated through one path-construction step
  • template or resource names that later reach ClassLoader.getResource(...)

The new rule is implemented as ArchiveEntryPathTraversalCheck and is registered with rule key S7099.

Motivation

Several real vulnerability patterns share the same core shape: attacker-controlled path components are propagated into file-system paths or resource lookups without sufficient validation.

Representative examples include:

  • CVE-2022-4494
  • CVE-2022-39367
  • CVE-2022-31194
  • CVE-2022-29253

One representative archive-entry case looks like this:

while ((zipEntry = zipInputStream.getNextEntry()) != null) {
    final File destFile = new File(importSandboxDirectory, zipEntry.getName());
    ServiceUtilities.ensureFileCreated(destFile);
    final FileOutputStream destOutputStream = new FileOutputStream(destFile);
}

A representative request-derived case looks like this:

String resumableIdentifier = request.getParameter("resumableIdentifier");
tempDir = tempDir + File.separator + resumableIdentifier;
File fileDir = new File(tempDir);
fileDir.mkdir();

A representative resource-lookup case looks like this:

String templatePath = suffixPath + templateName;
return classloader.getResource(templatePath);
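
For contrast with the vulnerable snippets above, here is a minimal sketch of the kind of validation they lack. It is not taken from the PR: the class name SafePaths and the method resolveInside are hypothetical, and the sketch assumes that a lexical check (normalize, then confirm the result is still under the base directory) is sufficient for the caller's needs.

```java
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of the PR or of sonar-java.
final class SafePaths {
    private SafePaths() {}

    /**
     * Resolves an untrusted relative name against baseDir and rejects any
     * result that lies outside baseDir once "." and ".." are normalized away.
     */
    static Path resolveInside(Path baseDir, String untrustedName) throws IOException {
        Path base = baseDir.toAbsolutePath().normalize();
        // resolve() returns the argument itself when it is absolute, so
        // absolute names are also caught by the startsWith() check below.
        Path candidate = base.resolve(untrustedName).normalize();
        if (!candidate.startsWith(base)) {
            throw new IOException("Path escapes target directory: " + untrustedName);
        }
        return candidate;
    }
}
```

In the archive case, the call would replace new File(importSandboxDirectory, zipEntry.getName()) with something like SafePaths.resolveInside(importSandboxDirectory.toPath(), zipEntry.getName()).toFile(). The check is purely lexical, so it does not account for symbolic links inside the base directory.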

What this rule detects

The rule reports when untrusted path components reach sensitive path construction or resource lookup operations.

It currently models:

  • ZipEntry.getName() and JarEntry.getName() as archive-derived path sources
  • getParameter(...) as a request-derived path source
  • conservative path-like method parameters such as path, file, filename, dir, template, or resource
  • propagation through variable assignment and string concatenation
  • the following sinks:
    • new File(...)
    • mkdir(), mkdirs(), createNewFile()
    • ClassLoader.getResource(...)
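
The source, propagation, and sink model listed above can be illustrated with a toy version that runs without sonar-java. This is only a sketch: every type and method name in it (ToyTaint, Var, Call, Concat, assign, reachesSink) is hypothetical, the real check works on sonar-java's syntax tree, and clearing taint on reassignment to a clean value is a choice made for this sketch rather than something the PR states.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model for illustration only; none of these names exist in the PR.
final class ToyTaint {
    interface Expr {}

    /** A variable read, e.g. tempDir. */
    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    /** A string literal, which is never tainted. */
    static final class Literal implements Expr {
        final String value;
        Literal(String value) { this.value = value; }
    }

    /** A method call identified by name only, e.g. getParameter(...). */
    static final class Call implements Expr {
        final String method;
        Call(String method) { this.method = method; }
    }

    /** String concatenation: left + right. */
    static final class Concat implements Expr {
        final Expr left;
        final Expr right;
        Concat(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Sources named in the PR description.
    private static final Set<String> SOURCES = Set.of("getName", "getParameter");
    private final Set<String> taintedVars = new HashSet<>();

    boolean isTainted(Expr e) {
        if (e instanceof Call) {
            return SOURCES.contains(((Call) e).method);
        }
        if (e instanceof Var) {
            return taintedVars.contains(((Var) e).name);
        }
        if (e instanceof Concat) {
            Concat c = (Concat) e;
            return isTainted(c.left) || isTainted(c.right);
        }
        return false; // literals and anything unknown are treated as clean
    }

    /** Models the statement: name = value; */
    void assign(String name, Expr value) {
        if (isTainted(value)) {
            taintedVars.add(name);
        } else {
            taintedVars.remove(name);
        }
    }

    /** Models a sink such as new File(arg); true means an issue would be reported. */
    boolean reachesSink(Expr arg) {
        return isTainted(arg);
    }
}
```

Replaying the request-derived example from the Motivation section, assigning Call("getParameter") to resumableIdentifier and then Concat(Var("tempDir"), Var("resumableIdentifier")) to tempDir leaves tempDir tainted, so reachesSink(new Var("tempDir")) returns true.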

Validation

The rule has been integrated into the sonar-java source tree and validated with targeted tests.

The following representative cases are detected by the new rule:

  • CVE-2022-4494
  • CVE-2022-39367
  • CVE-2022-31194
  • CVE-2022-29253

The rule is also included in the generated rule list after running the normal GeneratedCheckList build step.

Scope and design choice

This implementation is intentionally conservative. It is not a full general-purpose taint engine. Instead, it focuses on a compact set of source, propagation, and sink shapes that repeatedly appear in real Java path traversal vulnerabilities.

The goal of this change is to improve coverage for important and recurring path traversal patterns while keeping the implementation understandable and maintainable.

sonar-review-alpha Bot commented Apr 30, 2026

Summary

This PR adds a new path traversal detection rule (S7099) for Java code that targets three common vulnerability patterns:

  1. Archive extraction: untrusted paths from ZipEntry.getName() / JarEntry.getName()
  2. Request parameters: paths derived from request.getParameter(...)
  3. Resource lookups: template/resource names passed to ClassLoader.getResource(...)

The implementation (ArchiveEntryPathTraversalCheck) uses conservative taint tracking to detect when untrusted paths reach sensitive sinks like new File(), mkdir(), and getResource(). It propagates taint through variable assignments and string concatenation, and identifies suspicious path sources through explicit method names and parameter naming conventions.

The PR includes test samples for four real CVEs (CVE-2022-4494, CVE-2022-39367, CVE-2022-31194, CVE-2022-29253) and rule documentation.

What reviewers should know

Starting points for review:

  • Core logic: ArchiveEntryPathTraversalCheck.java:isTainted() and the source detection methods (isArchiveEntryGetName, isRequestGetParameter)—these define what counts as "untrusted"
  • Sink detection: The checkNewClass() and checkMethodInvocation() methods catch vulnerable usage patterns

Key design choices to understand:

  • Source detection for request parameters relies only on the method name getParameter, without type-checking the receiver. This is intentionally loose so that custom Request types are covered, but it may produce false positives
  • Path-like parameter identification uses substring matching (contains "path", "file", "dir", etc. after lowercasing): simple, but it catches common patterns
  • Taint propagation is method-level and is cleared between methods (setContext()), so there is no cross-method tracking
  • Binary expression handling treats any concatenation with tainted data as tainted, which is a conservative approach

Potential concerns to verify:

  • Whether the parameter name matching (isPathLikeName) is too broad and causes false positives (e.g., "templates" contains "template")
  • Whether ignoring type information on getParameter is acceptable for coverage vs. false positive trade-off
  • Whether taint propagation through nested method calls is sufficient (the code checks arguments recursively, but doesn't track return values)
  • Whether FileOutputStream/FileInputStream constructors that accept File objects (not strings) should be additional sinks

Test validation:

  • All CVE test samples correctly mark the vulnerable line with // Noncompliant
  • Generic sample shows both a vulnerable pattern and a safe alternative

@sonar-review-alpha sonar-review-alpha Bot left a comment

The rule covers the right vulnerability patterns, and the CVE-based test cases demonstrate the targeted scenarios well. However, there are several correctness bugs that would cause real issues in production scans: taint state is corrupted by nested method visits, sanitization calls are not recognized (the return value of any method that receives a tainted argument is itself treated as tainted), and two of the three source types have no type checking, making them prone to false positives at scale.

}

private void checkMethod(MethodTree tree) {
  taintedIdentifiers.clear();

State corruption with nested methods. checkMethod() clears taintedIdentifiers every time the visitor enters a METHOD node. Because the AST is traversed top-down and all subscribed nodes are dispatched via visitNode(), a method declared inside the outer method body (in an anonymous class or a local class) fires checkMethod() mid-traversal of the outer method, wiping the outer taint context. Everything in the outer method that appears after the nested construct is then analysed with an empty taintedIdentifiers. (A lambda body is a LAMBDA_EXPRESSION node, not a METHOD node, so it does not trigger the reset by itself.)

Example pattern this silently misses:

void unzip(ZipEntry ze, File dir) {
    String name = ze.getName(); // tainted
    Runnable r = new Runnable() {
        @Override
        public void run() {}      // nested METHOD node fires checkMethod() → taintedIdentifiers cleared
    };
    File f = new File(dir, name); // name is no longer considered tainted, so the issue is missed
}

The conventional fix is to use a stack: push a fresh taint set when entering a METHOD node in visitNode(), and pop it in leaveNode() when the node being left is a METHOD, which restores the outer method's frame. leaveNode() is invoked for the same node kinds that nodesToVisit() returns, so no additional subscription is needed.
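
The stack-based fix described above can be sketched without the sonar-java API. The class TaintFrames and its methods are hypothetical names introduced for illustration; in the actual check, enterMethod() and leaveMethod() would be called from visitNode() and leaveNode() for METHOD nodes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of the PR.
final class TaintFrames {
    // One taint set per method currently being visited, innermost on top.
    private final Deque<Set<String>> frames = new ArrayDeque<>();

    /** Called when the visitor enters a method: start with a clean frame. */
    void enterMethod() {
        frames.push(new HashSet<>());
    }

    /** Called when the visitor leaves a method: the outer frame becomes current again. */
    void leaveMethod() {
        frames.pop();
    }

    void markTainted(String identifier) {
        Set<String> current = frames.peek();
        if (current != null) {
            current.add(identifier);
        }
    }

    boolean isTainted(String identifier) {
        Set<String> current = frames.peek();
        return current != null && current.contains(identifier);
    }
}
```

With this shape, a nested method starts from an empty frame and the outer method's taint set is intact once the nested method has been left. Whether variables captured from the outer method should also count as tainted inside the nested body is a separate design decision that this sketch does not address.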

if (isArchiveEntryGetName(mit) || isRequestGetParameter(mit)) {
  return true;
}
return mit.arguments().stream().anyMatch(this::isTainted);

Taint propagates through sanitization methods, causing false positives. mit.arguments().stream().anyMatch(this::isTainted) marks the return value of any method tainted if any of its arguments is tainted. This means a legitimate sanitization or normalization call is treated as tainted output:

String safe = FilenameUtils.getName(zipEntry.getName()); // sanitizes, strips path separators
File f = new File(destDir, safe); // Noncompliant — false positive

This heuristic will fire for any helper method, logger, or utility that happens to receive a tainted value. It should be removed. Taint should only propagate through explicit data-flow nodes (identifier reads, binary concatenation), not through opaque method calls.

if (!(mit.methodSelect() instanceof MemberSelectExpressionTree mse)) {
  return false;
}
return "getParameter".equals(mse.identifier().name());

isRequestGetParameter() checks only the method name, not the receiver type. Any class with a getParameter method — a Servlet filter wrapper, a custom query object, a test double — is treated as a taint source. This produces false positives across large codebases where the name is common.

Use MethodMatchers (already used by other checks in this codebase, e.g. SQLInjectionCheck) to constrain the match to javax.servlet.ServletRequest / jakarta.servlet.ServletRequest subtypes:

private static final MethodMatchers REQUEST_GET_PARAMETER = MethodMatchers.create()
  .ofSubTypes("javax.servlet.ServletRequest", "jakarta.servlet.ServletRequest")
  .names("getParameter")
  .withAnyParameters()
  .build();

private static boolean isPathLikeName(String name) {
  name = name.toLowerCase();
  return PATH_LIKE_PARAMETER_NAMES.stream().anyMatch(name::contains);

isPathLikeName() uses substring containment against a fixed word list. Several common parameter names match unintentionally:

  • profile → contains "file"
  • dirtyFlag, directoryType → contain "dir"
  • urlValidator, handlerUrl → contain "url"
  • resourceType, templateProcessor → contain "resource" / "template"

In any real enterprise codebase, this will generate a significant volume of false positives for parameters that have nothing to do with file paths. Either switch to whole-word matching (compare after splitting on camelCase boundaries) or tighten the list to fewer, more precise tokens like "path", "filename", and "filepath".
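
A minimal sketch of the whole-word alternative follows. The class name PathLikeNames, the word list (loosely taken from the PR description), and the word-splitting regular expression are all assumptions made for illustration, not code from the PR.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical replacement for the substring-based matcher; names are assumptions.
final class PathLikeNames {
    // Word list loosely based on the names mentioned in the PR description.
    private static final Set<String> PATH_WORDS =
        Set.of("path", "file", "filename", "dir", "template", "resource");

    // Splits on camelCase boundaries (fooBar, URLPath) and on underscores.
    private static final Pattern WORD_BOUNDARY =
        Pattern.compile("(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|_");

    private PathLikeNames() {}

    /** Returns true only when one whole word of the name is in the list. */
    static boolean isPathLikeName(String name) {
        return WORD_BOUNDARY.splitAsStream(name)
            .map(word -> word.toLowerCase(Locale.ROOT))
            .anyMatch(PATH_WORDS::contains);
    }
}
```

With this version, profile, dirtyFlag, directoryType, and templates no longer match, while filePath and destDir still do. Names such as resourceType and templateProcessor still match because "resource" and "template" are whole words in them, so tightening the word list remains a complementary step.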

  reportIssue(mse.identifier(), MESSAGE);
  return;
}
if ("getResource".equals(methodName) && !tree.arguments().isEmpty() && isTainted(tree.arguments().get(0))) {

getResource is matched by method name only, with no check that the receiver is a ClassLoader. Any object — a Spring ResourceLoader, a custom class — with a getResource method and a tainted argument would be flagged. Use MethodMatchers.create().ofSubTypes("java.lang.ClassLoader").names("getResource")... to restrict the match.

if (expr == null) {
  return false;
}
if (isArchiveEntryGetName(expr) || isRequestGetParameter(expr)) {

Logic duplication: isArchiveEntryGetName(expr) and isRequestGetParameter(expr) are called at line 128 for any ExpressionTree, and then called again at lines 138–139 inside the expr instanceof MethodInvocationTree branch. Since a MethodInvocationTree passes through the line-128 guard first, the checks at lines 138–139 are unreachable dead code. Remove lines 138–139 and the surrounding if:

if (expr instanceof MethodInvocationTree mit) {
  return mit.arguments().stream().anyMatch(this::isTainted);
}

(Note: the argument-propagation line itself should also be reconsidered per the separate comment above.)

Suggested change
- if (isArchiveEntryGetName(expr) || isRequestGetParameter(expr)) {
+ if (isArchiveEntryGetName(expr) || isRequestGetParameter(expr)) {
+   return true;
+ }

@@ -0,0 +1,28 @@
{
  "title": "Archive entry names should be validated before being used in file paths",

The title says "Archive entry names should be validated before being used in file paths" but the rule also covers request parameters (getParameter) and ClassLoader.getResource lookups. A reviewer scanning the rule list would not expect those patterns from this title. Align the title with the actual scope, e.g. "Attacker-controlled path components should be validated before use in file-system or resource operations".

Suggested change
- "title": "Archive entry names should be validated before being used in file paths",
+ "title": "Attacker-controlled path components should be validated before use in file-system or resource operations",

    FileOutputStream fos = new FileOutputStream(newFile);
  }

  void good(File destDir) throws Exception {

The good() method exercises the safe case, but it carries no // Compliant annotation, and the test class never calls verifyNoIssues() on a separate file. In ArchiveEntryPathTraversalCheckTest, all five test methods call verifyIssues(), which asserts both that every // Noncompliant marker matches a reported issue and that no additional issue is raised in the file. The good() path is therefore validated implicitly: an unexpected issue there would fail the test. That is acceptable, but it relies on verifyIssues() semantics rather than on an explicit statement of intent. Add a // Compliant comment to make the intent self-documenting.

Suggested change
- void good(File destDir) throws Exception {
+ void good(File destDir) throws Exception {
+   String fileName = "safe.txt"; // Compliant
+   File newFile = new File(destDir, fileName);
+   FileOutputStream fos = new FileOutputStream(newFile);
+ }