This follows on from #34 and #45, which document the path traversal vulnerability across case and download routes. Both issues agree on what needs to happen — validate that any user-supplied path stays inside DATA_STORAGE before using it. This issue proposes a concrete implementation that fixes the vulnerability and is also recognised by static analysis tools like CodeQL.
The problem with validation-only approaches
Adding a validation check like validate_path() before a Path() call protects the app at runtime, but static analysis tools (CodeQL, Bandit) still flag the code. They track user-supplied data from source to sink — if request.args.get('case') reaches a Path() or send_file() call, it's flagged, regardless of what runs between them. The tool can't know your custom function cleaned the input.
The fix: case_path() using werkzeug.utils.safe_join
werkzeug is already a Flask dependency. CodeQL explicitly models werkzeug.utils.safe_join as a path sanitizer — its return value is considered clean. Combined with a containment check for symlink safety, this gives both runtime protection and static analysis compliance.
Add to Config.py:
from werkzeug.utils import safe_join
def case_path(*components):
"""Build a validated path inside DATA_STORAGE.
Uses werkzeug.utils.safe_join (recognised by CodeQL as a sanitizer)
plus a realpath containment check to catch symlinks. Raises
PermissionError if any component escapes DATA_STORAGE.
"""
safe = safe_join(str(DATA_STORAGE), *[str(c) or '' for c in components])
if safe is None:
raise PermissionError("Path Traversal Attempt Detected")
validate_path(DATA_STORAGE, safe) # catches symlinks safe_join misses
return Path(safe)
Then replace path constructions throughout the codebase:
# Before
case = request.json['casename']
casePath = Path(Config.DATA_STORAGE, case)
# After
casePath = Config.case_path(case)
# Before
dataFile = Path(Config.DATA_STORAGE, case, 'res', 'csv', file)
# After
dataFile = Config.case_path(case, 'res', 'csv', file)
What this achieves
- Path traversal is blocked at construction time — you can't get a
Path object without passing validation
- CodeQL stops flagging these locations because
safe_join's return value is a recognised clean source
- The pattern is impossible to misuse — there's no separate validate + build step to forget
None components are handled safely (converted to empty strings)
Scope
Main files to update: DataFileClass.py, DataFileRoute.py, CaseRoute.py, UploadRoute.py, OsemosysClass.py. The function itself is a small addition to Config.py.
This follows on from #34 and #45, which document the path traversal vulnerability across case and download routes. Both issues agree on what needs to happen — validate that any user-supplied path stays inside
DATA_STORAGEbefore using it. This issue proposes a concrete implementation that fixes the vulnerability and is also recognised by static analysis tools like CodeQL.The problem with validation-only approaches
Adding a validation check like
validate_path()before aPath()call protects the app at runtime, but static analysis tools (CodeQL, Bandit) still flag the code. They track user-supplied data from source to sink — ifrequest.args.get('case')reaches aPath()orsend_file()call, it's flagged, regardless of what runs between them. The tool can't know your custom function cleaned the input.The fix:
case_path()usingwerkzeug.utils.safe_joinwerkzeugis already a Flask dependency. CodeQL explicitly modelswerkzeug.utils.safe_joinas a path sanitizer — its return value is considered clean. Combined with a containment check for symlink safety, this gives both runtime protection and static analysis compliance.Add to
Config.py:Then replace path constructions throughout the codebase:
What this achieves
Pathobject without passing validationsafe_join's return value is a recognised clean sourceNonecomponents are handled safely (converted to empty strings)Scope
Main files to update:
DataFileClass.py,DataFileRoute.py,CaseRoute.py,UploadRoute.py,OsemosysClass.py. The function itself is a small addition toConfig.py.