Skip to content

security(wave2): SSRF protection, path sanitization, and keyed hashing#239

Closed
KooshaPari wants to merge 1 commit intomainfrom
security/wave2-ssrf-path-hash
Closed

security(wave2): SSRF protection, path sanitization, and keyed hashing#239
KooshaPari wants to merge 1 commit intomainfrom
security/wave2-ssrf-path-hash

Conversation

@KooshaPari
Copy link
Copy Markdown
Owner

Summary

  • SSRF Protection: Add validateResolvedHostIPs to block private/loopback IPs in API calls
  • Path Sanitization: Add cleanTokenPath to prevent path traversal in kiro token loading
  • Keyed Hashing: Replace raw SHA256 with HMAC for sensitive ID hashing
  • URL Validation: Reject URLs with user info in validateAPICallURL and copilotQuotaURLFromTokenURL
  • Log Redaction: Redact logged request/response bodies with SHA256 hash for auditability
  • Websocket Sanitization: Sanitize session IDs and endpoints before logging

Security Alerts Addressed

  • go/request-forgery - SSRF via resolved IP validation
  • go/clear-text-logging - Sensitive data in logs
  • go/weak-sensitive-data-hashing - Predictable hashing
  • go/path-injection - Path traversal in token files

Test Results

Package Status
pkg/llmproxy/api/middleware ✅ PASS
pkg/llmproxy/registry ✅ PASS
sdk/cliproxy/auth ✅ PASS
internal/runtime/executor ✅ PASS

Pre-existing Issues (Not Introduced)

  • Executor packages have undefined: normalizeGeminiCLIModel build failure
  • Kiro auth has duplicate roundTripperFunc declaration in test files
  • Path traversal test expects 400 but gets 500 (blocked correctly, wrong HTTP status)

Related

- Add SSRF protection in api_tools.go: validateResolvedHostIPs blocks private/loopback IPs
- Add path sanitization in kiro/token.go: cleanTokenPath prevents path traversal
- Replace sha256 with HMAC for sensitive ID hashing in conductor.go, types.go, user_id_cache.go
- Reject URLs with user info in validateAPICallURL and copilotQuotaURLFromTokenURL
- Redact logged request/response bodies with SHA256 hash for auditability
- Sanitize websocket session IDs and endpoints before logging

Addresses Code Scanning alerts:
- go/request-forgery
- go/clear-text-logging
- go/weak-sensitive-data-hashing
- go/path-injection

Tests:
- pkg/llmproxy/api/middleware: pass
- pkg/llmproxy/registry: pass
- sdk/cliproxy/auth: pass
- internal/runtime/executor: pass

Pre-existing issues (not introduced by this PR):
- executor packages have undefined normalizeGeminiCLIModel build failure
- kiro auth has duplicate roundTripperFunc declaration in test files
- path traversal test expects 400 but gets 500 (blocked correctly, wrong status code)
continue
}
current = filepath.Join(current, component)
info, errStat := os.Lstat(current)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.
This path depends on a user-provided value.

Copilot Autofix

AI about 2 months ago

In general, to fix “uncontrolled data used in path expression” you must ensure that any user-derived component used in paths is either (a) reduced to a single safe file name (no separators, no ..) or (b) validated to reside within an expected base directory after normalization, and avoid following symlinks where not desired. This codebase already performs directory-level validation (ResolveSafeFilePath, normalizePathWithinBase, denySymlinkPath), and ExtractIDCIdentifier already removes obvious unsafe characters from the identifier. The missing piece, from the analyzer’s perspective, is guaranteeing that the generated file name cannot contain path separators or traversal segments, regardless of how Email or StartURL are manipulated.

The best targeted fix is to (1) tighten GenerateTokenFileName so that it produces a conservative, strictly safe file name: explicitly strip or replace all characters that might introduce path separators or traversal (including .., additional punctuation, or non-alphanumerics) and, optionally, perform a final check that the result contains no /, \, or ..; and (2) add a small explicit validation of the fileName before it is combined with authDir in saveTokenToFile. This preserves existing behavior (files still named based on authMethod, email, or IDC identifier), but removes any theoretical avenue for an unsafe file name to be generated from tainted input and ensures that the path components used in denySymlinkPath are always benign file/directory names.

Concretely:

  • In pkg/llmproxy/auth/kiro/aws.go:
    • Extend GenerateTokenFileName so that:
      • The sanitized email or IDC identifier is further constrained to a safe character set (e.g., [a-zA-Z0-9_-]), replacing any other character with _.
      • After constructing the final file name (kiro-...json), verify that it contains no /, \, or ..; if it would, fall back to a conservative default like kiro-<authMethod>.json.
    • This does not change visible functionality in normal cases (valid emails and start URLs already only generate alphanumerics, dashes, and dots), but guarantees safe file names for any weird or unexpected input.
  • In pkg/llmproxy/auth/kiro/oauth_web.go:
    • After fileName := GenerateTokenFileName(tokenData), add a minimal defensive check ensuring the resulting fileName is a single component (no separators and no ..). If invalid, log an error and either abort saving or fall back to a generic file name (e.g., kiro-<authMethod>.json). This extra guard both protects authDir and makes the data flow obviously safe to the analyzer.

No changes are required to token.go or misc/path_security.go for correctness, as those already provide strong normalization and containment checking.


Suggested changeset 2
pkg/llmproxy/auth/kiro/oauth_web.go
Outside changed files

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/pkg/llmproxy/auth/kiro/oauth_web.go b/pkg/llmproxy/auth/kiro/oauth_web.go
--- a/pkg/llmproxy/auth/kiro/oauth_web.go
+++ b/pkg/llmproxy/auth/kiro/oauth_web.go
@@ -383,6 +383,13 @@
 	// Generate filename using the unified function
 	fileName := GenerateTokenFileName(tokenData)
 
+	// Ensure the generated file name is a single path component
+	if strings.ContainsAny(fileName, `/\`) || strings.Contains(fileName, "..") || strings.TrimSpace(fileName) == "" {
+		log.Errorf("OAuth Web: generated unsafe token file name %q, falling back to default", fileName)
+		fallback := "kiro-unknown.json"
+		fileName = fallback
+	}
+
 	authFilePath := filepath.Join(authDir, fileName)
 
 	// Convert to storage format and save
EOF
@@ -383,6 +383,13 @@
// Generate filename using the unified function
fileName := GenerateTokenFileName(tokenData)

// Ensure the generated file name is a single path component
if strings.ContainsAny(fileName, `/\`) || strings.Contains(fileName, "..") || strings.TrimSpace(fileName) == "" {
log.Errorf("OAuth Web: generated unsafe token file name %q, falling back to default", fileName)
fallback := "kiro-unknown.json"
fileName = fallback
}

authFilePath := filepath.Join(authDir, fileName)

// Convert to storage format and save
pkg/llmproxy/auth/kiro/aws.go
Outside changed files

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/pkg/llmproxy/auth/kiro/aws.go b/pkg/llmproxy/auth/kiro/aws.go
--- a/pkg/llmproxy/auth/kiro/aws.go
+++ b/pkg/llmproxy/auth/kiro/aws.go
@@ -576,22 +576,68 @@
 		authMethod = "unknown"
 	}
 
+	// helper to normalize identifier into a safe filename component
+	sanitizeComponent := func(s string) string {
+		// replace some obvious separators/punctuation first
+		replacer := strings.NewReplacer(
+			"@", "-",
+			".", "-",
+			"/", "_",
+			"\\", "_",
+			":", "_",
+		)
+		s = replacer.Replace(s)
+
+		// allow only alphanumerics, dash and underscore in the final component
+		var b strings.Builder
+		for _, r := range s {
+			if (r >= 'a' && r <= 'z') ||
+				(r >= 'A' && r <= 'Z') ||
+				(r >= '0' && r <= '9') ||
+				r == '-' || r == '_' {
+				b.WriteRune(r)
+			} else {
+				b.WriteRune('_')
+			}
+		}
+		component := b.String()
+		// avoid empty or traversal-like components
+		if component == "" || component == "." || component == ".." {
+			return "unknown"
+		}
+		return component
+	}
+
+	makeFileName := func(id string) string {
+		if id != "" {
+			id = sanitizeComponent(id)
+			name := fmt.Sprintf("kiro-%s-%s.json", authMethod, id)
+			// final safety check: ensure single path component
+			if strings.ContainsAny(name, `/\`) || strings.Contains(name, "..") {
+				return fmt.Sprintf("kiro-%s.json", authMethod)
+			}
+			return name
+		}
+		name := fmt.Sprintf("kiro-%s.json", authMethod)
+		if strings.ContainsAny(name, `/\`) || strings.Contains(name, "..") {
+			return "kiro-unknown.json"
+		}
+		return name
+	}
+
 	// Priority 1: Use email if available (email is unique)
 	if tokenData.Email != "" {
-		sanitizedEmail := tokenData.Email
-		sanitizedEmail = strings.ReplaceAll(sanitizedEmail, "@", "-")
-		sanitizedEmail = strings.ReplaceAll(sanitizedEmail, ".", "-")
-		return fmt.Sprintf("kiro-%s-%s.json", authMethod, sanitizedEmail)
+		return makeFileName(tokenData.Email)
 	}
 
 	// Priority 2: For IDC only, use startUrl identifier when available
 	if authMethod == "idc" && tokenData.StartURL != "" {
 		identifier := ExtractIDCIdentifier(tokenData.StartURL)
 		if identifier != "" {
-			return fmt.Sprintf("kiro-%s-%s.json", authMethod, identifier)
+			return makeFileName(identifier)
 		}
 	}
 
 	// Priority 3: Fallback to authMethod only
-	return fmt.Sprintf("kiro-%s.json", authMethod)
+	return makeFileName("")
 }
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
func logCodexWebsocketDisconnected(sessionID, authID, wsURL, reason string, err error) {
if err != nil {
log.Infof("codex websockets: upstream disconnected session=%s auth=%s url=%s reason=%s err=%v", strings.TrimSpace(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogURL(wsURL), strings.TrimSpace(reason), err)
log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s err=%v", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason), err)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

Sensitive data returned by an access to apiKeyModelAlias
flows to a logging call.

Copilot Autofix

AI about 2 months ago

General approach: never log potentially sensitive, tainted data directly. In this case, that means avoiding logging the full error value from thinking.ApplyThinking in websocket disconnect logs, or at least ensuring the logged form of that error does not expose sensitive fields (like model IDs derived from API-key model aliases). Since we’re constrained to the provided snippets, the safest and least invasive fix is to sanitize the error before passing it to logCodexWebsocketDisconnected, so logs show only non-sensitive information.

Best targeted fix:

  • In pkg/llmproxy/executor/codex_websockets_executor.go, inside the Execute method’s defer that logs websocket disconnects, introduce a local safeErr that either:
    • preserves non-sensitive error information, or
    • replaces the original error with a generic error wrapper that does not embed sensitive data.
  • Then pass safeErr into logCodexWebsocketDisconnected instead of the raw err.
  • This change is localized to logging and does not alter functional behavior of the executor or error propagation; callers still see the original err as the function’s return value.
  • We can construct a generic fmt.Errorf("execution failed: %T", err) (or even just errors.New("execution failed")) to retain at least the error type in logs while not printing the message that might contain sensitive content.

Concretely:

  • Edit the defer block at lines 291–296 (within the Execute function) in pkg/llmproxy/executor/codex_websockets_executor.go.
  • Replace the direct call logCodexWebsocketDisconnected(..., reason, err) with logic that computes safeErr and uses that instead.

No changes are needed in the thinking or registry packages for this specific logging sink.


Suggested changeset 1
pkg/llmproxy/executor/codex_websockets_executor.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/pkg/llmproxy/executor/codex_websockets_executor.go b/pkg/llmproxy/executor/codex_websockets_executor.go
--- a/pkg/llmproxy/executor/codex_websockets_executor.go
+++ b/pkg/llmproxy/executor/codex_websockets_executor.go
@@ -293,7 +293,12 @@
 			if err != nil {
 				reason = "error"
 			}
-			logCodexWebsocketDisconnected(executionSessionID, authID, wsURL, reason, err)
+			var safeErr error
+			if err != nil {
+				// Avoid logging potentially sensitive details from upstream errors.
+				safeErr = fmt.Errorf("execution failed: %T", err)
+			}
+			logCodexWebsocketDisconnected(executionSessionID, authID, wsURL, reason, safeErr)
 			if errClose := conn.Close(); errClose != nil {
 				log.Errorf("codex websockets executor: close websocket error: %v", errClose)
 			}
EOF
@@ -293,7 +293,12 @@
if err != nil {
reason = "error"
}
logCodexWebsocketDisconnected(executionSessionID, authID, wsURL, reason, err)
var safeErr error
if err != nil {
// Avoid logging potentially sensitive details from upstream errors.
safeErr = fmt.Errorf("execution failed: %T", err)
}
logCodexWebsocketDisconnected(executionSessionID, authID, wsURL, reason, safeErr)
if errClose := conn.Close(); errClose != nil {
log.Errorf("codex websockets executor: close websocket error: %v", errClose)
}
Copilot is powered by AI and may make mistakes. Always verify output.
if trimmed == "" {
return ""
}
sum := sha256.Sum256([]byte(trimmed))

Check failure

Code scanning / CodeQL

Use of a broken or weak cryptographic hashing algorithm on sensitive data High

Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.

Copilot Autofix

AI about 2 months ago

General approach: For password hashing, you would replace SHA-256 with a password hashing algorithm (Argon2, scrypt, bcrypt, PBKDF2). Here, however, the code is not performing password hashing; it’s deriving an opaque session label for logs. To both (a) avoid misuse of SHA-256 in a context tainted as “password-like” and (b) avoid unnecessary cryptographic operations, we can replace the SHA-256 call with a fast, non-cryptographic hash (for example hash/fnv) and truncate its output as before. This preserves functionality (stable, short pseudonymous ID) while removing any appearance of password hashing.

Concretely, in pkg/llmproxy/executor/codex_websockets_executor.go:

  • Remove the import of crypto/sha256 (no longer needed in this file).
  • Add an import of hash/fnv.
  • In sanitizeCodexSessionID, replace:
    sum := sha256.Sum256([]byte(trimmed))
    return fmt.Sprintf("sess_%x", sum[:6])
    with code that:
    • Creates an FNV-1a 64-bit hasher.
    • Writes the trimmed string bytes into the hasher.
    • Gets the 64-bit hash and formats it as hex, truncating to the same number of hex characters as before (12 hex characters for 6 bytes).

No changes are required in sdk/cliproxy/auth/conductor.go, because it only participates in the taint flow; the hashing actually occurs in sanitizeCodexSessionID.


Suggested changeset 1
pkg/llmproxy/executor/codex_websockets_executor.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/pkg/llmproxy/executor/codex_websockets_executor.go b/pkg/llmproxy/executor/codex_websockets_executor.go
--- a/pkg/llmproxy/executor/codex_websockets_executor.go
+++ b/pkg/llmproxy/executor/codex_websockets_executor.go
@@ -5,8 +5,8 @@
 import (
 	"bytes"
 	"context"
-	"crypto/sha256"
 	"fmt"
+	"hash/fnv"
 	"io"
 	"net"
 	"net/http"
@@ -1343,8 +1342,11 @@
 	if trimmed == "" {
 		return ""
 	}
-	sum := sha256.Sum256([]byte(trimmed))
-	return fmt.Sprintf("sess_%x", sum[:6])
+	h := fnv.New64a()
+	_, _ = h.Write([]byte(trimmed))
+	// Keep the identifier short and opaque, similar to the previous 6-byte SHA-256 prefix.
+	hashVal := h.Sum64()
+	return fmt.Sprintf("sess_%012x", hashVal&0xfffffffffff)
 }
 
 // CodexAutoExecutor routes Codex requests to the websocket transport only when:
EOF
@@ -5,8 +5,8 @@
import (
"bytes"
"context"
"crypto/sha256"
"fmt"
"hash/fnv"
"io"
"net"
"net/http"
@@ -1343,8 +1342,11 @@
if trimmed == "" {
return ""
}
sum := sha256.Sum256([]byte(trimmed))
return fmt.Sprintf("sess_%x", sum[:6])
h := fnv.New64a()
_, _ = h.Write([]byte(trimmed))
// Keep the identifier short and opaque, similar to the previous 6-byte SHA-256 prefix.
hashVal := h.Sum64()
return fmt.Sprintf("sess_%012x", hashVal&0xfffffffffff)
}

// CodexAutoExecutor routes Codex requests to the websocket transport only when:
Copilot is powered by AI and may make mistakes. Always verify output.
if trimmed == "" {
return ""
}
sum := sha256.Sum256([]byte(trimmed))

Check failure

Code scanning / CodeQL

Use of a broken or weak cryptographic hashing algorithm on sensitive data High

Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.

Copilot Autofix

AI about 2 months ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

func logCodexWebsocketDisconnected(sessionID string, authID string, wsURL string, reason string, err error) {
if err != nil {
log.Infof("codex websockets: upstream disconnected session=%s auth=%s url=%s reason=%s err=%v", strings.TrimSpace(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogURL(wsURL), strings.TrimSpace(reason), err)
log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s err=%v", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason), err)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

Sensitive data returned by an access to apiKeyModelAlias
flows to a logging call.

Copilot Autofix

AI about 2 months ago

General fix strategy: avoid logging raw error messages that may contain sensitive or tenant-specific data, especially at info level. Instead, either (a) do not log the error at all in that context, or (b) log only non-sensitive metadata (e.g., that an error occurred) and, if needed, a sanitized version of the error that strips or masks any potentially sensitive substrings (e.g., API keys).

Best concrete fix here: adjust logCodexWebsocketDisconnected in pkg/llmproxy/runtime/executor/codex_websockets_executor.go so that it no longer interpolates the raw err value directly into the info-level log line. We can log the disconnection with session ID, auth, endpoint, and reason as before, and add a minimal “err=failed” indicator or a separate debug-level log, but we should not include the full err string. Because this helper is only used in this file and we have no global sanitization for arbitrary errors, the simplest safe change with minimal behavior impact is:

  • Keep the function signature the same so callers don’t change.
  • In the if err != nil branch, remove %v and err from the log format and arguments.
  • Optionally, include a boolean error=true style marker in the message if retaining error presence is important, but avoid outputting the content.

This fix touches only pkg/llmproxy/runtime/executor/codex_websockets_executor.go lines 1302–1307; no new imports or helpers are necessary.


Suggested changeset 1
pkg/llmproxy/runtime/executor/codex_websockets_executor.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/pkg/llmproxy/runtime/executor/codex_websockets_executor.go b/pkg/llmproxy/runtime/executor/codex_websockets_executor.go
--- a/pkg/llmproxy/runtime/executor/codex_websockets_executor.go
+++ b/pkg/llmproxy/runtime/executor/codex_websockets_executor.go
@@ -1301,7 +1301,7 @@
 
 func logCodexWebsocketDisconnected(sessionID string, authID string, wsURL string, reason string, err error) {
 	if err != nil {
-		log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s err=%v", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason), err)
+		log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s err=true", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason))
 		return
 	}
 	log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason))
EOF
@@ -1301,7 +1301,7 @@

func logCodexWebsocketDisconnected(sessionID string, authID string, wsURL string, reason string, err error) {
if err != nil {
log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s err=%v", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason), err)
log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s err=true", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason))
return
}
log.Infof("codex websockets: upstream disconnected session=%s auth=%s endpoint=%s reason=%s", sanitizeCodexSessionID(sessionID), sanitizeCodexWebsocketLogField(authID), sanitizeCodexWebsocketLogEndpoint(wsURL), strings.TrimSpace(reason))
Copilot is powered by AI and may make mistakes. Always verify output.
if trimmed == "" {
return ""
}
sum := sha256.Sum256([]byte(trimmed))

Check failure

Code scanning / CodeQL

Use of a broken or weak cryptographic hashing algorithm on sensitive data High

Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.
Sensitive data (password)
is used in a hashing algorithm (SHA256) that is insecure for password hashing, since it is not a computationally expensive hash function.

Copilot Autofix

AI about 2 months ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

@KooshaPari
Copy link
Copy Markdown
Owner Author

Superseded by rebased PR #240

@KooshaPari KooshaPari closed this Feb 23, 2026
@KooshaPari KooshaPari deleted the security/wave2-ssrf-path-hash branch February 26, 2026 11:56
KooshaPari pushed a commit that referenced this pull request Feb 27, 2026
feat(registry): add GPT-5.3 Codex to GitHub Copilot provider
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants