Skip to content

fix: validate redirects in favicon fetching to prevent SSRF#8858

Merged
sriramveeraghanta merged 2 commits intopreviewfrom
fix/ssrf-favicon-redirect-validation
Apr 6, 2026
Merged

fix: validate redirects in favicon fetching to prevent SSRF#8858
sriramveeraghanta merged 2 commits intopreviewfrom
fix/ssrf-favicon-redirect-validation

Conversation

@sriramveeraghanta
Copy link
Copy Markdown
Member

@sriramveeraghanta sriramveeraghanta commented Apr 6, 2026

Summary

  • Fixes incomplete SSRF remediation in favicon fetching (GHSA-9fr2-pprw-pp9j)
  • Extracts a reusable safe_get() function that validates every redirect hop against private/internal IPs before following it
  • Replaces the previous requests.get() call in fetch_and_encode_favicon() which followed redirects blindly, allowing attackers to redirect favicon URLs to private IPs and exfiltrate internal data as base64

Test plan

  • Verify normal links with favicons still resolve correctly
  • Verify a favicon <link> tag pointing to a URL that redirects to a private IP is blocked
  • Verify the main page redirect validation still works as before
  • Verify max redirect limit is enforced for both page and favicon fetches

Summary by CodeRabbit

  • Bug Fixes

    • Strengthened URL fetching to block redirects to private/internal addresses and to surface clearer validation errors during link processing.
    • Made favicon retrieval more robust by using the validated final URL when resolving resources.
  • Tests

    • Added unit tests covering URL validation, redirect handling limits, and favicon fetch behavior.

The previous SSRF fix (GHSA-jcc6-f9v6-f7jw) only validated redirects for
the main page URL but not for the favicon fetch path. An attacker could
craft an HTML page with a favicon link that redirects to a private IP,
bypassing the IP validation and leaking internal network data as base64.

Extract a reusable `safe_get()` function that validates every redirect hop
against private/internal IPs and use it for both page and favicon fetches.

Resolves: GHSA-9fr2-pprw-pp9j

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 6, 2026 10:11
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 6, 2026

📝 Walkthrough

Walkthrough

Introduces safe_get() in apps/api/plane/bgtasks/work_item_link_task.py to centralize guarded HTTP GETs with redirect-following and IP validation; refactors existing favicon/title fetching to use it. Adds unit tests for validate_url_ip and safe_get.

Changes

Cohort / File(s) Summary
URL validation & fetch logic
apps/api/plane/bgtasks/work_item_link_task.py
Added safe_get(url, headers=None, timeout=1) that validates initial and redirected targets, follows Location redirects up to a max, raises ValueError for private/internal targets and RuntimeError on redirect limit; refactored crawl_work_item_link_title_and_favicon and fetch_and_encode_favicon to call safe_get and handle its exceptions.
Unit tests
apps/api/plane/tests/unit/bg_tasks/test_work_item_link_task.py
New tests for validate_url_ip and safe_get: verify private/internal and scheme validation, redirect-following behavior, validation on each redirect, error on private redirect, error on excessive redirects, and success when final response is non-redirect within limit.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through links with careful paws,
Chased redirects, checked every clause,
No secret nests for sneaky IPs,
One tidy hop where logic skips,
A rabbit's patch to keep web paths safe. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.46% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: validating redirects in favicon fetching to prevent SSRF vulnerabilities.
Description check ✅ Passed The description covers all required sections: includes a clear Summary, documents the bug fix type, provides detailed Test Scenarios, and references the related security issue (GHSA-9fr2-pprw-pp9j).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/ssrf-favicon-redirect-validation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves SSRF protections in work-item link crawling by validating every redirect hop during HTML and favicon fetches.

Changes:

  • Introduces a reusable safe_get() wrapper that manually follows redirects with per-hop private/internal IP validation.
  • Updates HTML fetching in crawl_work_item_link_title_and_favicon() to use safe_get() and logs validation failures.
  • Updates favicon fetching to use safe_get() instead of a redirect-following requests.get().

- Fix off-by-one in redirect limit: only raise RuntimeError when the
  response is still a redirect after MAX_REDIRECTS hops, not when the
  final response is a successful 200
- Return final URL from safe_get() so favicon href resolution uses the
  correct origin after redirects instead of the original URL
- Add unit tests for validate_url_ip and safe_get covering private IP
  blocking, redirect-following, and redirect limit enforcement

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/api/plane/bgtasks/work_item_link_task.py`:
- Around line 98-111: The current redirect handling has an off-by-one: after
following redirects the code unconditionally raises when redirect_count >=
MAX_REDIRECTS even if the final response is non-redirect; update the check so we
only raise when we hit the limit and the response is still a redirect.
Concretely, modify the post-loop condition that raises to only raise when both
redirect_count >= MAX_REDIRECTS and response.is_redirect (referencing
redirect_count, MAX_REDIRECTS, and response in work_item_link_task.py), so valid
chains that terminate exactly on the MAX_REDIRECTS-th hop are accepted.
- Around line 136-148: The favicon resolution currently uses the pre-redirect
URL; update the call to fetch_and_encode_favicon so it uses the
final/fully-resolved page URL (response.url) as the base for urljoin instead of
the original url; i.e., after safe_get returns, pass response.url to
fetch_and_encode_favicon (and if response is None due to a caught exception,
fall back to the original url) so that link rel="icon" and the /favicon.ico
fallback are resolved against the actual final host/path.
- Around line 91-107: The current validate_url_ip() check is vulnerable to DNS
rebinding TOCTOU because requests/resolver is called again when opening sockets;
fix by making the HTTP connection use the pre-resolved IP instead of letting
requests re-resolve hostnames: resolve the hostname once (reuse
validate_url_ip() to return the vetted IP for each URL/hop), and then perform
requests by connecting to that IP while setting the original hostname in the
Host header (or using a custom TransportAdapter/ConnectionPool that calls
create_connection with the resolved IP) so the socket connect cannot re-resolve;
apply this for the initial fetch and for each redirect hop in
work_item_link_task.py (use the resolved_ip returned by validate_url_ip() when
issuing requests.get and when following redirects), or alternatively enforce
egress firewall rules outside the app if binding to IP is not possible.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e4d179a1-3884-4770-b97b-e11dfe3eed42

📥 Commits

Reviewing files that changed from the base of the PR and between a18d90d and 40e4484.

📒 Files selected for processing (1)
  • apps/api/plane/bgtasks/work_item_link_task.py

Comment on lines +91 to +107
validate_url_ip(url)

current_url = url
response = requests.get(
current_url, headers=headers, timeout=timeout, allow_redirects=False
)

redirect_count = 0
while response.is_redirect and redirect_count < MAX_REDIRECTS:
redirect_url = response.headers.get("Location")
if not redirect_url:
break
current_url = urljoin(current_url, redirect_url)
validate_url_ip(current_url)
redirect_count += 1
response = requests.get(
current_url, headers=headers, timeout=timeout, allow_redirects=False
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file to understand the implementation
cat -n apps/api/plane/bgtasks/work_item_link_task.py | head -150

Repository: makeplane/plane

Length of output: 6259


🏁 Script executed:

# Let's also look for the validate_url_ip function definition
rg -n "def validate_url_ip" apps/api/plane/bgtasks/work_item_link_task.py

Repository: makeplane/plane

Length of output: 100


🏁 Script executed:

# Search for any imports related to requests or socket
rg -n "^import|^from" apps/api/plane/bgtasks/work_item_link_task.py | head -30

Repository: makeplane/plane

Length of output: 429


🏁 Script executed:

# Search for any connection pooling, session management, or DNS caching configuration
rg -n "Session|Pool|Cache|resolve|getaddrinfo|urllib3" apps/api/plane/bgtasks/work_item_link_task.py

Repository: makeplane/plane

Length of output: 250


🏁 Script executed:

# Check if requests library is configured with any session or custom resolver
rg -n "requests.Session|HTTPAdapter|resolver" apps/api/plane/

Repository: makeplane/plane

Length of output: 281


🏁 Script executed:

# Check environment setup or global request configuration
fd -type f -name "*.py" apps/api/ | xargs rg -l "requests.packages|urllib3" | head -10

Repository: makeplane/plane

Length of output: 229


🏁 Script executed:

# Check the requests library version and any DNS caching behavior in requirements or setup files
find . -name "requirements*.txt" -o -name "setup.py" -o -name "setup.cfg" -o -name "pyproject.toml" | xargs grep -l "requests" 2>/dev/null | head -5

Repository: makeplane/plane

Length of output: 41


🏁 Script executed:

# Find dependency specifications
fd -type f "requirements|setup|pyproject" | grep -E "(txt|py|toml|cfg)$"

Repository: makeplane/plane

Length of output: 229


🌐 Web query:

urllib3 requests library DNS resolution caching behavior

💡 Result:

  • urllib3 (and therefore requests) does not implement its own DNS cache. When it needs a new TCP connection, it resolves hostnames via the OS resolver by calling socket.getaddrinfo() (i.e., whatever your system/NSS/DNS configuration does). [1]
  • Any “DNS caching” you observe typically comes from connection reuse, not cached DNS answers. Both urllib3 and requests.Session() keep connection pools keyed by host/port and will reuse an existing keep-alive connection when possible—so no new DNS lookup happens if no new connection is created. [2]
  • When a new connection is created, resolution happens again via the OS resolver path (so whether results are cached/for how long is OS-dependent). [1]
  • Implication for DNS changes / load balancers: if DNS for a hostname changes, an already-open pooled connection will keep talking to the old IP until that connection is dropped/closed and a new one is created. This is a consequence of pooling/keep-alive behavior. [2]
  • To “refresh” which IP you hit: force new connections (e.g., close the requests.Session() / adapter, or disable reuse), or implement DNS caching/overrides outside of requests/urllib3 (OS-level caching or custom resolver approaches). It’s commonly noted that requests/urllib3 themselves don’t cache DNS. [3][4]

Sources:
[1] urllib3.util.connection.create_connection uses socket.getaddrinfo() (no library DNS cache) — https://dokk.org/documentation/urllib3/1.6/_modules/urllib3/util/connection/ (dokk.org)
[2] Requests docs: keep-alive is automatic within a Session (connection pooling/reuse) — https://docs.python-requests.org/_/downloads/en/v2.7.0/pdf/ (docs.python-requests.org)
[3] AWS re:Invent slide explicitly stating requests/urllib3 don’t cache DNS — https://reinvent.awsevents.com/content/dam/reinvent/2024/slides/svs/SVS218-NEW_Accelerate-Python-and-NET-Lambda-functions-with-SnapStart.pdf (reinvent.awsevents.com)
[4] Discussion noting requests (via urllib3) calls getaddrinfo for hostname resolution and suggests OS/app-level DNS caching if needed — https://stackoverflow.com/questions/45479393/connection-error-while-using-requests-to-get-response-from-google-distance-matri (stackoverflow.com)

Citations:


The SSRF guard is bypassable via DNS rebinding—a real TOCTOU window exists in the validation approach.

validate_url_ip() resolves the hostname via socket.getaddrinfo() to check the IP, but requests.get() must resolve the same hostname again when opening the socket. Because urllib3 (underlying requests) performs no DNS caching and each socket creation triggers an independent resolver call, an attacker-controlled DNS server can return a public IP during validation, then serve a private IP on the next lookup—allowing the actual request to hit internal addresses. This same TOCTOU gap applies to each redirect hop (line 106 validates, line 108-110 requests). A proper fix requires either binding the socket connect to the pre-validated IP address or enforcing egress controls outside the application layer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/plane/bgtasks/work_item_link_task.py` around lines 91 - 107, The
current validate_url_ip() check is vulnerable to DNS rebinding TOCTOU because
requests/resolver is called again when opening sockets; fix by making the HTTP
connection use the pre-resolved IP instead of letting requests re-resolve
hostnames: resolve the hostname once (reuse validate_url_ip() to return the
vetted IP for each URL/hop), and then perform requests by connecting to that IP
while setting the original hostname in the Host header (or using a custom
TransportAdapter/ConnectionPool that calls create_connection with the resolved
IP) so the socket connect cannot re-resolve; apply this for the initial fetch
and for each redirect hop in work_item_link_task.py (use the resolved_ip
returned by validate_url_ip() when issuing requests.get and when following
redirects), or alternatively enforce egress firewall rules outside the app if
binding to IP is not possible.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
apps/api/plane/tests/unit/bg_tasks/test_work_item_link_task.py (1)

126-126: Minor: Add trailing newline.

The file should end with a newline character for POSIX compliance and to avoid diff noise in future commits.

📝 Suggested fix
         assert response is final_resp
         assert not response.is_redirect
+
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/api/plane/tests/unit/bg_tasks/test_work_item_link_task.py` at line 126,
Add a POSIX trailing newline to the end of the test file containing the
assertion "assert not response.is_redirect" (test_work_item_link_task.py) by
ensuring the file ends with a single '\n' character so editors and git show no
spurious diffs; simply append the newline after the final line and save.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@apps/api/plane/tests/unit/bg_tasks/test_work_item_link_task.py`:
- Line 126: Add a POSIX trailing newline to the end of the test file containing
the assertion "assert not response.is_redirect" (test_work_item_link_task.py) by
ensuring the file ends with a single '\n' character so editors and git show no
spurious diffs; simply append the newline after the final line and save.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 582b6412-227a-4906-92ea-6d6e6eda63bf

📥 Commits

Reviewing files that changed from the base of the PR and between 40e4484 and 6cdaf2f.

📒 Files selected for processing (2)
  • apps/api/plane/bgtasks/work_item_link_task.py
  • apps/api/plane/tests/unit/bg_tasks/test_work_item_link_task.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/api/plane/bgtasks/work_item_link_task.py

@sriramveeraghanta sriramveeraghanta merged commit 63fac3b into preview Apr 6, 2026
13 checks passed
@sriramveeraghanta sriramveeraghanta deleted the fix/ssrf-favicon-redirect-validation branch April 6, 2026 10:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants