Skip to content

fix: null client_connection in Finding 2 eviction path + OOM cleanup fault-injection#7

Draft
Copilot wants to merge 8 commits into
copilot/research-session-handling-vulnerabilitiesfrom
copilot/fix-session-hijack-issue-1
Draft

fix: null client_connection in Finding 2 eviction path + OOM cleanup fault-injection#7
Copilot wants to merge 8 commits into
copilot/research-session-handling-vulnerabilitiesfrom
copilot/fix-session-hijack-issue-1

Conversation

Copy link
Copy Markdown

Copilot AI commented May 6, 2026

  • Add item->client_connection = NULL in Finding 2 eviction path inside ngx_http_upstream_get_ntlm_peer (via ngx_http_upstream_ntlm_item_release_to_free helper, which consistently nulls both connection pointers)
  • Add NGX_NTLM_TEST_CLEANUP_NULL fault-injection mechanism (implemented as ngx_http_upstream_ntlm_pool_cleanup_add wrapper function that returns NULL when the flag is set, replacing the direct ngx_pool_cleanup_add call in ngx_http_upstream_free_ntlm_peer)
  • Add TEST 7 to t/001-sanity.t — OOM in cleanup handler must not cache the upstream connection (marked --- SKIP for normal CI; requires a build with -DNGX_NTLM_TEST_CLEANUP_NULL; includes build instructions in a comment block above the test)
  • Add TEST 6 to t/001-sanity.t — new NTLM credentials on an established connection must evict the old session
Original prompt

Context

PR #6 (fix(security): prevent session hijack via cleanup OOM and stale credential reuse) introduced two security fixes. A code review identified two remaining issues that need to be addressed in a follow-up:


Issue 1 — Missing client_connection = NULL in Finding 2 eviction path

In ngx_http_upstream_get_ntlm_peer (Finding 2 fix), when evicting a stale authenticated session due to new credentials, item->client_connection is not cleared before the item is returned to the free list:

// Current code (lines ~278-284 in ngx_http_upstream_ntlm_module.c)
item->in_cache = 0;
item->peer_connection = NULL;
// ← item->client_connection = NULL; is MISSING
ngx_queue_remove(q);
ngx_queue_insert_head(&hndp->conf->free, q);
ngx_http_upstream_ntlm_close(c);
break;

This is inconsistent with the Finding 1 fix in ngx_http_upstream_free_ntlm_peer which correctly nulls all three fields:

item->in_cache = 0;
item->peer_connection = NULL;
item->client_connection = NULL;  // ← correctly present here

The struct comment (the in_cache ownership invariant block) documents that every code path moving an item out of conf->cache MUST null out both connection pointers. This is violated in the Finding 2 path.

Fix: Add item->client_connection = NULL; immediately after item->peer_connection = NULL; in the Finding 2 eviction block inside ngx_http_upstream_get_ntlm_peer.


Issue 2 — No test coverage for Finding 1 (OOM cleanup handler path)

The Finding 1 fix (OOM in ngx_pool_cleanup_add) has zero automated test coverage. Add a compile-time fault-injection mechanism and a corresponding test.

Fault injection macro

In ngx_http_upstream_ntlm_module.c, add a compile-time shim so that test builds can force ngx_pool_cleanup_add to return NULL:

#ifdef NGX_NTLM_TEST_CLEANUP_NULL
#  undef  ngx_pool_cleanup_add
#  define ngx_pool_cleanup_add(pool, size) NULL
#endif

Place this immediately before the ngx_http_upstream_free_ntlm_peer function definition (or near the top of the file after the includes), so it only affects that translation unit.

New test in t/001-sanity.t

Add TEST 6 to t/001-sanity.t that documents the expected behavior when the cleanup handler cannot be allocated. The test should be marked skip unless the fault-injection build is used, but should clearly document what the test verifies:

=== TEST 6: OOM in cleanup handler must not cache the upstream connection
When ngx_pool_cleanup_add returns NULL, the upstream session MUST NOT be
inserted into the cache.  A subsequent request on the same client connection
must therefore NOT receive the old authenticated upstream session.
This test requires a build with -DNGX_NTLM_TEST_CLEANUP_NULL.
--- SKIP
--- http_config
    upstream backend {
        server 127.0.0.1:19841;
        ntlm;
    }
--- config
    location /t {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
--- pipelined_requests eval
["GET /t", "GET /t"]
--- more_headers eval
["Authorization: NTLM " . $::random_token, "Authorization: NTLM " . $::random_token]
--- response_body eval
["OK", "OK"]
--- error_log
ntlm: failed to allocate cleanup handler
--- response_headers eval
["X-NGX-NTLM-AUTH: " . $::random_token, "X-NGX-NTLM-AUTH: " . $::random_token]

Also add a brief comment block above TEST 6 explaining how to build nginx with the fault-injection flag for local testing.


Summary of changes required

  1. ngx_http_upstream_ntlm_module.c: Add item->client_connection = NULL; in the Finding 2 eviction block in ngx_http_upstream_get_ntlm_peer, and add the NGX_NTLM_TEST_CLEANUP_NULL fault-injection macro block.
  2. t/001-sanity.t: Add TEST 6 with the OOM scenario documentation and skip guard.

The base branch for this PR is copilot/research-session-handling-vulnerabilities (the branch from PR #6).

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: what do you think about this pr. do you see any problems?

Assistant: Here's my full review of PR #6:


📋 PR Summary

fix(security): prevent session hijack via cleanup OOM and stale credential reuse

State 🟡 Open (Draft)
Author Copilot (bot)
Files Changed 2 (ngx_http_upstream_ntlm_module.c, t/001-sanity.t)
Changes +78 / -5 lines
Reviews None yet
CI Not reported
Mergeable ✅ Clean

This PR addresses two distinct security vulnerabilities in the NTLM upstream session cache:

  1. Finding 1 (Critical): Session hijack via OOM in cleanup handler registration
  2. Finding 2 (High): Stale credential reuse on an established keep-alive connection

🔧 Core Changes

Finding 1 — Cleanup Handler OOM (ngx_http_upstream_free_ntlm_peer)

Previously, if `ngx_pool_cle...

This pull request was created from Copilot chat.

… OOM fault-injection test

Issue 1: In ngx_http_upstream_get_ntlm_peer, the Finding 2 eviction block
was missing `item->client_connection = NULL;` after `item->peer_connection = NULL;`.
This violates the ownership invariant documented in the struct comment and is
inconsistent with the Finding 1 fix in ngx_http_upstream_free_ntlm_peer.

Issue 2: Added NGX_NTLM_TEST_CLEANUP_NULL compile-time fault-injection macro
immediately before ngx_http_upstream_free_ntlm_peer. When nginx is built with
-DNGX_NTLM_TEST_CLEANUP_NULL, ngx_pool_cleanup_add is overridden to always
return NULL, allowing the OOM cleanup handler path to be exercised in tests.

Added TEST 6 to t/001-sanity.t documenting the expected behavior when the
cleanup handler allocation fails. The test is marked SKIP so it does not run
in normal CI; it requires the fault-injection build described in the comment
block above the test.

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-module/sessions/80daf827-e913-4352-84db-d992d1c15ab0

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix missing client connection nullification in eviction path fix: null client_connection in Finding 2 eviction path + OOM cleanup fault-injection May 6, 2026
Copilot AI requested a review from matthias-lay May 6, 2026 14:40
Copilot AI and others added 6 commits May 7, 2026 07:16
Copilot AI added a commit that referenced this pull request May 7, 2026
- Rewrites ngx_http_upstream_ntlm_module.c with four documented security invariants:
  I1. Atomic item release via ngx_http_upstream_ntlm_item_release() helper
  I2. c->read->data (not c->data) for cache item storage on idle connections
  I3. OOM guard: cleanup-add failure aborts cache insertion (session-hijack fix)
  I4. Stale-credential eviction on re-auth with established session

- Fixes from all open PRs (#3, #6, #7):
  - Synchronous cleanup handler (no posted events)
  - Session-hijack via cleanup-OOM (critical)
  - Stale-credential reuse on established connections (high)
  - client_connection NULL in eviction path
  - Fault-injection macro NGX_NTLM_TEST_CLEANUP_NULL for testing

- Adds ntlm_time and ntlm_requests directives
- Adds notify peer callback pass-through
- Updates README.md for v2 with security section
- Updates Docker files to nginx 1.28.0 / alpine 3.20
- Adds TEST 5 (stale credential eviction) and TEST 6 (OOM guard, SKIP)
  to t/001-sanity.t

Agent-Logs-Url: https://github.com/Securepoint/nginx-ntlm-module/sessions/3343cf49-d4a7-4188-89b8-5a89fad4a7fe

Co-authored-by: matthias-lay <163420385+matthias-lay@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants