Skip to content

Dynamic CollapsConfigs for OpenEvents and EndpointOpens that asymptotically behave like the old node-agent#3

Merged
entlein merged 44 commits into
mainfrom
test/localtestbuild
Feb 15, 2026
Merged

Dynamic CollapsConfigs for OpenEvents and EndpointOpens that asymptotically behave like the old node-agent#3
entlein merged 44 commits into
mainfrom
test/localtestbuild

Conversation

@entlein
Copy link
Copy Markdown

@entlein entlein commented Feb 1, 2026

We got a few issues here:

  • adding support for the asterix char as wildcard
  • allowing different configs for path prefixes on how they should collapse
  • making sure that the default config of node-agent will mimick the old one and not break users experience
  • two fixes: the user-defined-profiles were not being properly stored (bug)
  • absolute paths in the open-tracer gadget -> this is currently worked on in a seperate branch until we know how much fallout it has (it s over in the node-agent fork)

Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 2, 2026

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

Signed-off-by: entlein <einentlein@gmail.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 2, 2026

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
…re test cases

Signed-off-by: entlein <einentlein@gmail.com>
…re test cases

Signed-off-by: entlein <einentlein@gmail.com>
…the rest

Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

4 similar comments
@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

@entlein entlein changed the title first try - this could go bad - regex v0.0.1 Dynamic CollapsConfigs for OpenEvents and EndpointOpens that asymptotically behave like the old node-agent Feb 14, 2026
@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

2 similar comments
@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

@entlein
Copy link
Copy Markdown
Author

entlein commented Feb 15, 2026

This is looking really good now. (we still have the relative path issue, but that s likely over in node-agent)
webapp-profile.yaml

@entlein entlein force-pushed the test/localtestbuild branch from fff144b to 9a0a045 Compare February 15, 2026 19:55
entlein and others added 10 commits February 15, 2026 20:56
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Signed-off-by: entlein <einentlein@gmail.com>
Add push trigger on test/localtestbuild branch. After building the
storage image, the workflow now triggers node-agent's build.yaml
with the same IMAGE_TAG via CROSS_REPO_PAT secret.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: entlein <einentlein@gmail.com>
Add check to skip user-managed resources in cleanup.

Signed-off-by: Duck <70207455+entlein@users.noreply.github.com>
Signed-off-by: entlein <einentlein@gmail.com>
@entlein entlein force-pushed the test/localtestbuild branch from 9a0a045 to 3231e28 Compare February 15, 2026 19:56
@entlein entlein merged commit 7bf198c into main Feb 15, 2026
5 checks passed
@entlein entlein deleted the test/localtestbuild branch February 15, 2026 19:59
@github-actions
Copy link
Copy Markdown

Summary:

  • License scan: failure
  • Credentials scan: skipped
  • Vulnerabilities scan: skipped
  • Unit test: success
  • Go linting: failure

entlein pushed a commit that referenced this pull request May 16, 2026
…e accessor

CodeRabbit upstream PR kubescape#326 finding #3
(applicationprofile_processor.go:92). A zero-valued processor —
constructed via direct struct literal rather than the
NewApplicationProfileProcessor factory — has a nil collapseSettings
function field. The deflate path then nil-dereferences on the call
inside PreSave.

Adds `effectiveCollapseSettings()` as the safe accessor used by
PreSave. It returns the provider's result when configured, or the
compiled-in DefaultCollapseSettings when the field is nil. The
direct field-call path (`a.collapseSettings()`) is intentionally
left raw — callers who construct a zero-value processor and call
the unsafe path explicitly are on the "I know what I'm doing" side.

New regression test TestApplicationProfileProcessor_ZeroValue_NoPanicOnCollapseSettings
pins both halves of the contract:
  - effectiveCollapseSettings() on zero-value processor does NOT
    panic and returns the compiled-in defaults
  - raw field-call on zero-value processor DOES still panic (no
    silent behaviour change)
entlein pushed a commit that referenced this pull request May 16, 2026
…opy)

CodeRabbit upstream PR kubescape#323 finding #3. The package-level
`DefaultCollapseConfig` was an exported mutable var: any caller doing
`dynamicpathdetector.DefaultCollapseConfig.Threshold = 1` would
silently corrupt every analyzer constructed thereafter. The threshold
surface is security-sensitive (governs how aggressively profile paths
collapse into wildcards), so accidental cross-caller leakage is a
real risk.

Replaces the exported var with an unexported `defaultCollapseConfig`
and an accessor `DefaultCollapseConfig()` that returns a value copy.
Pattern mirrors the existing `defaultCollapseConfigs` / `DefaultCollapseConfigs()`
defensive-copy accessor for the slice form.

Updates 6 callsites across the test suite to use the accessor.
Adds TestDefaultCollapseConfig_DefensiveCopy pinning the contract: a
caller mutating the returned struct does not affect subsequent reads.
entlein added a commit that referenced this pull request May 16, 2026
…ings (#33)

* ci(manual-integration-tests): disable workflow until timing-fragile tests are fixed

Adds a fail-fast guard step at the start of the integration-tests job
that exits with a clear error pointing to issue #32. The workflow stays
visible in the UI but anyone who dispatches it gets an immediate failure
rather than burning ~30 minutes of CI on a known-flaky run.

Last 5 fork runs (between 2026-05-15 16:32Z and 17:59Z on a15fb5a)
each failed on a different *ProfileCreate subtest at the AP-completion
polling stage. Upstream main MIT also red across its last 5 runs. The
failover-tests jobs are consistently green and remain enabled by default
(they run in their own job, untouched by this change).

Re-enable when the wall-clock learning-period assumptions are replaced
with dynamic AP-status polling and 5 consecutive green runs land.

* fix(dynamicpathdetector): DP-memoise compareSegments to bound multi-* backtracking

CodeRabbit upstream PR kubescape#326 finding #4 (Major). The recursive expansion
at the mid-path `*` branch explored all suffix splits without caching,
producing 2^n / n!-style runtime on patterns with multiple `*`
segments. Adversarial inputs of 20 `*` + 20 regular segments time out
the test runner at 60s on the un-memoised matcher.

Approach: keep the existing recursive form (compareSegments) on the
fast path for single-`*` and no-`*` patterns where backtracking can't
re-enter the same (di, ri) state, and route multi-`*` patterns through
a DP-memoised core (compareSegmentsMemo) that caches every (di, ri)
outcome. Semantics are identical — the memoised core is a pure
re-shape of the same recursion.

The split is gated by multipleWildcards(), a single linear scan of
the dynamic-segment slice. This keeps the common-case allocation
profile at 2 allocs/op (splitPath only); the memo map is only
allocated when it actually earns its keep.

Local benchmark on this machine (arm64, -benchtime=2s):

  Shape                              Before          After
  ---------------------------------- --------------- ---------------
  ellipsis_short                     94.26 ns/op     90.08 ns/op
  ellipsis_deep                     158.3  ns/op    171.1  ns/op
  trailing_star                      86.21 ns/op    103.5  ns/op
  trailing_star_no_match_on_parent   73.02 ns/op     73.35 ns/op
  mid_star_zero_consumed             84.76 ns/op     82.26 ns/op
  mid_star_many_consumed            118.4  ns/op    117.5  ns/op
  anchored_root_no_match             55.66 ns/op     53.30 ns/op
  unanchored_star_root               49.30 ns/op     55.30 ns/op
  deep_literal_match                 93.40 ns/op     98.41 ns/op
  deep_literal_mismatch             108.1  ns/op     96.43 ns/op

  All shapes: 2 allocs/op before and after (no heap growth).

  Adversarial multi-* tail-mismatch:    60s+ timeout → 0.2s wall
                                                       (≥ 300× faster).

Tests added in tests/compare_dynamic_memoise_test.go:

  - TestCompareDynamic_MemoiseGoldenAcceptance: 27 paired
    accept/reject cases pinning byte-for-byte semantic parity
    (empty inputs, literal exact, trailing `*`, mid `*`, ellipsis
    `⋯`, consecutive `*/*` zero-or-more semantics, busybox-style
    symlink shape, multi-wildcard, root edge).
  - TestCompareDynamic_MemoiseAdversarialReDoS: 4 multi-`*`
    tail-mismatch shapes with strict time budgets (50/100/200/200ms);
    each fails the test if the matcher takes longer than the budget,
    catching any future regression that re-introduces exponential
    backtracking.
  - TestCompareDynamic_MemoiseAdversarialPositive: positive-case
    parity — a matching adversarial input completes in <200ms.
  - TestCompareDynamic_MemoiseAllocCeiling: per-call allocs ≤ 12
    across the common shapes (current observation: 2 allocs/op),
    catching unbounded heap growth in future memoisation tweaks.

The adversarial tests skip under testing.Short(); the alloc-ceiling
test also skips so quick local runs stay fast.

* test(analyze_endpoints): require equal length before indexed comparisons

CodeRabbit upstream PR kubescape#326 outside-diff-range finding at
analyze_endpoints_test.go:169-178. Ranging over result and indexing
tt.expected[i] silently passes if result is shorter than expected,
making missing endpoints invisible to the gate. Adds require.Len()
before the for-loop so length divergence fails the test loudly.

* test(compare_exec_args): skip ReDoS timing test under testing.Short

CodeRabbit upstream PR kubescape#326 finding #5. The 100ms wall-clock assertion
in TestCompareExecArgs_ReDoSResistance is sensitive to runner CPU
contention and can fail flakily even when the memoised matcher is
correct. Skipping under testing.Short() preserves the regression
intent (the test still runs by default) but lets quick local /
short-mode CI runs stay deterministic. Memoisation correctness is
already covered by the explicit case-table tests above, which always
run regardless of short mode.

* fix(applicationprofile_processor): nil-guard collapseSettings via safe accessor

CodeRabbit upstream PR kubescape#326 finding #3
(applicationprofile_processor.go:92). A zero-valued processor —
constructed via direct struct literal rather than the
NewApplicationProfileProcessor factory — has a nil collapseSettings
function field. The deflate path then nil-dereferences on the call
inside PreSave.

Adds `effectiveCollapseSettings()` as the safe accessor used by
PreSave. It returns the provider's result when configured, or the
compiled-in DefaultCollapseSettings when the field is nil. The
direct field-call path (`a.collapseSettings()`) is intentionally
left raw — callers who construct a zero-value processor and call
the unsafe path explicitly are on the "I know what I'm doing" side.

New regression test TestApplicationProfileProcessor_ZeroValue_NoPanicOnCollapseSettings
pins both halves of the contract:
  - effectiveCollapseSettings() on zero-value processor does NOT
    panic and returns the compiled-in defaults
  - raw field-call on zero-value processor DOES still panic (no
    silent behaviour change)

* fix(proto): declare repeated string ipAddresses = 9 on NetworkNeighbor

CodeRabbit upstream PR kubescape#326 finding #2 (Major).
The Go struct's protobuf tag is `bytes,9,rep,name=ipAddresses` but
generated.proto did not declare field 9 on the NetworkNeighbor
message. Runtime behaviour was unaffected because generated.pb.go is
the actual codec and already encodes field 9 correctly (verified:
wire tag 0x4a = field 9, wire type 2). But the .proto file is the
source of truth for:

  - Any future regeneration (`make generate`) — drops the field if
    the .proto doesn't declare it
  - Cross-language consumers compiling against the .proto schema
  - Documentation accuracy

Restores the declaration. New regression test
TestNetworkNeighbor_ProtoFile_DeclaresIPAddresses parses generated.proto
and asserts the NetworkNeighbor message body contains the field-9
declaration. If the .proto and the Go tag ever diverge again, the
test fails loudly at unit-test time rather than at the next regen.

The existing IPAddresses protobuf roundtrip tests are unaffected and
continue to pass.

* fix(dynamicpathdetector): DefaultCollapseConfig accessor (defensive copy)

CodeRabbit upstream PR kubescape#323 finding #3. The package-level
`DefaultCollapseConfig` was an exported mutable var: any caller doing
`dynamicpathdetector.DefaultCollapseConfig.Threshold = 1` would
silently corrupt every analyzer constructed thereafter. The threshold
surface is security-sensitive (governs how aggressively profile paths
collapse into wildcards), so accidental cross-caller leakage is a
real risk.

Replaces the exported var with an unexported `defaultCollapseConfig`
and an accessor `DefaultCollapseConfig()` that returns a value copy.
Pattern mirrors the existing `defaultCollapseConfigs` / `DefaultCollapseConfigs()`
defensive-copy accessor for the slice form.

Updates 6 callsites across the test suite to use the accessor.
Adds TestDefaultCollapseConfig_DefensiveCopy pinning the contract: a
caller mutating the returned struct does not affect subsequent reads.

* docs(dynamicpathdetector): comment + guard kubescape#323 nitpicks (#5, #6, #9, #10)

Four small docstring / defensive-guard improvements from CodeRabbit
upstream PR kubescape#323:

  #5  wildcardKey / getEndpointKey format coupling — extract
      buildEndpointKey as the single source of truth for the lookup
      key shape. Both call-sites now route through it, removing the
      "must stay in sync" risk that allowed past drift. New test
      TestBuildEndpointKey_SharedFormat pins end-to-end behaviour via
      MergeDuplicateEndpoints: a wildcard-port entry and a specific-
      port sibling collapse into one row with merged Methods.

  #6  consolidateOpens O(n²) — acknowledged in the function's
      docstring along with the threshold (≤ ~10k entries) above which
      a trie-based replacement would be worthwhile. Behaviour
      unchanged.

  #9  removeEndpoint in-place mutation — added a NOTE block in the
      docstring describing the backing-array shift, why the sole
      caller is safe today, and the swap-to-copy guidance for any
      future caller that stores intermediate slice references.

  #10 hasPrefixAtBoundary("") edge — added an explicit empty-prefix
      branch returning true. Without it the function fell through to
      pathPrefix[0] == '/', which is true for any absolute path —
      effectively treating "" as a root-matching prefix incidentally.
      The explicit branch makes the invariant load-bearing.
      New test TestHasPrefixAtBoundary_EmptyPrefix pins the surface
      via FindConfigForPath.

All existing tests still pass. Local run on this arm64 machine:
  go test ./pkg/registry/file/... ok in 0.93s

* deps: bump syft to v1.42.4 (CVE-2026-33481), stereoscope v0.1.22, runtime-spec v1.3.0

CodeRabbit upstream PR kubescape#326 finding #1 (Critical). Lifts syft from
v1.32.0 to v1.42.4 — the lowest version that includes the CVE-2026-33481
fix (Syft was failing to clean up temporary storage when a scan
errored mid-flight). v1.42.4 is the latest patch on the v1.42 line, so
the upgrade gets the CVE fix without taking the v1.43/v1.44 minor
bumps' larger surface-area changes.

go mod tidy pulled forward two transitive bumps:
  - stereoscope v0.1.9 → v0.1.22 (Anchore's syft-companion lib, must
    move in lockstep with syft)
  - runtime-spec v1.2.1 → v1.3.0 (transitive via stereoscope)
  - modernc.org/libc v1.67.6 → v1.70.0 (transitive)
  - modernc.org/sqlite v1.46.1 → v1.46.2 (transitive)

go.mod no longer carries a replace directive against kubescape/syft —
our fork doesn't have v1.42.3+ tags and the upstream content is
required for the CVE fix. Direct anchore/syft dependency is now the
source of truth.

Local validation:
  go build ./...          ok
  go test ./pkg/... ok in 1.4s (all 16 packages green)

Vendor directory is .gitignored; `go mod vendor` was rerun locally
to keep build/manifest in sync.

* fix: address 3 follow-up rabbit findings on PR #33

1. ci(manual-integration-tests): use job-level if: false instead of
   exit 1 — workflow shows as "skipped" not "failed red X" in the
   Actions UI when dispatched, removing the cosmetic noise while
   keeping the ~30-minute waste prevention. Issue #32 reference kept
   in the YAML comment above the conditional.

2. test(network_types_protobuf): also assert the Go struct's protobuf
   tag, not just the .proto-file text. The earlier test only checked
   one side of the schema/tag contract, so tag drift on the Go side
   would silently pass. Now both halves are pinned.

3. test(applicationprofile_processor): assert the FULL
   CollapseSettings struct (not just OpenDynamicThreshold). Without
   this, a regression that resets any field added to
   CollapseSettings in the future would silently pass the zero-value
   defensive-fallback test.

All from CodeRabbit follow-up review on storage PR #33. Local validation:
  go test ./pkg/registry/file/ ./pkg/apis/softwarecomposition/v1beta1/
  → both packages ok

* docs(networkmatch): correct spec URL to billofbehavior.com

---------

Co-authored-by: Entlein <eineintlein@gmail.com>
entlein added a commit that referenced this pull request May 16, 2026
…eanup) (#31)

* fix(cleanup): enhance pod deletion logic and add tests for standalone pods

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

* feat(networkmatch): IP matcher with literal + CIDR + '*' sentinel

Runtime counterpart to spec §5.7 (v0.0.2). Profile entries are matched
disjunctively — any single entry hit means match. Each entry MAY be:

  - a literal IP (canonicalised by net.ParseIP, IPv4 ↔ ::ffff:v4 unified)
  - a CIDR (a.b.c.d/n, validated by net.ParseCIDR)
  - the '*' sentinel for any IP (sugar for 0.0.0.0/0 ∪ ::/0)

Public API:

  CompileIP([]string) *IPMatcher    // compile-once for hot paths
  (*IPMatcher).Match(string) bool   // amortised match
  MatchIP([]string, string) bool    // cold-path convenience

The compiled-matcher pattern eliminates per-call allocs from net.ParseCIDR
on the hot path — node-agent's nn.* CEL functions call this on every
network event captured by the eBPF tracer.

Tests cover:
  - literal equality (incl. IPv4/IPv6 canonicalisation, ::ffff: mapping)
  - CIDR membership (incl. /32 == literal, edge addresses)
  - '*' sentinel (and rejection of empty/garbage observations)
  - 0.0.0.0/0 + ::/0 RFC-aligned alternatives (do NOT cross address families)
  - malformed entry skipping (admission validates, runtime defends)
  - mixed-list disjunction

Benchmarks (arm64, hot-path compiled form):
  Literal       : 12.73 ns/op   0 allocs
  CIDR          : 18.48 ns/op   0 allocs
  10-entry miss : 59.10 ns/op   0 allocs

All well under the README target of <200 ns/op for IP literal/CIDR.

* feat(networkmatch): DNS matcher with leading/trailing star + mid ellipsis

Runtime counterpart to spec §5.8 (v0.0.2). Profile entries are matched
disjunctively against an observed FQDN. Each entry's labels are walked
left-to-right with positional token semantics:

  literal label   : byte-equality (case-insensitive)
  leading '*'     : matches EXACTLY ONE label (RFC 4592)
  trailing '*'    : matches ONE OR MORE labels (project extension)
  '⋯' anywhere    : matches EXACTLY ONE label (DynamicLabel)
  '**'            : reserved/recursive — admission rejects, runtime drops

Trailing dot is normalised on both sides (entry MAY or MAY NOT have it).
Empty inner labels (e.g. 'foo..bar') are treated as malformed and skipped.

Public API:

  CompileDNS([]string) *DNSMatcher
  (*DNSMatcher).Match(string) bool
  MatchDNS([]string, string) bool   // cold-path convenience

Tests cover:
  - literal equality (case-insensitive, exact-label-count)
  - trailing-dot normalisation in all four combinations
  - RFC 4592 leading '*' (exactly one label)
  - mid '⋯' DynamicLabel (the kubernetes service-FQDN case)
  - trailing '*' extension (one or more, never zero)
  - disjunction across entry list
  - malformed input rejection ('**', empty labels, lone '*')
  - compiled-form reuse contract

Benchmarks (arm64, hot-path compiled form):
  Literal           : 47.68 ns/op   1 alloc (label-split)
  Leading wildcard  : 53.20 ns/op   1 alloc
  Deep name (10 lbl): 144.9 ns/op   1 alloc
  Long mixed list   : 103.3 ns/op   1 alloc

All under the README target of <600 ns/op for DNS wildcards.

* feat(api): NetworkNeighbor.IPAddresses field + admission validation

Adds IPAddresses []string to the NetworkNeighbor schema (internal +
v1beta1) as the v0.0.2 list-form replacement for the deprecated singular
IPAddress field. Each entry MAY be a literal IP, a CIDR, or '*'.

Hand-edited generated code (deepcopy, conversion, protobuf) because
the codegen pipeline isn't run as part of build; verified by a
new round-trip test that pins the protobuf wire contract for field
number 9.

Validation lives in two places:

  pkg/registry/file/networkmatch/validate.go
    - ValidateIPEntry  : literal | CIDR | '*'
    - ValidateDNSEntry : literal | leading-* | trailing-* | mid-⋯
                         rejects '**', empty inner labels, mid-* without ⋯

  pkg/registry/softwarecomposition/networkneighborhood/strategy.go
    REST Validate() walks every neighbor in containers/initContainers/
    ephemeralContainers × egress/ingress and surfaces a field.Invalid
    error per malformed entry. Apiserver translates this into a 400
    so the user gets immediate feedback at write time.

Runtime matchers (MatchIP/MatchDNS) keep their tolerance of malformed
entries — a misconfigured profile that slips past admission won't kill
the detection path.

New tests:
  - protobuf round-trip (field 9 survives Marshal → Unmarshal)
  - ValidateIPEntry / ValidateDNSEntry edge cases
  - REST strategy Validate rejects malformed entries with the right
    field paths

* chore: gitignore .claude agent state directory

* fix(networkmatch): address CodeRabbit review on PR #30

Five findings, all legit, all fixed:

- ValidateUpdate now also runs validateNetworkProfileEntries, so
  malformed ipAddresses / dnsNames can't land via PUT after a clean
  POST. New TestValidateUpdate_NetworkProfileEntries pins this.

- validateNeighborList now also validates the deprecated singular
  IPAddress field while it remains supported. Bypass via the old form
  is closed.

- TestValidate_NetworkProfileEntries asserts each error's field path
  (multiset, order-insensitive) instead of just the count. Field-path
  contract is now pinned: if validation starts emitting errors on the
  wrong path, downstream tooling that surfaces these to users will
  catch it here first. New case for deprecated-singular path.

- TestNetworkNeighbor_IPAddresses_EmptyOmitted now checks Marshal
  errors with t.Fatalf instead of swallowing them via _.

- README links to spec-v0.0.2 (was v0.0.1) and DNS normalisation
  wording matches runtime: trailing dot stripped, labels lowercased
  (was incorrectly described as 'appended').

* fix(networkmatch): address CodeRabbit round 2 on PR #30

Three findings, two fixed, one skipped:

- validateNetworkProfileEntries now iterates an ordered []struct of
  (groupName, items) pairs rather than a map. Go map iteration is
  non-deterministic and admission errors flow to clients via the
  apiserver — stable ordering keeps error messages reproducible.

- TestValidateUpdate_NetworkProfileEntries asserts each error's field
  path (multiset, order-insensitive). Same pattern already applied to
  TestValidate_NetworkProfileEntries; this closes the parallel gap.

Skipped:

- README.md "canonicalisation" → "canonicalization" (Trivial nit).
  Project already uses British spelling consistently across other docs;
  changing one occurrence creates inconsistency without fixing
  anything.

Tests pass on the new ordering — note the test uses an order-insensitive
multiset comparison, so the deterministic-order fix is a separate
concern from the test (the test would pass with either ordering, but
the runtime is now stable for downstream consumers).

* fix(networkmatch): validate deprecated DNS singular field (CR round 3)

Closes a parity gap with IPAddress (which round-1 already validates).
Now both deprecated singular fields receive the same admission grammar
check while they remain supported.

Test extended with a deprecated-DNS path case mirroring the existing
IPAddress one.

* style(networkmatch): use American English spelling consistently

Reverses the round-2 skip on the CodeRabbit nitpick (PR #30, README
line 17). Files added in this branch now use American spelling
(canonicalize/normalize/behavior) matching the Go ecosystem convention
that 'canonicalisation' originally diverged from.

Files touched:
- pkg/registry/file/networkmatch/README.md
- pkg/registry/file/networkmatch/match_ip.go (comment)
- pkg/registry/file/networkmatch/match_dns.go (comment)
- pkg/registry/file/networkmatch/match_ip_test.go (comment)

No behaviour change.

* test(dynamicpathdetector): pin bare-name argv[0] contract for CompareExecArgs

Storage-side recording stores execs as [resolved_path, ...argv] and
returns to consumers as (Path=resolved, Args=full-argv-with-argv[0]).
The matcher does strict positional compare with no special argv[0]
normalisation. That means profile.Args[0] MUST equal runtime argv[0]
as captured by eBPF — typically the BARE program name ("sh"), not
the resolved exepath ("/bin/sh").

The realistic-patterns table earlier in this file uses 2-element
vectors that skip argv[0] entirely, which doesn't pin the convention.
Add TestCompareExecArgs_Argv0BareName covering all the Test_32
scenarios end-to-end against the matcher, plus the two
profile-shape-mismatch cases (Args[0]=full-path vs bare argv[0], and
vice versa) so future drift on either side trips the test.

Pairs with the rewrite of node-agent's Test_32_UnexpectedProcessArguments
that changes profile.Args from [/bin/sh, -c, *] to [sh, -c, *], so
R0040 can be tested in isolation from R0001's path-resolution path.

* release/sbob-rc1: TDD fixes for upstream PR kubescape#326 rabbit findings (#33)

* ci(manual-integration-tests): disable workflow until timing-fragile tests are fixed

Adds a fail-fast guard step at the start of the integration-tests job
that exits with a clear error pointing to issue #32. The workflow stays
visible in the UI but anyone who dispatches it gets an immediate failure
rather than burning ~30 minutes of CI on a known-flaky run.

Last 5 fork runs (between 2026-05-15 16:32Z and 17:59Z on a15fb5a)
each failed on a different *ProfileCreate subtest at the AP-completion
polling stage. Upstream main MIT also red across its last 5 runs. The
failover-tests jobs are consistently green and remain enabled by default
(they run in their own job, untouched by this change).

Re-enable when the wall-clock learning-period assumptions are replaced
with dynamic AP-status polling and 5 consecutive green runs land.

* fix(dynamicpathdetector): DP-memoise compareSegments to bound multi-* backtracking

CodeRabbit upstream PR kubescape#326 finding #4 (Major). The recursive expansion
at the mid-path `*` branch explored all suffix splits without caching,
producing 2^n / n!-style runtime on patterns with multiple `*`
segments. Adversarial inputs of 20 `*` + 20 regular segments time out
the test runner at 60s on the un-memoised matcher.

Approach: keep the existing recursive form (compareSegments) on the
fast path for single-`*` and no-`*` patterns where backtracking can't
re-enter the same (di, ri) state, and route multi-`*` patterns through
a DP-memoised core (compareSegmentsMemo) that caches every (di, ri)
outcome. Semantics are identical — the memoised core is a pure
re-shape of the same recursion.

The split is gated by multipleWildcards(), a single linear scan of
the dynamic-segment slice. This keeps the common-case allocation
profile at 2 allocs/op (splitPath only); the memo map is only
allocated when it actually earns its keep.

Local benchmark on this machine (arm64, -benchtime=2s):

  Shape                              Before          After
  ---------------------------------- --------------- ---------------
  ellipsis_short                     94.26 ns/op     90.08 ns/op
  ellipsis_deep                     158.3  ns/op    171.1  ns/op
  trailing_star                      86.21 ns/op    103.5  ns/op
  trailing_star_no_match_on_parent   73.02 ns/op     73.35 ns/op
  mid_star_zero_consumed             84.76 ns/op     82.26 ns/op
  mid_star_many_consumed            118.4  ns/op    117.5  ns/op
  anchored_root_no_match             55.66 ns/op     53.30 ns/op
  unanchored_star_root               49.30 ns/op     55.30 ns/op
  deep_literal_match                 93.40 ns/op     98.41 ns/op
  deep_literal_mismatch             108.1  ns/op     96.43 ns/op

  All shapes: 2 allocs/op before and after (no heap growth).

  Adversarial multi-* tail-mismatch:    60s+ timeout → 0.2s wall
                                                       (≥ 300× faster).

Tests added in tests/compare_dynamic_memoise_test.go:

  - TestCompareDynamic_MemoiseGoldenAcceptance: 27 paired
    accept/reject cases pinning byte-for-byte semantic parity
    (empty inputs, literal exact, trailing `*`, mid `*`, ellipsis
    `⋯`, consecutive `*/*` zero-or-more semantics, busybox-style
    symlink shape, multi-wildcard, root edge).
  - TestCompareDynamic_MemoiseAdversarialReDoS: 4 multi-`*`
    tail-mismatch shapes with strict time budgets (50/100/200/200ms);
    each fails the test if the matcher takes longer than the budget,
    catching any future regression that re-introduces exponential
    backtracking.
  - TestCompareDynamic_MemoiseAdversarialPositive: positive-case
    parity — a matching adversarial input completes in <200ms.
  - TestCompareDynamic_MemoiseAllocCeiling: per-call allocs ≤ 12
    across the common shapes (current observation: 2 allocs/op),
    catching unbounded heap growth in future memoisation tweaks.

The adversarial tests skip under testing.Short(); the alloc-ceiling
test also skips so quick local runs stay fast.

* test(analyze_endpoints): require equal length before indexed comparisons

CodeRabbit upstream PR kubescape#326 outside-diff-range finding at
analyze_endpoints_test.go:169-178. Ranging over result and indexing
tt.expected[i] silently passes if result is shorter than expected,
making missing endpoints invisible to the gate. Adds require.Len()
before the for-loop so length divergence fails the test loudly.

* test(compare_exec_args): skip ReDoS timing test under testing.Short

CodeRabbit upstream PR kubescape#326 finding #5. The 100ms wall-clock assertion
in TestCompareExecArgs_ReDoSResistance is sensitive to runner CPU
contention and can fail flakily even when the memoised matcher is
correct. Skipping under testing.Short() preserves the regression
intent (the test still runs by default) but lets quick local /
short-mode CI runs stay deterministic. Memoisation correctness is
already covered by the explicit case-table tests above, which always
run regardless of short mode.

* fix(applicationprofile_processor): nil-guard collapseSettings via safe accessor

CodeRabbit upstream PR kubescape#326 finding #3
(applicationprofile_processor.go:92). A zero-valued processor —
constructed via direct struct literal rather than the
NewApplicationProfileProcessor factory — has a nil collapseSettings
function field. The deflate path then nil-dereferences on the call
inside PreSave.

Adds `effectiveCollapseSettings()` as the safe accessor used by
PreSave. It returns the provider's result when configured, or the
compiled-in DefaultCollapseSettings when the field is nil. The
direct field-call path (`a.collapseSettings()`) is intentionally
left raw — callers who construct a zero-value processor and call
the unsafe path explicitly are on the "I know what I'm doing" side.

New regression test TestApplicationProfileProcessor_ZeroValue_NoPanicOnCollapseSettings
pins both halves of the contract:
  - effectiveCollapseSettings() on zero-value processor does NOT
    panic and returns the compiled-in defaults
  - raw field-call on zero-value processor DOES still panic (no
    silent behaviour change)

* fix(proto): declare repeated string ipAddresses = 9 on NetworkNeighbor

CodeRabbit upstream PR kubescape#326 finding #2 (Major).
The Go struct's protobuf tag is `bytes,9,rep,name=ipAddresses` but
generated.proto did not declare field 9 on the NetworkNeighbor
message. Runtime behaviour was unaffected because generated.pb.go is
the actual codec and already encodes field 9 correctly (verified:
wire tag 0x4a = field 9, wire type 2). But the .proto file is the
source of truth for:

  - Any future regeneration (`make generate`) — drops the field if
    the .proto doesn't declare it
  - Cross-language consumers compiling against the .proto schema
  - Documentation accuracy

Restores the declaration. New regression test
TestNetworkNeighbor_ProtoFile_DeclaresIPAddresses parses generated.proto
and asserts the NetworkNeighbor message body contains the field-9
declaration. If the .proto and the Go tag ever diverge again, the
test fails loudly at unit-test time rather than at the next regen.

The existing IPAddresses protobuf roundtrip tests are unaffected and
continue to pass.

* fix(dynamicpathdetector): DefaultCollapseConfig accessor (defensive copy)

CodeRabbit upstream PR kubescape#323 finding #3. The package-level
`DefaultCollapseConfig` was an exported mutable var: any caller doing
`dynamicpathdetector.DefaultCollapseConfig.Threshold = 1` would
silently corrupt every analyzer constructed thereafter. The threshold
surface is security-sensitive (governs how aggressively profile paths
collapse into wildcards), so accidental cross-caller leakage is a
real risk.

Replaces the exported var with an unexported `defaultCollapseConfig`
and an accessor `DefaultCollapseConfig()` that returns a value copy.
Pattern mirrors the existing `defaultCollapseConfigs` / `DefaultCollapseConfigs()`
defensive-copy accessor for the slice form.

Updates 6 callsites across the test suite to use the accessor.
Adds TestDefaultCollapseConfig_DefensiveCopy pinning the contract: a
caller mutating the returned struct does not affect subsequent reads.

* docs(dynamicpathdetector): comment + guard kubescape#323 nitpicks (#5, #6, #9, #10)

Four small docstring / defensive-guard improvements from CodeRabbit
upstream PR kubescape#323:

  #5  wildcardKey / getEndpointKey format coupling — extract
      buildEndpointKey as the single source of truth for the lookup
      key shape. Both call-sites now route through it, removing the
      "must stay in sync" risk that allowed past drift. New test
      TestBuildEndpointKey_SharedFormat pins end-to-end behaviour via
      MergeDuplicateEndpoints: a wildcard-port entry and a specific-
      port sibling collapse into one row with merged Methods.

  #6  consolidateOpens O(n²) — acknowledged in the function's
      docstring along with the threshold (≤ ~10k entries) above which
      a trie-based replacement would be worthwhile. Behaviour
      unchanged.

  #9  removeEndpoint in-place mutation — added a NOTE block in the
      docstring describing the backing-array shift, why the sole
      caller is safe today, and the swap-to-copy guidance for any
      future caller that stores intermediate slice references.

  #10 hasPrefixAtBoundary("") edge — added an explicit empty-prefix
      branch returning true. Without it the function fell through to
      pathPrefix[0] == '/', which is true for any absolute path —
      effectively treating "" as a root-matching prefix incidentally.
      The explicit branch makes the invariant load-bearing.
      New test TestHasPrefixAtBoundary_EmptyPrefix pins the surface
      via FindConfigForPath.

All existing tests still pass. Local run on this arm64 machine:
  go test ./pkg/registry/file/... ok in 0.93s

* deps: bump syft to v1.42.4 (CVE-2026-33481), stereoscope v0.1.22, runtime-spec v1.3.0

CodeRabbit upstream PR kubescape#326 finding #1 (Critical). Lifts syft from
v1.32.0 to v1.42.4 — the lowest version that includes the CVE-2026-33481
fix (Syft was failing to clean up temporary storage when a scan
errored mid-flight). v1.42.4 is the latest patch on the v1.42 line, so
the upgrade gets the CVE fix without taking the v1.43/v1.44 minor
bumps' larger surface-area changes.

go mod tidy pulled forward two transitive bumps:
  - stereoscope v0.1.9 → v0.1.22 (Anchore's syft-companion lib, must
    move in lockstep with syft)
  - runtime-spec v1.2.1 → v1.3.0 (transitive via stereoscope)
  - modernc.org/libc v1.67.6 → v1.70.0 (transitive)
  - modernc.org/sqlite v1.46.1 → v1.46.2 (transitive)

go.mod no longer carries a replace directive against kubescape/syft —
our fork doesn't have v1.42.3+ tags and the upstream content is
required for the CVE fix. Direct anchore/syft dependency is now the
source of truth.

Local validation:
  go build ./...          ok
  go test ./pkg/... ok in 1.4s (all 16 packages green)

Vendor directory is .gitignored; `go mod vendor` was rerun locally
to keep build/manifest in sync.

* fix: address 3 follow-up rabbit findings on PR #33

1. ci(manual-integration-tests): use job-level if: false instead of
   exit 1 — workflow shows as "skipped" not "failed red X" in the
   Actions UI when dispatched, removing the cosmetic noise while
   keeping the ~30-minute waste prevention. Issue #32 reference kept
   in the YAML comment above the conditional.

2. test(network_types_protobuf): also assert the Go struct's protobuf
   tag, not just the .proto-file text. The earlier test only checked
   one side of the schema/tag contract, so tag drift on the Go side
   would silently pass. Now both halves are pinned.

3. test(applicationprofile_processor): assert the FULL
   CollapseSettings struct (not just OpenDynamicThreshold). Without
   this, a regression that resets any field added to
   CollapseSettings in the future would silently pass the zero-value
   defensive-fallback test.

All from CodeRabbit follow-up review on storage PR #33. Local validation:
  go test ./pkg/registry/file/ ./pkg/apis/softwarecomposition/v1beta1/
  → both packages ok

* docs(networkmatch): correct spec URL to billofbehavior.com

---------

Co-authored-by: Entlein <eineintlein@gmail.com>

---------

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
Co-authored-by: Matthias Bertschy <matthias.bertschy@gmail.com>
Co-authored-by: Entlein <eineintlein@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant