Guard: detect stale endorsement reactions via timestamp comparison #4227

@lpcox

Motivation

Maintainer reactions (❤️, 👍) on issues, PRs, and comments are used by the github-guard to boost an object's integrity level to approved. However, GitHub objects are mutable — the owner can edit the body after a maintainer has endorsed it. This creates a post-endorsement prompt injection vector:

  1. User posts an innocuous issue comment
  2. Maintainer reacts with ❤️ → guard elevates integrity to approved
  3. User edits the comment to include a prompt injection
  4. Agent reads the comment — guard still sees the reaction and grants approved integrity

The guard currently performs no staleness check — a reaction from months ago on content edited yesterday is treated identically to a fresh endorsement on unmodified content.

Proposed Solution: Timestamp-Based Staleness Detection

Compare the object's updated_at timestamp against the endorsing reaction's createdAt timestamp. If the object was modified after the endorsement, the endorsement is stale and should not boost integrity.

if item.updated_at > reaction.created_at:
    skip this reaction (content changed since endorsement)

This is the simplest approach that covers the main attack vector with zero additional infrastructure (no companion comments, no hash computation, no extra workflows).

Why timestamps and not content hashes?

  • Zero UX friction — maintainers keep using reactions as-is
  • No extra infrastructure — no bot comments, no workflows triggered on reactions
  • Data already available: updated_at is on every GitHub object; reaction createdAt is available in GraphQL responses
  • No race conditions — the check happens at guard evaluation time, not at endorsement time

Content-hash-based endorsement (/endorse sha256:...) could be layered on later for high-integrity scenarios requiring cryptographic binding, but timestamps cover the primary threat model.

Edge cases

| Scenario | Behavior | Correct? |
|---|---|---|
| Content edited after reaction | Endorsement ignored | ✅ Safe default |
| Benign edit (typo fix) after reaction | Endorsement ignored | ✅ Requires re-endorsement (acceptable) |
| Content never edited | Endorsement honored | ✅ Normal case |
| updated_at == created_at (no edits) | Endorsement honored | |
| Reaction added after edit | Endorsement honored (reaction.createdAt > updated_at) | ✅ Maintainer saw final content |

False positive: edit-then-revert

A sophisticated attacker could edit content, inject a payload that gets read by an agent, then revert the edit. After the revert the content matches what the maintainer endorsed, but updated_at still shows an edit after the reaction, so the endorsement stays invalidated until a maintainer reacts again. However, if the agent already consumed the injected content before the revert, the damage is done. This is a narrow window, and timestamps still correctly invalidate the endorsement for subsequent reads.

Implementation Plan

Phase 1: Core staleness check in has_maintainer_reaction_with_callback()

File: guards/github-guard/rust-guard/src/labels/helpers.rs
Function: has_maintainer_reaction_with_callback() (line ~471)

Inside the reaction iteration loop, after extracting the reaction node:

// Extract the reaction's createdAt and the item's updatedAt / updated_at
let reaction_created = node.get("createdAt").and_then(|v| v.as_str());
let item_updated = item
    .get("updatedAt")
    .and_then(|v| v.as_str())
    .or_else(|| item.get("updated_at").and_then(|v| v.as_str()));

if let (Some(reaction_created), Some(item_updated)) = (reaction_created, item_updated) {
    if item_updated > reaction_created {
        // Content was modified after this endorsement, so skip it
        continue;
    }
}

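String comparison works here because GitHub returns ISO 8601 timestamps in a single fixed-width UTC form (2026-04-21T00:00:00Z), and strings in that form sort lexicographically in chronological order. The comparison would not be reliable if the two values used different UTC offsets or different fractional-second precision.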
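One way to make the string comparison defensive is to check the timestamp shape before comparing. The sketch below is illustrative only: the helper names `is_github_utc_timestamp` and `is_stale_checked` are hypothetical and not part of the guard, and it assumes both values should be exactly of the form `YYYY-MM-DDTHH:MM:SSZ`. A value such as `2026-04-21T02:00:00+02:00` is midnight UTC, yet it compares as later than `2026-04-21T01:00:00Z` when treated as a plain string, which is the kind of input the shape check rejects.

```rust
/// Hypothetical helper: true only for the fixed-width UTC form
/// "YYYY-MM-DDTHH:MM:SSZ" (20 ASCII bytes). It checks the shape only,
/// not whether the month, day, or time values are in range.
fn is_github_utc_timestamp(s: &str) -> bool {
    let bytes = s.as_bytes();
    if bytes.len() != 20 {
        return false;
    }
    bytes.iter().enumerate().all(|(i, &c)| match i {
        4 | 7 => c == b'-',
        10 => c == b'T',
        13 | 16 => c == b':',
        19 => c == b'Z',
        _ => c.is_ascii_digit(),
    })
}

/// Hypothetical helper: Some(true) if the item changed after the reaction,
/// Some(false) if it did not, and None if either timestamp is not in the
/// expected shape, leaving the caller to decide how to degrade.
fn is_stale_checked(item_updated: &str, reaction_created: &str) -> Option<bool> {
    if is_github_utc_timestamp(item_updated) && is_github_utc_timestamp(reaction_created) {
        // Same width, same zone: lexicographic order equals chronological order.
        Some(item_updated > reaction_created)
    } else {
        None
    }
}
```

Returning `None` rather than a boolean keeps the policy decision (honor or ignore the reaction when the format is unexpected) with the caller.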

Phase 2: Ensure updated_at is present in item data

File: guards/github-guard/rust-guard/src/labels/response_items.rs

Verify that updated_at / updatedAt is extracted from GitHub API responses for issues, PRs, and comments. REST responses include updated_at on these objects; GraphQL returns updatedAt only when the query selects it, so confirm the guard's GraphQL fragments include it.

Phase 3: Ensure reaction createdAt is in reaction nodes

File: The GraphQL query that fetches reactions (likely in the proxy or MCP server)

Verify that reaction nodes include createdAt. GitHub's GraphQL Reaction type (reached through ReactionConnection and ReactionEdge) exposes createdAt, but GraphQL returns only the fields a query selects. If only content and user are being fetched, add createdAt to the fragment.
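As a rough illustration of what the selection needs to contain, the query below requests updatedAt on the item and createdAt on each reaction node. This is a sketch, not the guard's actual query: the constant name `ISSUE_REACTIONS_QUERY`, the variable names, and the page size of 100 are assumptions, and the real fragment may live in the proxy or MCP server in a different shape.

```rust
/// Hypothetical query constant showing the two timestamp fields the
/// staleness check depends on: `updatedAt` on the issue and `createdAt`
/// on each reaction node.
const ISSUE_REACTIONS_QUERY: &str = r#"
query($owner: String!, $name: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    issue(number: $number) {
      updatedAt
      reactions(first: 100) {
        nodes {
          content
          createdAt
          user { login }
        }
      }
    }
  }
}
"#;
```

The same two fields would need to be selected on pull request and comment objects for the check to apply there.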

Phase 4: Logging and observability

When a reaction is skipped due to staleness, log it:

log_debug!("Skipping stale endorsement: reaction_created={}, item_updated={}, reactor={}", 
    reaction_created, item_updated, reactor_login);

This aids debugging without being noisy in production.

Phase 5: Tests

Add test cases to the existing endorsement test suite:

  1. Endorsement on unmodified item → integrity boosted ✅
  2. Endorsement on item edited after reaction → integrity NOT boosted ✅
  3. Endorsement added after last edit → integrity boosted ✅
  4. Multiple reactions, some stale, some fresh → only fresh ones count ✅
  5. Missing timestamps (graceful degradation) → current behavior (honor reaction) ✅
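The five cases above can be sketched against a simplified model of the check. This is not the guard's real test harness: the `Reaction` struct and `has_fresh_endorsement` function are hypothetical stand-ins for has_maintainer_reaction_with_callback(), the maintainer-role check is omitted, and timestamps are assumed to be in GitHub's `YYYY-MM-DDTHH:MM:SSZ` form.

```rust
/// Hypothetical, simplified reaction: only the field the staleness check needs.
struct Reaction<'a> {
    created_at: Option<&'a str>,
}

/// Hypothetical stand-in for the guard's check: true if at least one
/// reaction still counts as an endorsement of the item's current content.
fn has_fresh_endorsement(item_updated_at: Option<&str>, reactions: &[Reaction]) -> bool {
    reactions.iter().any(|r| match (item_updated_at, r.created_at) {
        // Both timestamps known: honor only if the item has not changed
        // since the reaction was created.
        (Some(updated), Some(created)) => updated <= created,
        // A timestamp is missing: fall back to current behavior (honor).
        _ => true,
    })
}
```

With this model, case 4 reduces to calling `has_fresh_endorsement` with one reaction older than the item's last update and one newer, and expecting `true`; passing only the older reaction should yield `false`.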

Files to modify

| File | Change |
|---|---|
| guards/github-guard/rust-guard/src/labels/helpers.rs | Add staleness check in has_maintainer_reaction_with_callback() (~line 471) |
| guards/github-guard/rust-guard/src/labels/helpers.rs | Add tests for staleness scenarios |
| GraphQL fragments (proxy/MCP server) | Ensure createdAt on reaction nodes, updatedAt on items |

Out of scope (future work)

  • Content-hash endorsement (/endorse sha256:...) — stronger guarantees but requires workflow infrastructure
  • Configurable staleness policy (e.g., endorsement-max-age-days) — useful but orthogonal to the edit-detection problem
  • Disapproval reaction staleness — same pattern but lower priority since disapproval already runs last and wins

Labels

enhancement (New feature or request)
