fix: [Security] fix task API ownership leakage #1182

Merged
qin-ctx merged 2 commits into volcengine:main from 13ernkastel:base-origin-main-security
Apr 3, 2026

Conversation

Contributor

@13ernkastel 13ernkastel commented Apr 2, 2026

Summary

  • require authentication for the task polling endpoints and scope task reads to the authenticated owner
  • persist task ownership metadata in the in-memory tracker without exposing it in API responses
  • scope async reindex task deduplication to the same owner so one tenant cannot block another tenant's background task on the same URI string

Root Cause

/api/v1/tasks and /api/v1/tasks/{task_id} were mounted without get_request_context, and TaskTracker stored tasks in a single global namespace with no owner metadata. Any caller who could reach the server could enumerate or poll background task records created by other users, and async reindex deduplication also keyed only on the raw resource URI.
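
The owner scoping described above can be sketched roughly as follows. This is a minimal illustration, not the patched TaskTracker: the class names `TaskRecord` and `OwnerScopedTracker`, the `owner` string, and the `to_public_dict` helper are all hypothetical, and the real tracker's fields and method signatures may differ.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TaskRecord:
    """Hypothetical task record; the real tracker's fields may differ."""

    task_id: str
    task_type: str
    owner: str  # ownership metadata, stored but never serialized
    status: str = "pending"

    def to_public_dict(self) -> dict:
        # Deliberately omits `owner` so API responses do not expose it.
        return {
            "task_id": self.task_id,
            "task_type": self.task_type,
            "status": self.status,
        }


class OwnerScopedTracker:
    """In-memory tracker whose reads are filtered by the caller's identity."""

    def __init__(self) -> None:
        self._tasks: Dict[str, TaskRecord] = {}

    def create(self, task_type: str, owner: str) -> TaskRecord:
        record = TaskRecord(task_id=uuid.uuid4().hex, task_type=task_type, owner=owner)
        self._tasks[record.task_id] = record
        return record

    def get(self, task_id: str, owner: str) -> Optional[TaskRecord]:
        # A task owned by someone else is indistinguishable from a missing one.
        record = self._tasks.get(task_id)
        if record is None or record.owner != owner:
            return None
        return record

    def list_tasks(self, owner: str) -> List[TaskRecord]:
        return [task for task in self._tasks.values() if task.owner == owner]
```

Returning `None` for a foreign task, rather than a distinct "forbidden" result, keeps the endpoint from confirming that a given task ID exists for another user.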

Impact

  • unauthenticated callers could enumerate recent background tasks and retrieve per-task metadata
  • exposed metadata could include task type, task status, resource identifiers, archive URIs, result payloads, and sanitized error strings from other users' activity
  • in multi-tenant deployments, async reindex deduplication could also create low-grade cross-tenant interference when different users referenced the same URI string
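
The cross-tenant interference in the last bullet comes from deduplicating on the URI alone. A minimal sketch of owner-scoped deduplication, assuming a hypothetical `ReindexDeduper` helper and plain string owner identifiers (neither appears in this PR):

```python
from typing import Dict, Optional, Tuple


class ReindexDeduper:
    """Tracks in-flight reindex tasks keyed by (owner, uri) instead of uri alone."""

    def __init__(self) -> None:
        self._in_flight: Dict[Tuple[str, str], str] = {}

    def try_start(self, owner: str, uri: str, task_id: str) -> Optional[str]:
        """Register a task; return the existing task ID if this owner already has one."""
        key = (owner, uri)
        existing = self._in_flight.get(key)
        if existing is not None:
            return existing
        self._in_flight[key] = task_id
        return None

    def finish(self, owner: str, uri: str) -> None:
        # Clear the slot so the same owner can reindex the URI again later.
        self._in_flight.pop((owner, uri), None)
```

With the owner in the key, two tenants that happen to reference the same URI string each get their own slot, so one tenant's running task no longer blocks the other's.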

Proof Of Concept

The issue is reproducible on the vulnerable code path before this patch:

  1. Create a background task as a normal authenticated user:
    API_KEY='<valid-user-key>'
    BASE='http://127.0.0.1:1933'
    
    SESSION_ID=$(curl -s -X POST "$BASE/api/v1/sessions" \
      -H "X-API-Key: $API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{}' | jq -r '.result.session_id')
    
    curl -s -X POST "$BASE/api/v1/sessions/$SESSION_ID/messages" \
      -H "X-API-Key: $API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{"role":"user","content":"hello"}' >/dev/null
    
    TASK_ID=$(curl -s -X POST "$BASE/api/v1/sessions/$SESSION_ID/commit" \
      -H "X-API-Key: $API_KEY" \
      -H 'Content-Type: application/json' \
      -d '{}' | jq -r '.result.task_id')
  2. From an unauthenticated client, enumerate tasks:
    curl -s "$BASE/api/v1/tasks" | jq
    Before the fix, this returns the other user's task metadata instead of 401.
  3. From the same unauthenticated client, retrieve the specific task:
    curl -s "$BASE/api/v1/tasks/$TASK_ID" | jq
    Before the fix, this returns the task record for another user's background job instead of 401/404.
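
Steps 2 and 3 can also be scripted as a regression probe. The following is a sketch using only the standard library; `probe_status` and `leaks_without_auth` are hypothetical helper names, and the injectable `opener` parameter exists only so the logic can be exercised without a running server.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def probe_status(
    url: str,
    api_key: Optional[str] = None,
    opener: Callable = urllib.request.urlopen,
) -> int:
    """Issue a GET and return the HTTP status code, including 4xx/5xx codes."""
    headers = {"X-API-Key": api_key} if api_key else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # urlopen raises on 4xx/5xx; the code is what we want to inspect.
        return exc.code


def leaks_without_auth(status: int) -> bool:
    """A 2xx answer to a request that carried no API key indicates the leak."""
    return 200 <= status < 300
```

Against a local server, `leaks_without_auth(probe_status(BASE + "/api/v1/tasks"))` would be expected to be true before the fix and false after it, matching the 401 described in step 2.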

Classification

  • Primary CWE: CWE-862 Missing Authorization
  • Secondary CWE: CWE-200 Exposure of Sensitive Information to an Unauthorized Actor
  • CVSS v3.1: 5.3 Medium (AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N)
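
The 5.3 score can be reproduced from the vector with the CVSS v3.1 base-score equations. A minimal sketch, limited to Scope: Unchanged vectors; the function names are illustrative, while the metric weights and rounding rule follow the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 metric weights (PR values are the Scope: Unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a vector such as 'AV:N/AC:L/.../S:U/...'."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged("AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N"))  # 5.3
```

Here the impact sub-score is 6.42 × 0.22 ≈ 1.41 and the exploitability sub-score is about 3.89, which sum to roughly 5.30 and round up to 5.3.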

Validation

  • uvx ruff format --check openviking/service/task_tracker.py openviking/server/routers/tasks.py openviking/server/routers/content.py openviking/session/session.py tests/test_task_tracker.py tests/server/test_auth.py tests/server/test_api_content.py tests/test_session_task_tracking.py
  • uvx ruff check openviking/service/task_tracker.py openviking/server/routers/tasks.py openviking/server/routers/content.py openviking/session/session.py tests/test_task_tracker.py tests/server/test_auth.py tests/server/test_api_content.py tests/test_session_task_tracking.py
  • OPENVIKING_CONFIG_FILE=/tmp/openviking-security-test-ov.conf PYTHONPATH=/Users/lennonchia/Documents/Project/OpenViking-security /Users/lennonchia/Documents/Project/OpenViking/.venv/bin/python -m pytest tests/test_task_tracker.py tests/server/test_auth.py::test_task_endpoints_require_auth tests/server/test_auth.py::test_task_endpoints_are_user_scoped tests/server/test_api_content.py::test_reindex_uses_request_tenant_for_exists -q

Notes

  • I checked for an existing upstream/internal duplicate around task endpoint auth and task ownership leakage before landing this branch and did not find one.
  • This branch is based on upstream main plus this single security fix; it does not include unrelated fork-only changes.


github-actions bot commented Apr 2, 2026

Failed to generate code suggestions for PR

@13ernkastel 13ernkastel changed the title from "[codex] fix task API ownership leakage" to "[Security] fix task API ownership leakage" Apr 2, 2026
@13ernkastel 13ernkastel marked this pull request as ready for review April 2, 2026 11:55

github-actions bot commented Apr 2, 2026

Failed to generate code suggestions for PR

@13ernkastel 13ernkastel changed the title from "[Security] fix task API ownership leakage" to "fix: [Security] fix task API ownership leakage" Apr 2, 2026
@qin-ctx qin-ctx self-assigned this Apr 3, 2026
@qin-ctx qin-ctx self-requested a review April 3, 2026 03:10
@qin-ctx qin-ctx merged commit 8c1c3f3 into volcengine:main Apr 3, 2026
12 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Apr 3, 2026
zeattacker pushed a commit to zeattacker/OpenViking that referenced this pull request Apr 10, 2026
Upstream catch-up from volcengine/OpenViking main at a18d4b9.

Auto-merged (24 files): README, bot/config, crates/ov_cli, session/session.py,
session/compressor{,_v2}.py, session/memory_deduplicator.py, memory_extractor.py,
schema_model_generator.py, utils/uri.py, storage/viking_fs.py,
storage/transaction/lock_manager.py, storage/queuefs/semantic_processor.py,
retrieve/hierarchical_retriever.py, server/routers/content.py, server/routers/search.py,
core/directories.py, prompts/templates/memory/preferences.yaml,
examples/openclaw-plugin/{index,text-utils}.ts, examples/ov.conf.example,
openviking_cli/utils/config/{memory_config,open_viking_config}.py,
docs/en/concepts/08-session.md.

Hand-resolved conflicts (14 files):

Memory subsystem:
- entities.yaml: adopt upstream category field + event linking, keep our
  brain-hardening rules (canonical names, Aliases section, one-card-per-entity).
  Removed upstream's Caroline LoCoMo leak.
- events.yaml: adopt upstream year/month/day folder template.
- memory_updater.py: keep dev's two-function _apply_write/_apply_edit structure
  with collision-safe entity writes, port get_year/get_month/get_day helpers
  to ExtractContext for events.yaml template to work.
- session_extract_context_provider.py: keep dev's [Keywords] bug fix
  (_derive_search_keywords) — upstream volcengine#1159 did not address this.
- extract_loop.py: keep dev's small-model extraction path. Dev's hard_cap
  iteration extension is more sophisticated than upstream's version.

Storage:
- collection_schemas.py: combine dev's BGE-M3 truncation (30k chars) with
  upstream volcengine#1301 embed_compat async wrapper.
- queuefs/semantic_dag.py: combine dev's old_summary hash comparison with
  upstream's defensive null check on cached summary.
- queuefs/semantic_queue.py: keep dev's 300-second dedup window with
  _TrackedSemanticRequest dataclass.
- utils/summarizer.py: take upstream — convergent fix, upstream is superset
  of our context_type centralization plus resource cover handling.

Models:
- openai_vlm.py: merge timeout signature (float | None = 60.0).

Plugin (TypeScript):
- config.ts: keep dev's profileInjection/recallFormat/alignment fields,
  add upstream's bypassSessionPatterns deprecation alias.
- client.ts: merge addSessionMessage signature
  (sessionId, role, content, parts?, agentId?, createdAt?), keep dev's
  createSession(), combine body building with optional created_at.
- context-engine.ts: keep dev's driftDetector + alignment, add upstream's
  isBypassedSession helper + bypass early-return in doCommit/afterTurn.
  Drop upstream's inline afterTurn commit block — dev routes through
  doCommitOVSession. Update addSessionMessage call to 6-arg form.
- memory-ranking.ts: keep dev's multi-slot recall + tool-experience
  separation architecture.

Deferred upstream changes:
- volcengine#1221 agfs→ragfs Rust rewrite: new Rust crates land but Python code
  unchanged; pyagfs handles auto-fallback via RAGFS_IMPL env var.
- volcengine#1159 memory_updater unified operations refactor: kept dev's separate
  write/edit function structure to avoid cascading schema changes through
  extract_loop and related files.

Critical security + bug fixes preserved from upstream:
- volcengine#1319 leaked token removed from settings.py
- volcengine#1133 SSRF hardening on HTTP resource ingestion
- volcengine#1297 sanitize and cap recall queries (wraps our plugin recall)
- volcengine#1277 configurable embedding circuit breaker
- volcengine#1279 trusted mode without API key restricted to localhost
- volcengine#1211 PID lock recycle + ownership checks + compressor refs
- volcengine#1182 task API ownership leakage fix in content.py
- volcengine#1301 embedder async contention reduction
- volcengine#1226 prevent VLM blocking startup hang in redo recovery
- volcengine#769/volcengine#792 queuefs dedupe memory semantic parent enqueues

Critical dev fixes preserved:
- [Keywords] placeholder root-cause fix (brain-hardening plan phase 1)
- Episodic memory v2 (retrieval scope, category boosts, archive filter)
- Qwen-9B small-model extraction compatibility
- Plugin recall refactor (tool-experience separation, multi-tier recall)
- Context-type centralized inference (convergent with upstream fix)
- BGE-M3 embedding input truncation
- URI normalization + collision-safe entity writes

Merged on dev-local2 safety branch. Next steps: build + test before
fast-forwarding dev.

Co-Authored-By: claude-flow <ruv@ruv.net>