feat: API key auth and identity for processing services#1194
feat: API key auth and identity for processing services#1194
Conversation
✅ Deploy Preview for antenna-preview canceled.
|
✅ Deploy Preview for antenna-ssec canceled.
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds ProcessingService API key auth, client-info capture and persistence, converts /jobs/{id}/tasks from GET→POST with a TasksRequestSerializer, extends ProcessingService (model, serializer, admin), adds key generation endpoints and frontend UI, and includes worker registration script and DB migration. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Processing Service
participant Auth as APIKey Auth
participant API as DRF API
participant Model as ProcessingService Model
participant DB as Database
Client->>API: POST /api/v2/jobs/{id}/tasks\nAuthorization: Api-Key ant_ps_<key>\n{"batch":N,"client_info":{...}}
API->>Auth: authenticate(request)
Auth->>DB: lookup key by prefix
DB-->>Auth: key record
Auth-->>API: (AnonymousUser, ProcessingService)
API->>Model: _update_processing_service_heartbeat(request.auth, client_info)
Model->>DB: save last_seen_client_info + timestamps
API->>DB: reserve tasks
DB-->>API: queued tasks
API-->>Client: 200 + tasks payload
sequenceDiagram
participant Admin as Admin UI
participant Hook as useGenerateAPIKey
participant API as REST API
participant View as ProcessingServiceViewSet
participant DB as Database
Admin->>Hook: click Generate API Key
Hook->>API: POST /processing-services/{id}/generate_key/
API->>View: generate_key handler
View->>DB: revoke non-revoked keys
View->>DB: create new API key (prefix + hashed)
DB-->>View: new key created
View-->>API: return plaintext api_key
API-->>Hook: response with api_key
Hook-->>Admin: display key once
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (11)
processing_services/minimal/start.sh (2)
4-14: Consider signal forwarding for graceful shutdown.When the container receives
SIGTERM, thewaitcommand is interrupted but the background server process doesn't receive the signal. This can prevent graceful shutdown of the FastAPI server.Optional: trap and forward signals
#!/bin/bash set -e +# Forward signals to child process +trap 'kill -TERM $SERVER_PID 2>/dev/null' TERM INT + # Start FastAPI server in background python /app/main.py & SERVER_PID=$!🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processing_services/minimal/start.sh` around lines 4 - 14, The script starts the FastAPI server in background (SERVER_PID) and uses wait, but doesn't forward container signals to the child; add a trap to catch SIGTERM and SIGINT (and optionally SIGHUP), and forward them to the background process (kill -TERM $SERVER_PID) so the FastAPI process can shutdown gracefully, then re-wait for SERVER_PID and exit with its status; update the start.sh logic around SERVER_PID, the wait call, and add a trap handler to ensure proper signal forwarding and cleanup.
9-11: Registration failure will stop the container.With
set -e, ifregister.pyexits with a non-zero status (which it does after exhausting retries), the entire container will terminate—even though the FastAPI server may be running fine.If you want the server to continue running despite registration failures (falling back to "push-mode" behavior), prevent the registration exit code from being fatal:
Proposed fix: allow registration to fail non-fatally
# Run registration if API key is configured if [ -n "$ANTENNA_API_KEY" ]; then - python /app/register.py + python /app/register.py || echo "Registration failed, continuing in push-mode" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processing_services/minimal/start.sh` around lines 9 - 11, The container is exiting when register.py returns a non-zero status because the script is running under set -e; change the registration invocation to not propagate failures by invoking python /app/register.py in a way that ignores its exit code (for example: python /app/register.py || true or by temporarily disabling set -e around the call), so when ANTENNA_API_KEY is set the script still runs but a failing register.py will not stop the container or the FastAPI server; keep the existing ANTENNA_API_KEY conditional and ensure any error is at least logged (e.g., echo) so failures remain visible.ami/jobs/schemas.py (1)
28-30: LGTM!The
TasksRequestSerializercorrectly validates the POST body for the/tasksendpoint refactor. Thebatchfield withmin_value=1prevents invalid requests, andclient_infobeing optional maintains backwards compatibility.Optional consideration: You may want to add a
max_valueconstraint onbatchto prevent excessively large task requests that could strain resources.,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/jobs/schemas.py` around lines 28 - 30, TasksRequestSerializer currently enforces min_value=1 for the batch IntegerField but has no upper bound; add a max_value constraint on the batch field (in the TasksRequestSerializer definition) to cap how many tasks can be requested at once (e.g., a sensible constant like 1000 or a configurable setting), so update the batch field declaration to include max_value and adjust any related tests or docs that assume unlimited batch sizes.ami/ml/models/processing_service.py (1)
351-351: Use a throwaway variable for the unused plaintext key value.This avoids lint noise and makes intent explicit.
🧹 Suggested cleanup
- api_key_obj, plaintext_key = ProcessingServiceAPIKey.objects.create_key( + api_key_obj, _plaintext_key = ProcessingServiceAPIKey.objects.create_key( name=f"{name} key", processing_service=service, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/models/processing_service.py` at line 351, The call to ProcessingServiceAPIKey.objects.create_key(...) currently assigns the returned plaintext key to plaintext_key which is unused; replace that local variable with a throwaway name (e.g., _plaintext_key) so intent is explicit and linters stop flagging it—leave api_key_obj as-is and only change the plaintext_key binding.ami/ml/views.py (1)
321-325: Consider moving the import to module level.The inline import of
get_client_infoinside the method works but is unconventional. Module-level imports improve readability and allow import errors to surface at startup.💡 Proposed change
Move the import to the top of the file with other imports:
from ami.ml.models.api_key import ProcessingServiceAPIKey +from ami.ml.serializers_client_info import get_client_info from ami.ml.schemas import PipelineRegistrationResponseThen remove the inline import at line 322.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/views.py` around lines 321 - 325, Move the inline import of get_client_info out of the method and into the module-level imports at the top of the file (alongside the other imports), then remove the inline `from ami.ml.serializers_client_info import get_client_info` inside the view; ensure any references to `get_client_info(request)` in the method (and to `processing_service.last_seen_client_info` / `processing_service.mark_seen`) still resolve and run linters/tests to catch import-order or unused-import warnings.ami/jobs/views.py (1)
349-356: Consider narrowing the exception catch.The bare
except Exceptionis flagged by static analysis. While a catch-all may be intentional here to prevent unhandled errors from reaching clients, consider catching more specific exceptions or at least logging the exception type.💡 Proposed refinement
except pydantic.ValidationError as e: raise ValidationError(f"Invalid result data: {e}") from e except Exception as e: - logger.error("Failed to queue pipeline results for job %s: %s", job.pk, e) + logger.exception("Failed to queue pipeline results for job %s", job.pk) return Response( { "status": "error", "job_id": job.pk, }, status=500, )Using
logger.exceptionincludes the full traceback which aids debugging.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/jobs/views.py` around lines 349 - 356, Replace the bare "except Exception as e:" block in the pipeline queuing code (the block referencing logger.error with job.pk and the Response) with a narrower catch or enhanced logging: either catch the specific exceptions that can be raised by the queuing functions (e.g., the function/method that enqueues results) or, if a broad catch is required, use logger.exception (or logger.error(..., exc_info=True)) to include the full traceback and exception type, keep returning the same Response payload, and ensure you still reference job.pk in the log call for context.docs/superpowers/plans/2026-03-27-processing-service-api-keys.md (1)
14-20: Documentation may not reflect final implementation.The Architecture section describes a custom
api_keyfield on ProcessingService generated viasecrets.token_urlsafe(36), but the actual implementation uses thedjangorestframework-api-keylibrary with a separateProcessingServiceAPIKeymodel.Consider updating this section to reflect the library-backed approach, or note that this was the initial design that evolved during implementation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/plans/2026-03-27-processing-service-api-keys.md` around lines 14 - 20, Update the "Authentication Model" docs to reflect the implemented approach: replace the statement that ProcessingService has a custom `api_key` generated via `secrets.token_urlsafe(36)` with a note that we use the `djangorestframework-api-key` library and a separate `ProcessingServiceAPIKey` model, and clarify that the custom DRF auth backend `ProcessingServiceAPIKeyAuthentication` still reads `Authorization: Bearer ...` and places the PS identity on `request.auth` (not `request.user`); alternatively, add a short note that the original design used a built-in `api_key` field but the implementation evolved to the `ProcessingServiceAPIKey` model via the library.ami/ml/serializers_client_info.py (1)
30-35: Silent validation failure may hinder debugging.When
ClientInfoSerializervalidation fails, the function silently returns an empty dict (plus server fields). This could mask malformedclient_infopayloads from workers, making issues harder to diagnose.Consider logging validation errors at DEBUG or WARNING level:
💡 Proposed improvement
serializer = ClientInfoSerializer(data=raw) if serializer.is_valid(): info = serializer.validated_data else: + import logging + logger = logging.getLogger(__name__) + logger.debug("Invalid client_info ignored: %s", serializer.errors) info = {}(Or move the logger to module level if preferred.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/serializers_client_info.py` around lines 30 - 35, The code silently swallows ClientInfoSerializer validation failures (the block around request.data.get("client_info"), ClientInfoSerializer, serializer.is_valid(), and info), which makes malformed payloads hard to debug; change the branch where serializer.is_valid() is False to log serializer.errors (at DEBUG or WARNING) and include context (e.g., the raw payload) via the module-level logger so failures are visible in logs rather than returning an empty dict silently.ui/src/pages/project/processing-services/processing-services-actions.tsx (1)
51-57: Unhandled promise and potential state update after unmount.Two minor issues in
handleCopy:
navigator.clipboard.writeTextcan reject (e.g., in non-HTTPS contexts or if permissions are denied), but the error isn't caught.- The
setTimeoutcallback may fire after the component unmounts, causing a React warning.💡 Proposed fix
+import { useEffect, useRef, useState } from 'react' ... export const GenerateAPIKey = ({ processingService, }: { processingService: ProcessingService }) => { const { generateAPIKey, isLoading, error, apiKey } = useGenerateAPIKey() const [copied, setCopied] = useState(false) + const timeoutRef = useRef<NodeJS.Timeout | null>(null) + + useEffect(() => { + return () => { + if (timeoutRef.current) clearTimeout(timeoutRef.current) + } + }, []) const handleCopy = async () => { if (apiKey) { - await navigator.clipboard.writeText(apiKey) - setCopied(true) - setTimeout(() => setCopied(false), 2000) + try { + await navigator.clipboard.writeText(apiKey) + setCopied(true) + timeoutRef.current = setTimeout(() => setCopied(false), 2000) + } catch { + // Clipboard API not available or permission denied + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/pages/project/processing-services/processing-services-actions.tsx` around lines 51 - 57, The handleCopy function currently awaits navigator.clipboard.writeText without catching rejections and uses setTimeout which can call setCopied after unmount; wrap the clipboard write in a try/catch inside handleCopy (handle or log the error and avoid throwing) and make the timeout safe by storing its ID and clearing it on unmount or by guarding the callback with an isMounted ref; specifically update the handleCopy function and the component that owns setCopied to catch errors from navigator.clipboard.writeText and to clear the timeout (or check isMounted) before calling setCopied(false).ami/ml/serializers.py (1)
177-179: Potential N+1 queries with filter() on prefetched relation.The
filter(revoked=False)call onobj.api_keysbypasses theprefetch_related("api_keys")cache from the ViewSet, executing a new database query for each serialized object. Django's prefetch cache only applies to direct.all()access; filtering creates a new query.💡 Option 1: Filter in Python from prefetched data
def get_api_key_prefix(self, obj): - latest_key = obj.api_keys.filter(revoked=False).order_by("-created").first() - return latest_key.prefix if latest_key else None + non_revoked = [k for k in obj.api_keys.all() if not k.revoked] + if non_revoked: + latest_key = max(non_revoked, key=lambda k: k.created) + return latest_key.prefix + return None💡 Option 2: Use Prefetch with filtered queryset in ViewSet
from django.db.models import Prefetch from ami.ml.models.api_key import ProcessingServiceAPIKey queryset = ProcessingService.objects.all().prefetch_related( Prefetch( "api_keys", queryset=ProcessingServiceAPIKey.objects.filter(revoked=False).order_by("-created"), to_attr="active_api_keys" ) )Then in serializer:
def get_api_key_prefix(self, obj): keys = getattr(obj, "active_api_keys", []) return keys[0].prefix if keys else None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/serializers.py` around lines 177 - 179, The serializer's get_api_key_prefix method calls obj.api_keys.filter(...), which bypasses any prefetch_related cache and causes N+1 queries; instead either (A) iterate/filter the already-prefetched obj.api_keys in Python inside get_api_key_prefix (e.g., pick the non-revoked key with the latest created value from obj.api_keys) or (B) adjust the ViewSet queryset to prefetch a filtered relation using django.db.models.Prefetch with ProcessingServiceAPIKey.objects.filter(revoked=False).order_by("-created") into a to_attr like active_api_keys and then update get_api_key_prefix to read the first entry from getattr(obj, "active_api_keys", []) to return its prefix or None.ami/ml/tests.py (1)
1487-1490: Prefix unused variable with underscore.The
new_objvariable is never used. Following Python convention, prefix it with underscore to indicate it's intentionally unused.🧹 Suggested fix
- new_obj, new_key = ProcessingServiceAPIKey.objects.create_key( + _, new_key = ProcessingServiceAPIKey.objects.create_key( name="key-2", processing_service=ps, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/tests.py` around lines 1487 - 1490, The variable new_obj returned from ProcessingServiceAPIKey.objects.create_key(...) is unused; rename it to _new_obj to follow Python convention for intentionally unused variables. Update the call site where new_obj and new_key are assigned (the ProcessingServiceAPIKey.objects.create_key invocation) to use _new_obj, leaving new_key unchanged so subsequent logic continues to use new_key.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/base/permissions.py`:
- Around line 232-240: The permission check currently permits any authenticated
processing service when view lacks get_active_project; change has_permission to
fail closed by returning False if getattr(view, "get_active_project", None) is
missing, and also treat a missing project instance as denial (i.e., call
get_active_project(), and if project is falsy or
request.auth.projects.filter(pk=project.pk).exists() is False, return False);
only return True when get_active_project exists, returns a valid project, and
that project is linked to request.auth.projects.
In `@ami/ml/migrations/0029_processingserviceapikey.py`:
- Around line 23-29: The migration defines the ProcessingServiceAPIKey model's
"name" CharField with default=None but no null=True, which causes Django to
store the string "None"; update the field definition for "name" in the
0029_processingserviceapikey migration to use a safe default and allow empty
input — e.g., change to default="" and add blank=True on the CharField (or, if
you need to allow nulls, remove the default and set blank=True, null=True) so
the database won't receive the literal "None" string.
In `@ami/ml/migrations/0030_remove_custom_api_key_fields.py`:
- Around line 1-24: Add a data migration to preserve existing API keys before
removing fields: create a RunPython operation (e.g., function migrate_api_keys)
that uses apps.get_model to load ProcessingService and ProcessingServiceAPIKey,
iterates ProcessingService objects with non-null api_key, and creates
corresponding ProcessingServiceAPIKey entries populating processing_service,
prefix (use api_key_prefix or first 8 chars), hashed_key (from api_key), and
created (use api_key_created_at or timezone.now()); add this RunPython to the
migration sequence (either in 0029 or as a new migration before 0030) and use
migrations.RunPython.noop as the reverse operation.
In `@docs/claude/prompts/NEXT_SESSION_PROMPT.md`:
- Line 25: Update the outdated permission class name in the prompt: find the
occurrence of "HasProcessingServiceKey" (e.g., the line stating
"`HasProcessingServiceKey` permission class checks project membership via PS's
`projects` M2M") and change it to "HasProcessingServiceAPIKey" so the prompt
matches the implemented backend class; also scan the same document for any other
occurrences of HasProcessingServiceKey and replace them to keep names
consistent.
In `@processing_services/minimal/register.py`:
- Around line 91-101: The retry loop around
requests.get("http://localhost:2000/livez", timeout=2) only catches
requests.ConnectionError so timeouts will crash; update the exception handling
in the for attempt in range(MAX_RETRIES) loop to also catch
requests.exceptions.Timeout (or broader requests.RequestException) and treat it
the same as ConnectionError so the code logs the retry message, sleeps using
RETRY_DELAY, and exits only after MAX_RETRIES; reference the requests.get call,
the for attempt in range(MAX_RETRIES) loop, MAX_RETRIES, RETRY_DELAY, and logger
when making the change.
- Around line 108-117: The retry loop around register_with_antenna currently
only catches requests.ConnectionError but register_with_antenna uses timeout=30
and can raise requests.Timeout; update the except clause in the loop that calls
register_with_antenna(api_url, api_key, project_id, pipelines, client_info) to
catch both requests.ConnectionError and requests.Timeout (e.g., except
(requests.ConnectionError, requests.Timeout):) so timeouts are retried the same
way as connection errors; ensure requests.Timeout is referenced from the
requests module used elsewhere in this file.
- Around line 103-105: The call to get_own_pipeline_configs() is unguarded and
can raise if the local server is unavailable; wrap the call to
get_own_pipeline_configs() (and ideally the follow-up get_client_info() usage)
in a try/except block that catches Exception, logs an informative error
including the caught exception (e.g., using the module's logger or process
logger) and then exits with a non-zero status (sys.exit(1)) or re-raises a
RuntimeError with context; ensure the log message names
get_own_pipeline_configs() so it's clear where the failure occurred.
In
`@ui/src/pages/processing-service-details/processing-service-details-dialog.tsx`:
- Around line 97-137: The new hardcoded UI strings in the Processing Service
dialog need to be localized; replace literal strings like "Authentication", "API
Key Prefix", "Mode", "Last Known Worker", "Hostname", "Software", "Platform",
and "Remote Address" with translate(...) calls and corresponding i18n keys.
Update the JSX that renders FormSection(title=...) and InputValue(label=...,
value=...) (and the GenerateAPIKey usage if it renders text) to use
translate('processingService.dialog.<key>') keys, add matching entries to the
locale resource files, and ensure the value fallbacks (e.g. 'No key generated',
'Pull (async)', 'Push (sync)') are also localized via translate(...) calls
referencing new keys.
---
Nitpick comments:
In `@ami/jobs/schemas.py`:
- Around line 28-30: TasksRequestSerializer currently enforces min_value=1 for
the batch IntegerField but has no upper bound; add a max_value constraint on the
batch field (in the TasksRequestSerializer definition) to cap how many tasks can
be requested at once (e.g., a sensible constant like 1000 or a configurable
setting), so update the batch field declaration to include max_value and adjust
any related tests or docs that assume unlimited batch sizes.
In `@ami/jobs/views.py`:
- Around line 349-356: Replace the bare "except Exception as e:" block in the
pipeline queuing code (the block referencing logger.error with job.pk and the
Response) with a narrower catch or enhanced logging: either catch the specific
exceptions that can be raised by the queuing functions (e.g., the
function/method that enqueues results) or, if a broad catch is required, use
logger.exception (or logger.error(..., exc_info=True)) to include the full
traceback and exception type, keep returning the same Response payload, and
ensure you still reference job.pk in the log call for context.
In `@ami/ml/models/processing_service.py`:
- Line 351: The call to ProcessingServiceAPIKey.objects.create_key(...)
currently assigns the returned plaintext key to plaintext_key which is unused;
replace that local variable with a throwaway name (e.g., _plaintext_key) so
intent is explicit and linters stop flagging it—leave api_key_obj as-is and only
change the plaintext_key binding.
In `@ami/ml/serializers_client_info.py`:
- Around line 30-35: The code silently swallows ClientInfoSerializer validation
failures (the block around request.data.get("client_info"),
ClientInfoSerializer, serializer.is_valid(), and info), which makes malformed
payloads hard to debug; change the branch where serializer.is_valid() is False
to log serializer.errors (at DEBUG or WARNING) and include context (e.g., the
raw payload) via the module-level logger so failures are visible in logs rather
than returning an empty dict silently.
In `@ami/ml/serializers.py`:
- Around line 177-179: The serializer's get_api_key_prefix method calls
obj.api_keys.filter(...), which bypasses any prefetch_related cache and causes
N+1 queries; instead either (A) iterate/filter the already-prefetched
obj.api_keys in Python inside get_api_key_prefix (e.g., pick the non-revoked key
with the latest created value from obj.api_keys) or (B) adjust the ViewSet
queryset to prefetch a filtered relation using django.db.models.Prefetch with
ProcessingServiceAPIKey.objects.filter(revoked=False).order_by("-created") into
a to_attr like active_api_keys and then update get_api_key_prefix to read the
first entry from getattr(obj, "active_api_keys", []) to return its prefix or
None.
In `@ami/ml/tests.py`:
- Around line 1487-1490: The variable new_obj returned from
ProcessingServiceAPIKey.objects.create_key(...) is unused; rename it to _new_obj
to follow Python convention for intentionally unused variables. Update the call
site where new_obj and new_key are assigned (the
ProcessingServiceAPIKey.objects.create_key invocation) to use _new_obj, leaving
new_key unchanged so subsequent logic continues to use new_key.
In `@ami/ml/views.py`:
- Around line 321-325: Move the inline import of get_client_info out of the
method and into the module-level imports at the top of the file (alongside the
other imports), then remove the inline `from ami.ml.serializers_client_info
import get_client_info` inside the view; ensure any references to
`get_client_info(request)` in the method (and to
`processing_service.last_seen_client_info` / `processing_service.mark_seen`)
still resolve and run linters/tests to catch import-order or unused-import
warnings.
In `@docs/superpowers/plans/2026-03-27-processing-service-api-keys.md`:
- Around line 14-20: Update the "Authentication Model" docs to reflect the
implemented approach: replace the statement that ProcessingService has a custom
`api_key` generated via `secrets.token_urlsafe(36)` with a note that we use the
`djangorestframework-api-key` library and a separate `ProcessingServiceAPIKey`
model, and clarify that the custom DRF auth backend
`ProcessingServiceAPIKeyAuthentication` still reads `Authorization: Bearer ...`
and places the PS identity on `request.auth` (not `request.user`);
alternatively, add a short note that the original design used a built-in
`api_key` field but the implementation evolved to the `ProcessingServiceAPIKey`
model via the library.
In `@processing_services/minimal/start.sh`:
- Around line 4-14: The script starts the FastAPI server in background
(SERVER_PID) and uses wait, but doesn't forward container signals to the child;
add a trap to catch SIGTERM and SIGINT (and optionally SIGHUP), and forward them
to the background process (kill -TERM $SERVER_PID) so the FastAPI process can
shutdown gracefully, then re-wait for SERVER_PID and exit with its status;
update the start.sh logic around SERVER_PID, the wait call, and add a trap
handler to ensure proper signal forwarding and cleanup.
- Around line 9-11: The container is exiting when register.py returns a non-zero
status because the script is running under set -e; change the registration
invocation to not propagate failures by invoking python /app/register.py in a
way that ignores its exit code (for example: python /app/register.py || true or
by temporarily disabling set -e around the call), so when ANTENNA_API_KEY is set
the script still runs but a failing register.py will not stop the container or
the FastAPI server; keep the existing ANTENNA_API_KEY conditional and ensure any
error is at least logged (e.g., echo) so failures remain visible.
In `@ui/src/pages/project/processing-services/processing-services-actions.tsx`:
- Around line 51-57: The handleCopy function currently awaits
navigator.clipboard.writeText without catching rejections and uses setTimeout
which can call setCopied after unmount; wrap the clipboard write in a try/catch
inside handleCopy (handle or log the error and avoid throwing) and make the
timeout safe by storing its ID and clearing it on unmount or by guarding the
callback with an isMounted ref; specifically update the handleCopy function and
the component that owns setCopied to catch errors from
navigator.clipboard.writeText and to clear the timeout (or check isMounted)
before calling setCopied(false).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 088f5872-0dee-44c6-842c-d49efda18428
📒 Files selected for processing (28)
ami/base/permissions.pyami/jobs/schemas.pyami/jobs/tests/test_jobs.pyami/jobs/views.pyami/ml/auth.pyami/ml/migrations/0028_add_api_key_fields.pyami/ml/migrations/0029_processingserviceapikey.pyami/ml/migrations/0030_remove_custom_api_key_fields.pyami/ml/models/__init__.pyami/ml/models/api_key.pyami/ml/models/processing_service.pyami/ml/serializers.pyami/ml/serializers_client_info.pyami/ml/tests.pyami/ml/views.pyconfig/settings/base.pydocs/claude/prompts/NEXT_SESSION_PROMPT.mddocs/superpowers/plans/2026-03-27-processing-service-api-keys.mdprocessing_services/minimal/Dockerfileprocessing_services/minimal/register.pyprocessing_services/minimal/start.shrequirements/base.txtui/src/data-services/hooks/processing-services/useGenerateAPIKey.tsui/src/data-services/models/processing-service.tsui/src/pages/processing-service-details/processing-service-details-dialog.tsxui/src/pages/project/entities/details-form/processing-service-details-form.tsxui/src/pages/project/processing-services/processing-services-actions.tsxui/src/pages/project/processing-services/processing-services-columns.tsx
ui/src/pages/processing-service-details/processing-service-details-dialog.tsx
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Adds API key authentication and per-worker identity tracking for processing services (ML workers), while refactoring worker-facing job endpoints (/tasks and /result) toward safer HTTP semantics and structured request validation; also includes initial UI support for generating/displaying processing-service API keys.
Changes:
- Introduces
ProcessingServiceAPIKey+ DRF authentication/permission plumbing (identity onrequest.auth) and agenerate_keyaction for key rotation. - Adds
client_infoingestion (ClientInfoSerializer+get_client_info) and persistslast_seen_client_infoalongside heartbeat updates on authenticated worker requests. - Refactors worker endpoints (
/tasksGET→POST,/resultwrapped payload support) and adds a minimal processing service registration flow plus UI fields/actions to surface prefixes and worker metadata.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| ui/src/pages/project/processing-services/processing-services-columns.tsx | Import formatting cleanup to support new action exports. |
| ui/src/pages/project/processing-services/processing-services-actions.tsx | Adds GenerateAPIKey UI action component alongside existing populate action. |
| ui/src/pages/project/entities/details-form/processing-service-details-form.tsx | Makes endpoint_url optional in the form copy (pull-mode compatibility). |
| ui/src/pages/processing-service-details/processing-service-details-dialog.tsx | Displays API key prefix/mode and last worker metadata; wires in Generate API Key UI. |
| ui/src/data-services/models/processing-service.ts | Extends ProcessingService model with API key prefix + last seen client info accessors; updates async status display logic. |
| ui/src/data-services/hooks/processing-services/useGenerateAPIKey.ts | New React Query mutation hook for generate_key endpoint. |
| requirements/base.txt | Adds djangorestframework-api-key dependency. |
| processing_services/minimal/start.sh | New startup wrapper to run server and optionally self-register pipelines using API key auth. |
| processing_services/minimal/register.py | New registration client that posts pipelines + client_info with Authorization: Api-Key .... |
| processing_services/minimal/Dockerfile | Switches container command to the new startup script. |
| docs/superpowers/plans/2026-03-27-processing-service-api-keys.md | Adds a design/plan doc for the feature and endpoint refactor. |
| docs/claude/prompts/NEXT_SESSION_PROMPT.md | Adds a follow-up prompt doc for continuing frontend/cleanup work. |
| config/settings/base.py | Registers rest_framework_api_key app and adds the PS API key auth backend to DRF settings. |
| ami/ml/views.py | Adds generate_key action; enables dual-auth pipeline registration; prefetches API keys for PS endpoints. |
| ami/ml/tests.py | Adds backend tests for PS auth, key lifecycle, serializers, client_info, and an end-to-end flow. |
| ami/ml/serializers_client_info.py | New serializer/helper to validate/assemble client_info (hostname/software/version/platform/etc). |
| ami/ml/serializers.py | Exposes api_key_prefix and last_seen_client_info; adds client_info and optional name to pipeline registration serializer. |
| ami/ml/models/processing_service.py | Adds last_seen_client_info, persists it in mark_seen(), and adds optional API key generation for default PS creation. |
| ami/ml/models/api_key.py | New ProcessingServiceAPIKey model based on AbstractAPIKey. |
| ami/ml/models/init.py | Exports the new API key model. |
| ami/ml/migrations/0028_add_api_key_fields.py | Adds initial API-key-related fields + last_seen_client_info to ProcessingService (later superseded). |
| ami/ml/migrations/0029_processingserviceapikey.py | Adds the new API key model table. |
| ami/ml/migrations/0030_remove_custom_api_key_fields.py | Removes the initial custom API key fields from ProcessingService (migrating to model-based keys). |
| ami/ml/auth.py | New DRF authentication backend for Authorization: Api-Key ... to set PS identity on request.auth. |
| ami/jobs/views.py | Refactors /tasks to POST with serializer; supports wrapped /result; updates per-PS heartbeat and permissions. |
| ami/jobs/tests/test_jobs.py | Updates job tests to use POST /tasks with request body. |
| ami/jobs/schemas.py | Adds TasksRequestSerializer (batch + optional client_info). |
| ami/base/permissions.py | Adds HasProcessingServiceAPIKey permission for project-scoped PS access and composition with ObjectPermission. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
docs/superpowers/plans/2026-03-27-processing-service-api-keys.md
Outdated
Show resolved
Hide resolved
requirements/base.txt
Outdated
| django-redis==5.3.0 # https://github.com/jazzband/django-redis | ||
| # Django REST Framework | ||
| djangorestframework==3.14.0 # https://github.com/encode/django-rest-framework | ||
| djangorestframework-api-key==3.* # https://github.com/florimondmanca/djangorestframework-api-key |
There was a problem hiding this comment.
djangorestframework-api-key==3.* is the only wildcard pin in this requirements file; it makes builds less reproducible because it can resolve to different patch versions over time. Prefer pinning to an explicit version (or using a compatible-range specifier) to keep dependency resolution deterministic.
| djangorestframework-api-key==3.* # https://github.com/florimondmanca/djangorestframework-api-key | |
| djangorestframework-api-key==3.0.0 # https://github.com/florimondmanca/djangorestframework-api-key |
There was a problem hiding this comment.
Claude says: Fair point. Will pin to an exact version in a follow-up.
ami/ml/models/processing_service.py
Outdated
| and logs the plaintext key. This is used for docker compose setups where the ml_backend | ||
| needs an API key to self-register. |
There was a problem hiding this comment.
The docstring says generate_api_key=True "logs the plaintext key", but the implementation only logs the prefix (and doesn't log plaintext_key). Either adjust the docstring to match the behavior or (if intended) log/return the plaintext key through a safer channel.
| and logs the plaintext key. This is used for docker compose setups where the ml_backend | |
| needs an API key to self-register. | |
| and logs only the key prefix (not the plaintext key). This is used for docker compose | |
| setups where the ml_backend needs an API key to self-register. |
There was a problem hiding this comment.
Claude says: Good catch — the docstring should say "logs the prefix" not "logs the plaintext key". Will fix.
ui/src/pages/project/processing-services/processing-services-actions.tsx
Outdated
Show resolved
Hide resolved
ui/src/pages/processing-service-details/processing-service-details-dialog.tsx
Outdated
Show resolved
Hide resolved
| @@ -6,4 +6,4 @@ COPY . /app | |||
|
|
|||
| RUN pip install -r ./requirements.txt | |||
There was a problem hiding this comment.
The Dockerfile runs /app/start.sh directly, but there's no guarantee the script has the executable bit set in the image. This can cause permission denied at container start. Consider either RUN chmod +x /app/start.sh or using an explicit shell invocation (e.g., CMD ["bash", "/app/start.sh"]).
| RUN pip install -r ./requirements.txt | |
| RUN pip install -r ./requirements.txt | |
| RUN chmod +x /app/start.sh |
There was a problem hiding this comment.
Claude says: Good point. The start.sh is already set executable in the repo (git ls-files -s shows 755), and COPY preserves permissions. But adding CMD ["bash", "/app/start.sh"] would be more defensive — will update.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
ami/ml/migrations/0028_api_key_and_client_info.py (1)
29-34:⚠️ Potential issue | 🟠 Major
namefield default should not beNoneon a non-nullCharField.At Line 31,
default=NoneonCharField(withoutnull=True) can cause invalid stringified values and inconsistent data expectations. Preferdefault=""withblank=True, or allow true nulls withnull=True, blank=Trueand no default.In Django 4.2, what happens when a CharField has default=None but null is not set to True? Also, how is the `name` field defined in `rest_framework_api_key.models.AbstractAPIKey`?Suggested migration field adjustment
( "name", models.CharField( - default=None, + default="", + blank=True, help_text="A free-form name for the API key. Need not be unique. 50 characters max.", max_length=50, ), ),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/migrations/0028_api_key_and_client_info.py` around lines 29 - 34, The migration sets default=None on the CharField "name" which is non-nullable and will produce invalid string defaults; update the field definition in this migration to either use a non-null empty-string default (e.g., default="" and blank=True) or make it truly nullable (null=True, blank=True and remove default). Locate the "name" field in this migration (the CharField block with max_length=50) and change the default/null/blank settings to match how rest_framework_api_key.models.AbstractAPIKey defines name (use an empty-string default + blank=True to mirror a non-nullable string field, or explicitly allow nulls if you intend null values).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@ami/ml/migrations/0028_api_key_and_client_info.py`:
- Around line 29-34: The migration sets default=None on the CharField "name"
which is non-nullable and will produce invalid string defaults; update the field
definition in this migration to either use a non-null empty-string default
(e.g., default="" and blank=True) or make it truly nullable (null=True,
blank=True and remove default). Locate the "name" field in this migration (the
CharField block with max_length=50) and change the default/null/blank settings
to match how rest_framework_api_key.models.AbstractAPIKey defines name (use an
empty-string default + blank=True to mirror a non-nullable string field, or
explicitly allow nulls if you intend null values).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 41ea1f27-22f4-46a9-ae20-a411806c6c44
📒 Files selected for processing (1)
ami/ml/migrations/0028_api_key_and_client_info.py
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ami/jobs/views.py`:
- Around line 121-130: _update_processing_service_heartbeat always calls
get_client_info(request) which assumes request.data is a mapping; legacy
bare-list payloads sent to result() make request.data a list and cause
AttributeError. Fix by guarding/normalizing before calling get_client_info:
detect non-mapping request.data (e.g., isinstance(request.data, dict)) and treat
it as {} or construct a safe client_info dict, then set ps.last_seen_client_info
and call ps.mark_seen(live=True); update get_client_info usage in
_update_processing_service_heartbeat to accept the normalized client info and
add a regression test for bare-list /result uploads to ensure no crash.
- Around line 301-314: The code currently accepts wrapped and bare payloads but
doesn't validate that the "results" envelope is a list of mappings, causing
iteration over strings/keys and 500s at PipelineTaskResult(**item); update the
view to validate the envelope before iterating: use ResultRequestSerializer (or
explicit checks) to validate request.data and ensure raw_results is a list,
returning a 400 for invalid shapes; then iterate raw_results and construct
PipelineTaskResult instances (or validate each item with a serializer) to
populate validated_results. Ensure you reference request.data, raw_results,
ResultRequestSerializer, and PipelineTaskResult when adding the validation so
malformed input triggers a 400 rather than an internal error.
In `@ami/ml/models/processing_service.py`:
- Around line 348-355: The current branch where generate_api_key is true
discards the plaintext returned by ProcessingServiceAPIKey.objects.create_key
(called in the block guarded by generate_api_key and
service.api_keys.filter(revoked=False).exists()), so the bootstrap worker can
never authenticate; fix by capturing and persisting or returning the
plaintext_key instead of throwing it away (e.g., return plaintext_key from the
surrounding function or save it to the bootstrap/secret store) or alternatively
skip creating the key here and let the bootstrap flow create and retain the
plaintext; update the code paths that call this branch to consume the
plaintext_key when created and remove the logger.info that only logs
api_key_obj.prefix if you choose to persist/return the plaintext.
In `@ami/ml/views.py`:
- Around line 224-231: Wrap the revoke-and-create sequence in a database
transaction and acquire a row lock on the processing service to make rotation
atomic: start a transaction (e.g., transaction.atomic()), re-load/lock the
service row with select_for_update() (or refresh_from_db with select_for_update)
on the ProcessingService instance before calling
instance.api_keys.filter(revoked=False).update(revoked=True), then call
ProcessingServiceAPIKey.objects.create_key(...) while still inside the
transaction so that revocation and new key creation occur as one atomic, locked
operation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 384bbbcc-2463-4955-80de-79559d227307
📒 Files selected for processing (9)
ami/jobs/views.pyami/ml/admin.pyami/ml/api_key.pyami/ml/models/__init__.pyami/ml/models/processing_service.pyami/ml/tests.pyami/ml/views.pyconfig/settings/base.pyui/src/pages/project/processing-services/processing-services-actions.tsx
✅ Files skipped from review due to trivial changes (1)
- ami/ml/models/init.py
🚧 Files skipped from review as they are similar to previous changes (2)
- ui/src/pages/project/processing-services/processing-services-actions.tsx
- config/settings/base.py
ami/ml/models/processing_service.py
Outdated
| if generate_api_key and not service.api_keys.filter(revoked=False).exists(): | ||
| from ami.ml.api_key import ProcessingServiceAPIKey | ||
|
|
||
| api_key_obj, plaintext_key = ProcessingServiceAPIKey.objects.create_key( | ||
| name=f"{name} key", | ||
| processing_service=service, | ||
| ) | ||
| logger.info("Generated API key for %s (prefix: %s)", name, api_key_obj.prefix) |
There was a problem hiding this comment.
Don't discard the bootstrap key.
create_key() only returns the plaintext once. This branch throws it away and logs just the prefix, so generate_api_key=True creates a key that the worker can never authenticate with. Return/store the plaintext for the bootstrap flow, or skip generating it here.
🧰 Tools
🪛 Ruff (0.15.7)
[warning] 351-351: Unpacked variable plaintext_key is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/ml/models/processing_service.py` around lines 348 - 355, The current
branch where generate_api_key is true discards the plaintext returned by
ProcessingServiceAPIKey.objects.create_key (called in the block guarded by
generate_api_key and service.api_keys.filter(revoked=False).exists()), so the
bootstrap worker can never authenticate; fix by capturing and persisting or
returning the plaintext_key instead of throwing it away (e.g., return
plaintext_key from the surrounding function or save it to the bootstrap/secret
store) or alternatively skip creating the key here and let the bootstrap flow
create and retain the plaintext; update the code paths that call this branch to
consume the plaintext_key when created and remove the logger.info that only logs
api_key_obj.prefix if you choose to persist/return the plaintext.
There was a problem hiding this comment.
Claude says: This is the get_or_create_default_processing_service() bootstrap path. The key is intentionally not returned — it's meant for development/compose setups where the key is set via environment variables. The admin action (a1745cb) and API endpoint are the proper ways to generate keys. Will fix the docstring to clarify this.
There was a problem hiding this comment.
Rate Limit Exceeded
@mihow have exceeded the limit for the number of chat messages per hour. Please wait 1 minutes and 18 seconds before sending another message.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (3)
ui/src/pages/processing-service-details/processing-service-details-dialog.tsx (1)
97-137:⚠️ Potential issue | 🟡 MinorLocalization gap remains in the new dialog section labels and values.
This is the same unresolved issue previously raised for this block (
"Authentication","API Key Prefix","Mode","Last Known Worker", etc.).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/pages/processing-service-details/processing-service-details-dialog.tsx` around lines 97 - 137, The hard-coded UI strings in the ProcessingServiceDetailsDialog need to be localized: replace the literal section titles and input labels ("Authentication", "API Key Prefix", "Mode", "Last Known Worker", "Hostname", "Software", "Platform", "Remote Address") and literal display values ("No key generated", "Pull (async)", "Push (sync)") with calls to the app's i18n/localize function (e.g., t or useTranslation) where FormSection and InputValue are rendered; also ensure the software+version concat in the Software InputValue uses localized formatting (e.g., t('softwareVersion', { software, version })) and that any fallback values are localized too. Target the JSX around FormSection, GenerateAPIKey, and the InputValue props in processing-service-details-dialog.tsx when making the changes.ami/ml/views.py (1)
221-232:⚠️ Potential issue | 🟠 MajorAdd row lock to prevent concurrent key generation.
The transaction wrapping is good, but without
select_for_update(), concurrent requests can still race: both read the same keys, both revoke them, and both create new ones—leaving multiple active keys.🔐 Proposed fix with select_for_update
`@action`(detail=True, methods=["post"], url_path="generate_key") def generate_key(self, request: Request, pk=None) -> Response: instance = self.get_object() with transaction.atomic(): + # Lock the row to prevent concurrent key generation + instance = ProcessingService.objects.select_for_update().get(pk=instance.pk) + # Revoke existing keys instance.api_keys.filter(revoked=False).update(revoked=True) # Create new key via library api_key_obj, plaintext_key = ProcessingServiceAPIKey.objects.create_key( name=f"{instance.name} key", processing_service=instance, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/views.py` around lines 221 - 232, The generate_key view currently revokes and creates API keys inside transaction.atomic() but doesn’t lock the processing service row, so concurrent requests can race; modify the code to acquire a row lock before revoking by reloading the instance with select_for_update() (e.g., replace instance = self.get_object() or add a query like self.get_queryset().select_for_update().get(pk=instance.pk)) so that instance.api_keys.filter(revoked=False).update(...) and ProcessingServiceAPIKey.objects.create_key(...) run under the row lock, preserving atomicity and preventing multiple active keys.processing_services/minimal/register.py (1)
103-105:⚠️ Potential issue | 🟡 MinorGuard
get_own_pipeline_configs()with explicit error handling.Line 104 is still unguarded; failures bubble up as a raw traceback. Log context and fail cleanly so startup diagnostics are clear.
🔧 Proposed fix
# Fetch our own pipeline configs - pipelines = get_own_pipeline_configs() - client_info = get_client_info() + try: + pipelines = get_own_pipeline_configs() + client_info = get_client_info() + except Exception as e: + logger.error(f"get_own_pipeline_configs() failed: {e}") + sys.exit(1)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@processing_services/minimal/register.py` around lines 103 - 105, Wrap the call to get_own_pipeline_configs() in explicit error handling: catch exceptions around the get_own_pipeline_configs() call (the assignment to pipelines), log a clear contextual error message including the exception details (using the module logger or existing process logger), and then fail cleanly (raise a controlled exception or exit with a non-zero status) so startup diagnostics are readable; ensure this does not swallow exceptions from the subsequent get_client_info() call.
🧹 Nitpick comments (2)
ui/src/pages/project/processing-services/processing-services-actions.tsx (1)
60-87: Provide a regeneration path after a key is shown.Once
apiKeyexists, the component permanently switches to the “shown once” panel and no longer exposes the generation action in this mounted view.Proposed fix
<div className="flex items-center gap-2"> @@ <Button onClick={handleCopy} size="small" variant="outline"> {copied ? 'Copied' : 'Copy'} </Button> + <Button + onClick={() => generateAPIKey(processingService.id)} + size="small" + variant="outline" + disabled={isLoading} + > + Regenerate + </Button> </div>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/pages/project/processing-services/processing-services-actions.tsx` around lines 60 - 87, The current UI shows the generated apiKey but provides no way to regenerate it; update the panel rendered when apiKey is present to include a "Regenerate" action (e.g., a Button next to the Copy/visibility controls) that invokes the existing key-generation/regeneration handler (or add a regenerateKey function if none exists), and wire it to update the component state so apiKey is replaced with the new key, reset copied to false and visible to false after regeneration, and ensure any confirmation/async error handling runs before replacing state; reference the apiKey, handleCopy, visible, setVisible, copied and Button symbols when implementing this.ami/ml/views.py (1)
322-326: Consider moving the import to the top of the file.The import of
get_client_infoinside the method works but is unconventional. If there's no circular import issue, moving it to the top-level imports improves readability and avoids repeated import overhead on each request.♻️ Suggested change
Add to the top-level imports:
from ami.ml.serializers_client_info import get_client_infoThen simplify the method:
- # Update heartbeat and client info - from ami.ml.serializers_client_info import get_client_info - - processing_service.last_seen_client_info = get_client_info(request) + processing_service.last_seen_client_info = get_client_info(request) processing_service.mark_seen(live=True)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/views.py` around lines 322 - 326, The inline import of get_client_info inside the method should be moved to the module top-level to improve readability and avoid repeated import overhead: add "from ami.ml.serializers_client_info import get_client_info" to the top imports of ami/ml/views.py and remove the local import before setting processing_service.last_seen_client_info; if a circular import prevents this, add a brief comment explaining why the local import remains and keep the current local import only in that method (refer to get_client_info and processing_service.last_seen_client_info/mark_seen calls to locate the usage).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@processing_services/minimal/register.py`:
- Around line 91-95: The loop using MAX_RETRIES treats any HTTP response as
success; modify the try block around requests.get("http://localhost:2000/livez",
timeout=2) to validate the response status (e.g., call
response.raise_for_status() or check response.status_code == 200) and only break
the loop on a successful 2xx response, and expand the except clause to also
catch requests.HTTPError (or use requests.RequestException) so 4xx/5xx responses
are retried instead of considered ready; update the handling around this
requests.get call in register.py accordingly.
In
`@ui/src/pages/processing-service-details/processing-service-details-dialog.tsx`:
- Around line 119-124: The Software display currently only shows version when
software exists; update the value computation in
processing-service-details-dialog (the JSX using
processingService.lastSeenClientInfo) to show both when available and fall back
to whichever exists: if both software and version present use `${software}
${version}`, else use software || version || an empty string so version-only
payloads are shown; adjust the expression where value is set
(processingService.lastSeenClientInfo.software/version) accordingly.
In `@ui/src/pages/project/processing-services/processing-services-actions.tsx`:
- Around line 63-65: Replace hardcoded user-facing strings in the
processing-services-actions component (e.g. the text node "API Key (shown once,
copy it now):" inside the <p className="text-sm font-medium"> and the other
newly added strings noted in the review) with calls to translate(...) and add
corresponding i18n keys; import the existing translate utility used elsewhere in
the app, wrap each visible string in translate('your.key.here') (or
translate('namespace.key', { defaultValue: 'API Key (shown once, copy it now):'
}) as per project convention), and update the translation resource files with
those keys so the UI remains localizable.
- Around line 70-80: The visibility toggle Button (using visible, setVisible and
rendering Eye/EyeOff) is icon-only and needs an accessible name; update the
Button props to include an aria-label that changes with state (e.g.,
aria-label={visible ? "Hide details" : "Show details"}) and include an
appropriate ARIA state such as aria-pressed={visible} to indicate toggle state
to assistive tech; make this change on the Button that wraps Eye/EyeOff so
screen readers get a clear, stateful label.
---
Duplicate comments:
In `@ami/ml/views.py`:
- Around line 221-232: The generate_key view currently revokes and creates API
keys inside transaction.atomic() but doesn’t lock the processing service row, so
concurrent requests can race; modify the code to acquire a row lock before
revoking by reloading the instance with select_for_update() (e.g., replace
instance = self.get_object() or add a query like
self.get_queryset().select_for_update().get(pk=instance.pk)) so that
instance.api_keys.filter(revoked=False).update(...) and
ProcessingServiceAPIKey.objects.create_key(...) run under the row lock,
preserving atomicity and preventing multiple active keys.
In `@processing_services/minimal/register.py`:
- Around line 103-105: Wrap the call to get_own_pipeline_configs() in explicit
error handling: catch exceptions around the get_own_pipeline_configs() call (the
assignment to pipelines), log a clear contextual error message including the
exception details (using the module logger or existing process logger), and then
fail cleanly (raise a controlled exception or exit with a non-zero status) so
startup diagnostics are readable; ensure this does not swallow exceptions from
the subsequent get_client_info() call.
In
`@ui/src/pages/processing-service-details/processing-service-details-dialog.tsx`:
- Around line 97-137: The hard-coded UI strings in the
ProcessingServiceDetailsDialog need to be localized: replace the literal section
titles and input labels ("Authentication", "API Key Prefix", "Mode", "Last Known
Worker", "Hostname", "Software", "Platform", "Remote Address") and literal
display values ("No key generated", "Pull (async)", "Push (sync)") with calls to
the app's i18n/localize function (e.g., t or useTranslation) where FormSection
and InputValue are rendered; also ensure the software+version concat in the
Software InputValue uses localized formatting (e.g., t('softwareVersion', {
software, version })) and that any fallback values are localized too. Target the
JSX around FormSection, GenerateAPIKey, and the InputValue props in
processing-service-details-dialog.tsx when making the changes.
---
Nitpick comments:
In `@ami/ml/views.py`:
- Around line 322-326: The inline import of get_client_info inside the method
should be moved to the module top-level to improve readability and avoid
repeated import overhead: add "from ami.ml.serializers_client_info import
get_client_info" to the top imports of ami/ml/views.py and remove the local
import before setting processing_service.last_seen_client_info; if a circular
import prevents this, add a brief comment explaining why the local import
remains and keep the current local import only in that method (refer to
get_client_info and processing_service.last_seen_client_info/mark_seen calls to
locate the usage).
In `@ui/src/pages/project/processing-services/processing-services-actions.tsx`:
- Around line 60-87: The current UI shows the generated apiKey but provides no
way to regenerate it; update the panel rendered when apiKey is present to
include a "Regenerate" action (e.g., a Button next to the Copy/visibility
controls) that invokes the existing key-generation/regeneration handler (or add
a regenerateKey function if none exists), and wire it to update the component
state so apiKey is replaced with the new key, reset copied to false and visible
to false after regeneration, and ensure any confirmation/async error handling
runs before replacing state; reference the apiKey, handleCopy, visible,
setVisible, copied and Button symbols when implementing this.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 485db8dd-9b81-4e01-84f3-c2bbd667fdbb
📒 Files selected for processing (10)
ami/ml/models/processing_service.pyami/ml/serializers_client_info.pyami/ml/views.pyprocessing_services/minimal/Dockerfileprocessing_services/minimal/register.pyprocessing_services/minimal/start.shrequirements/base.txtui/src/data-services/models/processing-service.tsui/src/pages/processing-service-details/processing-service-details-dialog.tsxui/src/pages/project/processing-services/processing-services-actions.tsx
✅ Files skipped from review due to trivial changes (4)
- processing_services/minimal/Dockerfile
- requirements/base.txt
- processing_services/minimal/start.sh
- ami/ml/serializers_client_info.py
🚧 Files skipped from review as they are similar to previous changes (1)
- ui/src/data-services/models/processing-service.ts
ui/src/pages/processing-service-details/processing-service-details-dialog.tsx
Outdated
Show resolved
Hide resolved
ui/src/pages/project/processing-services/processing-services-actions.tsx
Outdated
Show resolved
Hide resolved
ui/src/pages/project/processing-services/processing-services-actions.tsx
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
ui/src/pages/processing-service-details/processing-service-details-dialog.tsx (1)
97-106:⚠️ Potential issue | 🟡 MinorNew details-section labels and values should be localized.
At Line 97–137, labels/titles/fallback values (e.g., “Authentication”, “API Key Prefix”, “Mode”, “Last Known Worker”, “No key generated”, “Pull (async)”) are hardcoded. Please move these to translation keys for consistency with the rest of the dialog.
Also applies to: 111-137
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/pages/processing-service-details/processing-service-details-dialog.tsx` around lines 97 - 106, The hardcoded UI strings in the ProcessingServiceDetailsDialog must be replaced with translation keys; update the FormSection title and each InputValue label and fallback/value text (e.g., "Authentication", "API Key Prefix", "Mode", "Last Known Worker", "No key generated", "Pull (async)"/"Push (sync)") to use the i18n translation function used elsewhere (e.g., t('...')) so they pull from localization files; modify the JSX in ProcessingServiceDetailsDialog where FormSection and InputValue are rendered (and any computed fallback/text for processingService.apiKeyPrefix, processingService.isAsync, processingService.lastKnownWorker, etc.) to call the translation keys and add new keys to the locale resource files.ui/src/pages/project/processing-services/processing-services-actions.tsx (2)
72-77:⚠️ Potential issue | 🟡 MinorExpose pressed state on the visibility toggle.
At Line 72–77, this is a toggle button; add
aria-pressed={visible}so assistive tech gets the current state, not just the action label.Suggested patch
<Button aria-label={visible ? 'Hide API key' : 'Show API key'} + aria-pressed={visible} onClick={() => setVisible((v) => !v)} size="small" variant="ghost" >🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/pages/project/processing-services/processing-services-actions.tsx` around lines 72 - 77, The toggle Button rendering in processing-services-actions.tsx currently only provides an action label; update the Button component (the one using visible and setVisible) to expose its pressed state by adding aria-pressed={visible} so assistive technologies receive the current on/off state; ensure the prop is placed alongside aria-label, onClick, size, and variant on that Button element.
65-67:⚠️ Potential issue | 🟡 MinorLocalize the newly added API-key text strings.
These user-facing literals are still hardcoded (e.g., “API Key (shown once, copy it now):”, “Copy/Copied”, tooltip messages, and “Generate/Regenerate API Key”). Please route them through
translate(...)for i18n consistency.Also applies to: 85-85, 97-100, 114-115
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/pages/project/processing-services/processing-services-actions.tsx` around lines 65 - 67, In the ProcessingServicesActions component (ui/src/pages/project/processing-services/processing-services-actions.tsx) replace hardcoded user-visible strings with translate(...) calls: wrap "API Key (shown once, copy it now):" in translate, localize the Copy/Copied button labels (used in the copy button state), the tooltip message(s) shown on hover, and the "Generate" / "Regenerate API Key" button text; ensure every literal in the JSX (including tooltip content and button labels referenced in the component functions or state) uses translate(...) so i18n keys are consistent.ami/ml/views.py (1)
223-231:⚠️ Potential issue | 🟠 MajorKey rotation is still not concurrency-safe without a row lock.
At Line 223–231, concurrent
generate_keycalls can still create multiple active keys. The prior concern remains: lock theProcessingServicerow inside the transaction before revoking/creating.Suggested patch
with transaction.atomic(): + instance = ProcessingService.objects.select_for_update().get(pk=instance.pk) # Revoke existing keys instance.api_keys.filter(revoked=False).update(revoked=True) # Create new key via library api_key_obj, plaintext_key = ProcessingServiceAPIKey.objects.create_key( name=f"{instance.name} key", processing_service=instance, )#!/bin/bash # Verify generate_key flow and absence/presence of row lock rg -n -C4 'def generate_key|transaction\.atomic|select_for_update|create_key|api_keys\.filter\(revoked=False\)' ami/ml/views.py🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/ml/views.py` around lines 223 - 231, The generate_key path is not concurrency-safe because the ProcessingService row isn't locked before revoking/creating keys; inside the existing transaction.atomic block, re-fetch and lock the ProcessingService row using select_for_update (e.g., query the ProcessingService by PK and assign back to instance) before calling instance.api_keys.filter(revoked=False).update(...) and before calling ProcessingServiceAPIKey.objects.create_key, so concurrent requests serialize on the row and only one active key is created.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ui/src/data-services/hooks/processing-services/useProcessingServiceDetails.ts`:
- Around line 21-25: The URL construction in useProcessingServiceDetails
interpolates projectId raw into params, which can break if projectId contains
reserved URL characters; update the params creation to encode projectId (e.g.,
use encodeURIComponent(projectId) or URLSearchParams) so the query string is
safe, and ensure the constructed url passed to useAuthorizedQuery uses the
encoded params variable (refer to params, projectId, and useAuthorizedQuery in
this file).
---
Duplicate comments:
In `@ami/ml/views.py`:
- Around line 223-231: The generate_key path is not concurrency-safe because the
ProcessingService row isn't locked before revoking/creating keys; inside the
existing transaction.atomic block, re-fetch and lock the ProcessingService row
using select_for_update (e.g., query the ProcessingService by PK and assign back
to instance) before calling instance.api_keys.filter(revoked=False).update(...)
and before calling ProcessingServiceAPIKey.objects.create_key, so concurrent
requests serialize on the row and only one active key is created.
In
`@ui/src/pages/processing-service-details/processing-service-details-dialog.tsx`:
- Around line 97-106: The hardcoded UI strings in the
ProcessingServiceDetailsDialog must be replaced with translation keys; update
the FormSection title and each InputValue label and fallback/value text (e.g.,
"Authentication", "API Key Prefix", "Mode", "Last Known Worker", "No key
generated", "Pull (async)"/"Push (sync)") to use the i18n translation function
used elsewhere (e.g., t('...')) so they pull from localization files; modify the
JSX in ProcessingServiceDetailsDialog where FormSection and InputValue are
rendered (and any computed fallback/text for processingService.apiKeyPrefix,
processingService.isAsync, processingService.lastKnownWorker, etc.) to call the
translation keys and add new keys to the locale resource files.
In `@ui/src/pages/project/processing-services/processing-services-actions.tsx`:
- Around line 72-77: The toggle Button rendering in
processing-services-actions.tsx currently only provides an action label; update
the Button component (the one using visible and setVisible) to expose its
pressed state by adding aria-pressed={visible} so assistive technologies receive
the current on/off state; ensure the prop is placed alongside aria-label,
onClick, size, and variant on that Button element.
- Around line 65-67: In the ProcessingServicesActions component
(ui/src/pages/project/processing-services/processing-services-actions.tsx)
replace hardcoded user-visible strings with translate(...) calls: wrap "API Key
(shown once, copy it now):" in translate, localize the Copy/Copied button labels
(used in the copy button state), the tooltip message(s) shown on hover, and the
"Generate" / "Regenerate API Key" button text; ensure every literal in the JSX
(including tooltip content and button labels referenced in the component
functions or state) uses translate(...) so i18n keys are consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cab41e63-64e1-4935-8b6d-66194208571c
📒 Files selected for processing (6)
ami/ml/views.pyprocessing_services/minimal/register.pyui/src/data-services/hooks/processing-services/useGenerateAPIKey.tsui/src/data-services/hooks/processing-services/useProcessingServiceDetails.tsui/src/pages/processing-service-details/processing-service-details-dialog.tsxui/src/pages/project/processing-services/processing-services-actions.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- processing_services/minimal/register.py
- ui/src/data-services/hooks/processing-services/useGenerateAPIKey.ts
ui/src/data-services/hooks/processing-services/useProcessingServiceDetails.ts
Outdated
Show resolved
Hide resolved
mihow
left a comment
There was a problem hiding this comment.
I suggest we move the UI changes out of this. Also possibly the get->post change for the task retrieval to it's own, which would be merged quickly.
| return (AnonymousUser(), api_key.processing_service) | ||
|
|
||
| def authenticate_header(self, request): | ||
| return "Api-Key" |
There was a problem hiding this comment.
is this class and any methods already part of the DRF api_key package?
There was a problem hiding this comment.
Claude says: BaseHasAPIKey from the library handles key validation (is_valid() check), but our HasProcessingServiceAPIKey serves a different purpose — it checks authorization (project membership) after the auth backend has already validated the key. The auth backend (ProcessingServiceAPIKeyAuthentication) handles authentication and puts the PS on request.auth. The permission class then checks that the PS belongs to the right project. So they complement each other rather than overlap — we can't inherit from BaseHasAPIKey because we need the isinstance(request.auth, ProcessingService) check rather than re-validating the key.
There was a problem hiding this comment.
Claude says: The library provides BaseHasAPIKey (permission class) and KeyParser (header extraction), but not a DRF authentication backend. Our ProcessingServiceAPIKeyAuthentication serves a different role — it's an auth backend that sets request.auth = ProcessingService, which DRF's auth pipeline requires. The library's BaseHasAPIKey is a permission class that checks key validity but doesn't identify the caller.
We use KeyParser from the library (for header parsing) and its get_from_key() manager method (for hashed lookup). The auth backend and permission class are ours because they need to: (1) place the PS on request.auth, and (2) check project membership — both are app-specific concerns.
0b2cb95 to
73e32a9
Compare
Merge plan: #1197 → #1194PR #1197 (endpoint contracts) should merge first. Then #1194 rebases on top. Conflict resolution guide
Heartbeat simplificationBefore (current main + #1194): # Broad: updates ALL async services on the pipeline within this project
_mark_pipeline_pull_services_seen(job)
# Narrow: updates the specific PS identified by API key (added alongside the above)
self._update_processing_service_heartbeat(request)After (post-rebase): # API key auth: update the specific PS directly
if isinstance(request.auth, ProcessingService):
request.auth.update_heartbeat(get_client_info(request))The broad |
Updated merge plan (coordinated deploy, no transition period)Conflict resolution guide (post-rebase on #1197)
|
9c083b7 to
37f6335
Compare
37f6335 to
888b7df
Compare
Add djangorestframework-api-key based authentication for processing services. Each ProcessingService can have API keys managed via Django admin or API. Authenticated requests identify the specific service, enabling per-PS heartbeat tracking and client_info collection. Changes: - ProcessingServiceAPIKey model (AbstractAPIKey) with FK to ProcessingService - DRF auth backend (ProcessingServiceAPIKeyAuthentication) + permission class - HasProcessingServiceAPIKey permission on /tasks and /result endpoints - Per-PS heartbeat with client_info (ip, user_agent, hostname, software, etc.) - generate_key API action and Django admin action - Dual-auth pipeline registration (API key or legacy token) - Serializers refactored into package with client_info extraction - Minimal processing service self-registration example Jobs endpoint HTTP semantics unchanged (GET /tasks, bare list /result). The GET→POST refactor and request serializers are in PR #1197. Co-Authored-By: Claude <noreply@anthropic.com>
…rvice, add self-provisioning - Fix migration CharField default=None → default="" to avoid storing "None" string - Remove generate_api_key param from get_or_create_default_processing_service and mark function as deprecated in favor of self-registration flow - Expand minimal example register.py with self-provisioning mode: when no API key is set, logs in with user credentials, creates the processing service via REST API, and generates its own API key Co-Authored-By: Claude <noreply@anthropic.com>
…orts Co-Authored-By: Claude <noreply@anthropic.com>
ce6a602 to
18545d5
Compare
Replace user token auth (Authorization: Token) with API key auth (Authorization: Api-Key) for all Antenna API requests. Remove processing_service_name from registration (service is now identified by its API key). Add client_info metadata to pipeline registration. - Replace antenna_api_auth_token setting with antenna_api_key - Remove antenna_service_name setting (managed in Antenna now) - Update get_http_session to use Api-Key header - Add client_info (hostname, software, version, platform) to registration - Remove get_full_service_name helper (no longer needed) - Update all callers, tests, and benchmark Companion to RolnickLab/antenna#1194 Co-Authored-By: Claude <noreply@anthropic.com>
…y param
For detail views like /jobs/{pk}/tasks/, derive the project from the
object rather than requiring project_id as a query parameter. The
has_object_permission check already verifies the PS belongs to the
job's project.
Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Adds API key authentication and per-worker identity tracking for processing services (ML workers). Each ProcessingService gets API keys managed via Django admin or the REST API. Authenticated workers are identified per-request, enabling targeted heartbeat tracking and client metadata collection.
ProcessingServiceAPIKeymodel (backed by djangorestframework-api-key) with FK to ProcessingServiceHasProcessingServiceAPIKeypermission classmark_seen(client_info=...)consolidates heartbeat + identity into one methodProcessingServiceClientInfoPydantic schema with typed fields (hostname, software, version, platform, pod_name) + server-observed fields (ip, user_agent)generate_keyAPI action for key rotationprocessing_services/minimal/get_or_create_default_processing_servicedeprecated (async services should self-provision)Jobs endpoint HTTP semantics are unchanged (GET /tasks, bare list /result). The GET→POST refactor is in PR #1197.
What processing services need to change
Authentication
All requests to Antenna's worker-facing endpoints now require an API key:
Keys are generated in Django admin (Processing Services → select service → "Generate API key" action) or via
POST /api/v2/processing-services/{id}/generate_key/. The full key is shown only once at creation.Pipeline registration
Endpoint:
POST /api/v2/projects/{project_id}/pipelines/The
processing_service_namefield has been removed. The service is identified by its API key. The request body is now:{ "pipelines": [ ... ], "client_info": { "hostname": "cedar-node-01", "software": "ami-data-companion", "version": "2.1.0", "platform": "Linux x86_64" } }client_infois optional but recommended. Extra fields beyond the schema are allowed and stored.Task fetching and result submission
GET /jobs/{id}/tasks/andPOST /jobs/{id}/result/now accept API key auth alongside user token auth. When using an API key, the server records a per-service heartbeat with client metadata automatically.No changes to the request/response format of these endpoints in this PR.
Example
See
processing_services/minimal/register.pyfor a complete self-registration implementation, including a self-provisioning mode for local development.Changes needed in AMI Data Companion
The ADC worker currently authenticates with
Authorization: Token <token>and identifies itself viaprocessing_service_namequery params. After this PR merges, the ADC needs the following changes:1. Switch to API key auth for worker endpoints
Files:
trapdata/api/utils.py(session creation),trapdata/settings.pyAMI_ANTENNA_API_KEY(env var) alongside the existingAMI_ANTENNA_API_AUTH_TOKENAMI_ANTENNA_API_KEYis set, useAuthorization: Api-Key <key>header instead ofTokenprocessing_service_namequery params become unnecessary2. Remove
processing_service_namefrom all requestsFiles:
trapdata/antenna/client.py,trapdata/antenna/datasets.pyprocessing_service_namequery parameter from job listing (GET /jobs), task fetching (GET /jobs/{id}/tasks), and result submission (POST /jobs/{id}/result/)3. Update pipeline registration
File:
trapdata/antenna/registration.py,trapdata/antenna/schemas.pyprocessing_service_namefrom the registration POST body (AsyncPipelineRegistrationRequest)client_infodict to the registration body:{ "pipelines": [...], "client_info": { "hostname": socket.gethostname(), "software": "ami-data-companion", "version": __version__, "platform": platform.platform(), } }Api-Keyheader), not user token authprocessing_service_namefield inAsyncPipelineRegistrationRequestschema can be removed4. (Optional) Add self-provisioning for local dev
For docker compose setups, the ADC can self-provision (create its own ProcessingService + API key) using user credentials, similar to
processing_services/minimal/register.py. This removes the need forget_or_create_default_processing_serviceon the Antenna side.Migration path
TokenandApi-Keyauth on worker endpoints. ADC continues working with token auth unchanged.Api-Keyauth, removesprocessing_service_nameparams, addsclient_infoto registration.processing_service_namehandling from Antenna (already a no-op after this PR).Test plan