AI Models Based App Size Optimization#1263
Conversation
WalkthroughRemoves bundled ONNX files from the installer and implements a modular system: model registry, resumable SHA‑256‑verified downloads, hardware detection, lazy ONNX session initialization, backend APIs (including SSE progress), frontend onboarding/settings integration, and a PyInstaller spec that excludes .onnx artifacts. ChangesModular AI Model Management System
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/routes/folders.py (1)
235-245:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo model preflight synchronously before returning success.
enable_ai_tagging()returns success immediately after scheduling background work, but missing-model failures happen later and are only logged. This can report AI tagging as enabled when processing cannot actually start.As per coding guidelines, "Confirm that the code meets the project's requirements and objectives".Suggested fix
def enable_ai_tagging(request: UpdateAITaggingRequest, app_state=Depends(get_state)): """Enable AI tagging for multiple folders.""" try: if not request.folder_ids: raise ValueError("No folder IDs provided") + # Validate model prerequisites before mutating state/responding success. + ensure_ai_tagging_models() + updated_count = db_enable_ai_tagging_batch(request.folder_ids) executor: ProcessPoolExecutor = app_state.executor executor.submit(post_AI_tagging_enabled_sequence)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routes/folders.py` around lines 235 - 245, enable_ai_tagging currently updates DB and schedules post_AI_tagging_enabled_sequence in the background but doesn't verify models are available; perform the model preflight synchronously before returning success by calling the model-check/preflight routine (or invoking post_AI_tagging_enabled_sequence synchronously or a dedicated ensure_models_available() helper) and raise/return an error if the preflight fails, then only call db_enable_ai_tagging_batch and schedule the background executor (app_state.executor.submit) after the preflight passes; reference enable_ai_tagging, db_enable_ai_tagging_batch, post_AI_tagging_enabled_sequence, and app_state.executor when making the change.
🧹 Nitpick comments (17)
backend/app/utils/hardware_detect.py (1)
64-71: ⚡ Quick winAvoid duplicate GPU probing in one response path.
get_hardware_info()already computesgpu_names, butrecommended_tiertriggers anotherdetect_physical_gpu()call throughdetect_hardware_tier(). Reuse the same snapshot to avoid extra subprocess overhead and inconsistent results.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/utils/hardware_detect.py` around lines 64 - 71, get_hardware_info currently calls detect_physical_gpu() to populate gpu_names but then calls detect_hardware_tier() which re-probes GPUs; change the flow so detect_hardware_tier reuses the already-computed gpu_names instead of re-running detection: update detect_hardware_tier to accept an optional gpu_names/list parameter (or add a new helper detect_hardware_tier_from_list) and call it from get_hardware_info with the gpu_names variable to compute recommended_tier, avoiding the duplicate detect_physical_gpu() subprocess call and ensuring consistent results.backend/app/config/settings.py (1)
6-11: ⚡ Quick winKeep model export path resolution in one place.
This frozen-vs-dev path selection now exists here and again in
backend/app/models/model_registry.py. If those branches drift, model downloads and model loads will start targeting different directories. Please derive this from one shared helper/constant instead of duplicating the logic.As per coding guidelines, "Look for code duplication."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/config/settings.py` around lines 6 - 11, Consolidate the frozen-vs-dev model path logic into a single helper in settings by exposing a single computed symbol (e.g., MODEL_EXPORTS_PATH or a function get_model_exports_path()) that encapsulates the getattr(sys, "frozen", False) branch; then remove the duplicate branch in backend/app/models/model_registry.py and import and use that settings symbol instead so both model download and load paths reference the same source of truth (replace any local path computation in model_registry.py with MODEL_EXPORTS_PATH/get_model_exports_path()).backend/app/models/YOLO.py (4)
22-24: ⚡ Quick winMove
threadingimport to module top-level.
import threadingis repeated on every__init__call. Hoist it to the module imports (alongsideonnxruntime, etc.) for PEP 8 compliance and a tiny perf win.Proposed diff
import onnxruntime import time +import threading import cv2 import numpy as np @@ def __init__(self, path, conf_threshold=0.7, iou_threshold=0.5): self.model_path = path self.conf_threshold = conf_threshold self.iou_threshold = iou_threshold self._session = None - import threading self._lock = threading.Lock()As per coding guidelines: "Check for PEP 8 violations and Python best practices."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/YOLO.py` around lines 22 - 24, The class __init__ currently performs a local import ("import threading") and then uses threading.Lock() to set self._lock, causing repeated imports; move the import of threading to the module top-level (alongside existing imports like onnxruntime) and remove the in-method import so __init__ only does self._lock = threading.Lock(), updating YOLO.__init__ accordingly.
33-46: ⚡ Quick winHoist imports and tighten model-key matching.
Two issues here:
import osandfrom app.models.model_registry import MODEL_REGISTRYare insideget_session(); they should be at module top-level (PEP 8).spec["filename"] in self.model_pathdoes a substring match against the full path. If a registry filename ever becomes a substring of another (e.g., a futureyolo.onnxvsyolov8n.onnx), the wrong key wins. Compare againstos.path.basename(self.model_path)with equality instead.Proposed diff
- with self._lock: - if self._session is None: - import os - from app.models.model_registry import MODEL_REGISTRY - - if not os.path.exists(self.model_path): - model_key = None - for key, spec in MODEL_REGISTRY.items(): - if spec["filename"] in self.model_path: - model_key = key - break - model_name = model_key if model_key else os.path.basename(self.model_path) + with self._lock: + if self._session is None: + if not os.path.exists(self.model_path): + basename = os.path.basename(self.model_path) + model_key = next( + (k for k, spec in MODEL_REGISTRY.items() if spec["filename"] == basename), + None, + ) + model_name = model_key if model_key else basenameAs per coding guidelines: "Check for PEP 8 violations and Python best practices."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/YOLO.py` around lines 33 - 46, Move the imports "import os" and "from app.models.model_registry import MODEL_REGISTRY" out of get_session() to the module top-level, and update the model lookup in get_session() to compare os.path.basename(self.model_path) for equality against spec["filename"] (instead of using the substring check spec["filename"] in self.model_path) so MODEL_REGISTRY key selection (model_key) is exact; keep raising the same RuntimeError with model_name derived from the matched key or basename when not found.
84-93: 💤 Low valueLock-free read of
self._sessionin detail getters.
get_input_details()andget_output_details()readself._sessiondirectly. Today they're only called from insideget_session()while the lock is held andself._sessionis non-None, so this is fine. If anyone later calls these from outsideget_session()after aclose(), they'll hitAttributeError: 'NoneType'. Worth either (a) adding a docstring marking them as private/internal, or (b) routing them throughget_session()so they remain robust.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/YOLO.py` around lines 84 - 93, The getters get_input_details() and get_output_details() read self._session directly and can raise AttributeError if called after close(); change them to obtain the session via get_session() (or at minimum assert/check it's not None) before accessing inputs/outputs: call sess = self.get_session() (or if you prefer an explicit guard, raise a clear runtime error when self._session is None) and then use sess.get_inputs()/sess.get_outputs(); update references to self._session in get_input_details and get_output_details and add a short docstring noting they rely on an active session.
60-64: 💤 Low value
close()is inconsistent withFaceNet.close().
FaceNet.close()also resetsinput_tensor_name/output_tensor_nameunder the lock, but hereself.input_names/self.output_names/self.input_shapeare left dangling. Afterclose(), a subsequentget_session()will repopulate them viaget_input_details()/get_output_details(), so this isn't a correctness bug, but the asymmetry across the two model classes is worth aligning for consistency and easier reasoning.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/YOLO.py` around lines 60 - 64, The YOLO.close() method currently only clears self._session under the lock, leaving self.input_names, self.output_names, and self.input_shape set; update YOLO.close() to mirror FaceNet.close() by resetting self.input_names, self.output_names, and self.input_shape (inside the same with self._lock block) when clearing the session so the internal state is consistent and no dangling tensor metadata remains.frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx (4)
85-110: ⚡ Quick winStatus check has no abort/timeout; can wedge the spinner indefinitely.
If the backend at
${BACKEND_URL}/models/statusis unreachable, the UI sits on "Verifying AI models…" until the OS-level fetch eventually times out. Two recommendations:
- Use an
AbortControllerin the cleanup so an unmount/remount cancels the in-flight request.- Add a reasonable timeout (e.g., 5–10s) so the user is moved to the recommendation view even if the backend is slow.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines 85 - 110, The model-status check in useEffect (checkModelStatus) can hang indefinitely; wrap the fetch with an AbortController and a timeout so the request is cancelled on cleanup or after ~5–10s: create an AbortController, pass its signal to fetch in checkModelStatus, start a timeout that calls controller.abort() after the chosen duration, clear the timeout and call controller.abort() in the useEffect cleanup, and ensure setIsCheckingStatus(false) and dispatch(markCompleted(stepIndex)) still run appropriately when aborted or timed out.
144-156: ⚡ Quick winUntyped
await response.json()for/models/setup.
datais implicitlyanyhere, thendata.success/data.detail/data.task_idare accessed without a type. Define a small response interface infrontend/src/types/models.ts(e.g.,SetupResponse) and type the parsed JSON to keep the Action's contract explicit.Proposed diff
- const response = await fetch(`${BACKEND_URL}/models/setup`, { + const response = await fetch(`${BACKEND_URL}/models/setup`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ tier: selectedTier }), }); - - const data = await response.json(); + const data: SetupResponse = await response.json();As per coding guidelines: "Avoid 'any', use explicit types."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines 144 - 156, Add an explicit response type for the /models/setup call by creating a SetupResponse interface in frontend/src/types/models.ts (fields: success: boolean, detail?: string, task_id?: string) and import it into AIModelSetupStep.tsx; then type the parsed JSON as SetupResponse (e.g., const data: SetupResponse = await response.json()) so accesses to data.success / data.detail / data.task_id are statically typed and conform to the Action contract.
88-89: ⚖️ Poor tradeoffUntyped JSON parse for
/models/statusand/models/hardware.
await res.json()returnsanyand is then cast positionally viaconst data: ModelStatusResponse = await res.json(). The cast is silently unsafe — if the backend ever drifts from the contract, runtime access ondata.datawon't be caught at compile time. Consider adding a lightweight runtime guard (e.g., zod, or a hand-written type predicate) to validate the shape before consumption, especially sincegetInstalledModelTiersiterates blindly overdata.data.As per coding guidelines: "Avoid 'any', use explicit types."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines 88 - 89, The fetch call uses await res.json() and casts it to ModelStatusResponse (const data: ModelStatusResponse = await res.json()), which is unsafe; add a runtime shape check for the /models/status (and similarly /models/hardware) response before using data.data — e.g., define a lightweight validator (Zod schema or a hand-written type predicate) that asserts the top-level object has the expected fields (e.g., data: Array with required properties), parse/validate the parsed JSON (instead of blind casting), and if validation fails handle the error path (log/show an error and avoid iterating in getInstalledModelTiers), ensuring BACKEND_URL fetch handling checks res.ok and uses the validated value thereafter.
113-136: 💤 Low valueHardware-recommendation effect re-fires on every transition back to
recommendation.Because
viewModeis in the deps and the body branches onviewMode === 'recommendation', this effect re-fetches/models/hardwareevery time the user clicks "Try Again" from the error view. That's network-cheap, but it also overwritesselectedTiereven if the user had already adjusted the dropdown. Consider either (a) only fetching once on mount and storing the result, or (b) preserving the user's last-selected tier when re-fetching.Also, the cleanup
eventSourceRef.current?.close()is shared with the download flow but lives inside this effect's cleanup, which means it runs on everyviewModechange rather than only on unmount. Functionally fine (it just runs more often), but moving the SSE cleanup to its own unmount-only effect would make the lifecycle clearer.As per coding guidelines: "Check for proper React hook dependencies and prevent unnecessary re-renders."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines 113 - 136, The effect currently refetches on every viewMode change and always overwrites selectedTier and also closes eventSourceRef on every dep change; change it so the hardware fetch only runs once (e.g., move the fetch into a useEffect with empty deps or guard with a fetchedRef) and store the recommended tier in state via setRecommendedTier without unconditionally calling setSelectedTier if the user has already changed the dropdown (use a userChanged flag or only setSelectedTier when selectedTier is undefined/default). Also extract the SSE cleanup into its own unmount-only useEffect that returns () => { if (eventSourceRef.current) eventSourceRef.current.close(); } so eventSourceRef is closed only on unmount and not on every viewMode change; keep references to useEffect, viewMode, fetch(.../models/hardware), setRecommendedTier, setSelectedTier, and eventSourceRef to locate the changes.backend/app/routes/models.py (3)
5-5: 💤 Low valueUnused import
Optional.
Optionalis imported fromtypingbut never referenced.Proposed diff
-from typing import Dict, Optional +from typing import Dict🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routes/models.py` at line 5, The import list in models.py includes an unused symbol Optional from typing; remove Optional from the import statement so only used types remain (e.g., change "from typing import Dict, Optional" to "from typing import Dict") and run linters/tests to confirm no other references to Optional exist in functions or classes in this module.
49-68: 💤 Low valueSync handler does N stat calls on the event loop.
def get_model_status(noasync) runs on the threadpool — fine — but if you switch it toasync deflater, theos.path.existscalls would block the loop. Either keep it sync (current state is OK) or, if changing to async, dispatch the loop viaasyncio.to_thread. No change needed today; just be aware as the registry grows.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routes/models.py` around lines 49 - 68, The current get_model_status function performs blocking os.path.exists calls in a synchronous handler; if you ever convert get_model_status to async, avoid blocking the event loop by running filesystem checks in a thread using asyncio.to_thread (e.g. wrap get_model_path/key existence checks), or simply keep the function synchronous; update calls that reference MODEL_REGISTRY and get_model_path accordingly so all os.path.exists checks are dispatched via asyncio.to_thread when converting to async.
22-44: ⚖️ Poor tradeoff
download_tasksis shared mutable state without synchronization.
download_tasksis a module-global dict mutated from multiple coroutines (setup_models,start_download_model,event_generator'sfinally,_cleanup_stale_tasks). FastAPI under a single-worker event loop is OK because dict ops complete in one bytecode step, but if the deployment ever uses uvicorn with--workers > 1, each worker has its own dict and SSE progress for atask_idstarted on worker A is unfindable from worker B.If multi-worker deployment is in scope, you'll need a shared backing store (Redis, etc.). For now, document the single-worker assumption and assert it on startup.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routes/models.py` around lines 22 - 44, The module-global download_tasks dict (used by download_tasks, _cleanup_stale_tasks, setup_models, start_download_model, and event_generator) is unsafe for multi-worker deployments; add a clear runtime assertion and docstring that we require a single worker: on application startup check common env vars (e.g. UVICORN_WORKERS, WEB_CONCURRENCY) and/or a configured worker count and raise a RuntimeError if they indicate >1 (or log and exit), and update the module docstring to state the single-worker assumption and that a shared backing store (Redis) is required to support multiple workers in the future.backend/app/models/FaceNet.py (1)
12-53: ⚖️ Poor tradeoffDuplicated lazy-session-with-registry-lookup logic across
FaceNetandYOLO.The "fast-path read → lock → existence check → registry filename lookup → InferenceSession build" sequence here is essentially the same code as
YOLO.get_session(). Consider extracting a small mixin/helper (e.g.,_build_onnx_session(model_path)that raises the friendlyRuntimeErrorand returns the session) so both classes share one implementation. Same comment about substring matching (spec["filename"] in self.model_path) applies — prefer equality againstos.path.basename(self.model_path).As per coding guidelines: "Look for code duplication."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/FaceNet.py` around lines 12 - 53, FaceNet.get_session duplicates the lazy-session-with-registry-lookup logic also present in YOLO.get_session; extract a shared helper (e.g., a module-level function or mixin named _build_onnx_session(model_path)) that encapsulates the "check file exists → lookup MODEL_REGISTRY → raise friendly RuntimeError → create onnxruntime.InferenceSession and return it" logic, then have FaceNet.get_session and YOLO.get_session call that helper and set input_tensor_name/output_tensor_name from the returned session; also change the registry lookup from substring matching (spec["filename"] in self.model_path) to exact equality against os.path.basename(self.model_path) to avoid false matches.docs/backend/backend_python/openapi.json (1)
1559-1761: ⚖️ Poor tradeoff
/models/*endpoints emit empty response schemas.All six new
/modelsendpoints have"schema": {}for the 200 body, because the corresponding FastAPI handlers inbackend/app/routes/models.pyreturn rawdicts without aresponse_model=. This degrades the generated OpenAPI quality (no client-side typing, no validation). Worth adding Pydantic response models (e.g.,ModelStatusResponse,HardwareInfoResponse,SetupResponse,DeleteModelResponse,DownloadStartResponse) and wiring them withresponse_model=...so the spec — and downstream tooling — gets full types.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/backend/backend_python/openapi.json` around lines 1559 - 1761, The /models/* endpoints currently emit empty response schemas because the FastAPI handlers in backend/app/routes/models.py return raw dicts without response_model; define Pydantic models (e.g., ModelStatusResponse, HardwareInfoResponse, SetupResponse, DeleteModelResponse, DownloadStartResponse, DownloadProgressResponse) in the existing schemas module (or create backend/app/schemas/models.py) and update each route decorator/endpoint in models.py to include response_model=<appropriate_model> so FastAPI generates proper OpenAPI types and validation (match each operationId: get_model_status_models_status_get, get_hardware_recommendation_models_hardware_get, setup_models_models_setup_post, delete_model_models__model_key__delete, start_download_model_models_download__model_key__post, download_progress_models_download__task_id__progress_get).backend/app/utils/model_downloader.py (1)
28-40: 💤 Low valueUse lazy
%-style logging, not f-strings.
logger.info(f"Model {model_key} ...")(and several siblings) eagerly formats even when the level is filtered out. Switch tologger.info("Model %s ...", model_key). This is a Python best-practice nit but applies uniformly throughout this new file.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/utils/model_downloader.py` around lines 28 - 40, The logger calls in this module (e.g., inside ensure_model and any sibling functions) currently use f-strings which eagerly format messages; replace them with lazy %-style logging by changing statements like logger.info(f"...{model_key}...") to logger.info("...%s...", model_key) (and similarly for logger.debug/warning/error calls), ensuring all log calls in model_downloader.py use format string plus arguments with the module-level logger name (logger) rather than f-strings so message interpolation is deferred when the log level is disabled.frontend/src/types/models.ts (1)
61-71: 💤 Low valueHardcoded tier sizes can drift from the backend registry.
The
~21 MB/~75.7 MB/~161 MBnumbers are baked intoMODEL_TIER_LABELShere, but the authoritativesize_mblives inbackend/app/models/model_registry.pyand is also returned via/models/statusperModelEntry. If the backend bumps a tier's model size, this label will silently be wrong. Consider deriving the displayed size from the live/models/statuspayload (sum of object + face detection modelsize_mbfor the tier), or at minimum adding a comment cross-referencing the registry as the source of truth.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/src/types/models.ts` around lines 61 - 71, Replace the hardcoded size strings in MODEL_TIER_LABELS by deriving the displayed size from the backend /models/status ModelEntry payload (sum of object + face detection ModelEntry.size_mb for each tier) at runtime: update wherever MODEL_TIER_LABELS is consumed to compute label text using the live sizes (e.g., "Nano (~{size} MB)") instead of fixed values, falling back to the existing strings in MODEL_TIER_DESCRIPTIONS if the API is unavailable; reference the frontend consumers of MODEL_TIER_LABELS and the backend ModelEntry returned by /models/status to locate where to fetch and format the sizes and ensure the formatted string replaces the current hardcoded entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/build-and-release.yml:
- Around line 32-35: After the PyInstaller step that runs "cd backend &&
pyinstaller PictoPy.spec", add a post-build verification step that scans the
built artifact directory backend/dist/PictoPy_Server for any files matching
*.onnx and fails the job if any are found; implement this check as a separate
job step (named similarly to "Verify no .onnx artifacts") that returns
non‑zero/aborts on a match so the workflow fails, and repeat the same
verification step for the equivalent post-build locations referenced around the
other build stages (the sections corresponding to lines ~67-70 and ~100-103) to
ensure no .onnx files are bundled anywhere.
In `@backend/app/models/FaceNet.py`:
- Line 15: The PEP 604 union annotation on self._session
(onnxruntime.InferenceSession | None) requires Python 3.10+; either declare the
minimum Python version in packaging or make annotations forward-compatible: add
requires-python = ">=3.10" to backend/pyproject.toml, or add from __future__
import annotations at the top of backend/app/models/FaceNet.py (and any other
files using the `|` type syntax) so the annotations are treated as strings and
work on older interpreters.
In `@backend/app/routes/models.py`:
- Around line 230-255: The SSE event_generator uses
asyncio.wait_for(entry.queue.get(), timeout=60.0) which makes heartbeats only
every 60s and risks reverse-proxy timeouts; change the heartbeat timeout to
15–20s (e.g., 15s) in event_generator to emit frequent heartbeats and avoid
proxy disconnects, and prevent multiple consumers from draining the same
entry.queue by either rejecting new connections when an active listener exists
(check a flag like entry.listener_active or entry.listener_id before accepting a
second /download/{task_id}/progress request) or create a per-listener queue when
a new client connects and fan out messages from the single producer to each
listener queue; also ensure to clear the listener flag/queue in the finally
block where download_tasks.pop(task_id, None) is executed.
- Around line 88-111: The delete_model handler currently unconditionally removes
the model file; change it to first check a live-session registry (e.g., a global
dict mapping model_key -> list/instance of InferenceSession used by
FaceNet/YOLO) and if any active sessions exist return HTTP 409 CONFLICT with a
clear message (or, if desired, call .close() on those InferenceSession objects
before proceeding); use the MODEL_REGISTRY and get_model_path symbols to locate
the model, and replace the blocking os.remove(path) with await
asyncio.to_thread(os.remove, path) to avoid blocking the event loop; ensure the
error handling still returns 500 on unexpected failures and that the response
messages reference model_key.
- Around line 132-168: Concurrent downloads can race and corrupt the same
model_path because background_setup/ensure_model may run in parallel for the
same model_key; add deduplication by tracking in-flight downloads per model_key
and reusing or serializing them. Modify the logic that iterates
models_to_download / calls ensure_model so it first checks a new in_flight dict
(map model_key -> DownloadTaskEntry or asyncio.Lock) and if an entry exists
returns the existing task_id (or awaits the existing task) instead of spawning a
duplicate task; alternatively create a per-key asyncio.Lock (e.g.,
in_flight_locks: Dict[str, asyncio.Lock]) and wrap the ensure_model call with
async with in_flight_locks[model_key]: to serialize access, and ensure you clean
up the in_flight entry/lock on completion or error; update download_tasks and
DownloadTaskEntry usage so task lookup/cleanup is consistent.
In `@backend/app/utils/model_downloader.py`:
- Around line 86-98: The progress percent calculation can exceed 100 if the
server misreports Content-Length; update the two places where percent is
computed (inside the async download loops that check if progress_callback and
total_size > 0) to clamp the value before calling progress_callback (e.g.,
compute percent = (downloaded / total_size) * 100 then set percent =
min(percent, 100.0)); apply this change to both progress sites in the async
chunk loops that use downloaded, total_size, progress_callback, and response.
- Around line 64-98: The 416 branch currently nests a fresh client.stream inside
an active outer client.stream and also calls response.raise_for_status() after
deriving mode/total_size/downloaded; update the logic in model_downloader.py so
you first call response.raise_for_status() (or branch on response.status_code)
immediately after acquiring the outer response, then if response.status_code ==
416 close/exit the outer stream before starting a new client.stream("GET", url)
for a full download, remove the partial file (model_path) and perform the fresh
download into "wb" while reporting progress; ensure you still compute
total_size/downloaded only after opening the appropriate stream and not while
the outer response remains open.
In `@docs/Manual_Setup_Guide.md`:
- Around line 49-59: Add language identifiers ("bash") to the three fenced code
blocks containing the commands "cd frontend", "npm install", and "npm run tauri
dev" so the markdown shows as ```bash ... ``` instead of ``` ... ```; update
those fenced blocks in Manual_Setup_Guide.md to silence MD040 warnings and
ensure proper syntax highlighting.
In `@docs/Script_Setup_Guide.md`:
- Around line 45-46: Fix the Markdown blockquote by removing the blank line
inside the quoted paragraph that contains "The setup script installs the
CPU-only `onnxruntime` package for local development. The release workflow swaps
in platform-specific GPU providers, and the onboarding recommendation uses
hardware detection (rather than ONNX Runtime provider detection)." — ensure the
blockquote lines are contiguous (no empty line between them) so markdownlint
MD028 is not triggered.
In `@frontend/src/features/onboardingSlice.ts`:
- Around line 47-48: Compute the index once from state.stepStatus
(state.currentStepIndex = state.stepStatus.findIndex(s => !s)); then if that
index is -1 set state.currentStepName to the explicit completed terminal step
(e.g. STEPS.COMPLETED or whatever terminal value your STEPS enum uses) instead
of falling back to STEPS.SERVER_CHECK; otherwise set state.currentStepName =
STEP_NAMES[state.currentStepIndex]. This ensures the fully-completed case is
represented as completed rather than remapped to SERVER_CHECK; update the logic
around state.currentStepIndex and state.currentStepName accordingly.
---
Outside diff comments:
In `@backend/app/routes/folders.py`:
- Around line 235-245: enable_ai_tagging currently updates DB and schedules
post_AI_tagging_enabled_sequence in the background but doesn't verify models are
available; perform the model preflight synchronously before returning success by
calling the model-check/preflight routine (or invoking
post_AI_tagging_enabled_sequence synchronously or a dedicated
ensure_models_available() helper) and raise/return an error if the preflight
fails, then only call db_enable_ai_tagging_batch and schedule the background
executor (app_state.executor.submit) after the preflight passes; reference
enable_ai_tagging, db_enable_ai_tagging_batch, post_AI_tagging_enabled_sequence,
and app_state.executor when making the change.
---
Nitpick comments:
In `@backend/app/config/settings.py`:
- Around line 6-11: Consolidate the frozen-vs-dev model path logic into a single
helper in settings by exposing a single computed symbol (e.g.,
MODEL_EXPORTS_PATH or a function get_model_exports_path()) that encapsulates the
getattr(sys, "frozen", False) branch; then remove the duplicate branch in
backend/app/models/model_registry.py and import and use that settings symbol
instead so both model download and load paths reference the same source of truth
(replace any local path computation in model_registry.py with
MODEL_EXPORTS_PATH/get_model_exports_path()).
In `@backend/app/models/FaceNet.py`:
- Around line 12-53: FaceNet.get_session duplicates the
lazy-session-with-registry-lookup logic also present in YOLO.get_session;
extract a shared helper (e.g., a module-level function or mixin named
_build_onnx_session(model_path)) that encapsulates the "check file exists →
lookup MODEL_REGISTRY → raise friendly RuntimeError → create
onnxruntime.InferenceSession and return it" logic, then have FaceNet.get_session
and YOLO.get_session call that helper and set
input_tensor_name/output_tensor_name from the returned session; also change the
registry lookup from substring matching (spec["filename"] in self.model_path) to
exact equality against os.path.basename(self.model_path) to avoid false matches.
In `@backend/app/models/YOLO.py`:
- Around line 22-24: The class __init__ currently performs a local import
("import threading") and then uses threading.Lock() to set self._lock, causing
repeated imports; move the import of threading to the module top-level
(alongside existing imports like onnxruntime) and remove the in-method import so
__init__ only does self._lock = threading.Lock(), updating YOLO.__init__
accordingly.
- Around line 33-46: Move the imports "import os" and "from
app.models.model_registry import MODEL_REGISTRY" out of get_session() to the
module top-level, and update the model lookup in get_session() to compare
os.path.basename(self.model_path) for equality against spec["filename"] (instead
of using the substring check spec["filename"] in self.model_path) so
MODEL_REGISTRY key selection (model_key) is exact; keep raising the same
RuntimeError with model_name derived from the matched key or basename when not
found.
- Around line 84-93: The getters get_input_details() and get_output_details()
read self._session directly and can raise AttributeError if called after
close(); change them to obtain the session via get_session() (or at minimum
assert/check it's not None) before accessing inputs/outputs: call sess =
self.get_session() (or if you prefer an explicit guard, raise a clear runtime
error when self._session is None) and then use
sess.get_inputs()/sess.get_outputs(); update references to self._session in
get_input_details and get_output_details and add a short docstring noting they
rely on an active session.
- Around line 60-64: The YOLO.close() method currently only clears self._session
under the lock, leaving self.input_names, self.output_names, and
self.input_shape set; update YOLO.close() to mirror FaceNet.close() by resetting
self.input_names, self.output_names, and self.input_shape (inside the same with
self._lock block) when clearing the session so the internal state is consistent
and no dangling tensor metadata remains.
In `@backend/app/routes/models.py`:
- Line 5: The import list in models.py includes an unused symbol Optional from
typing; remove Optional from the import statement so only used types remain
(e.g., change "from typing import Dict, Optional" to "from typing import Dict")
and run linters/tests to confirm no other references to Optional exist in
functions or classes in this module.
- Around line 49-68: The current get_model_status function performs blocking
os.path.exists calls in a synchronous handler; if you ever convert
get_model_status to async, avoid blocking the event loop by running filesystem
checks in a thread using asyncio.to_thread (e.g. wrap get_model_path/key
existence checks), or simply keep the function synchronous; update calls that
reference MODEL_REGISTRY and get_model_path accordingly so all os.path.exists
checks are dispatched via asyncio.to_thread when converting to async.
- Around line 22-44: The module-global download_tasks dict (used by
download_tasks, _cleanup_stale_tasks, setup_models, start_download_model, and
event_generator) is unsafe for multi-worker deployments; add a clear runtime
assertion and docstring that we require a single worker: on application startup
check common env vars (e.g. UVICORN_WORKERS, WEB_CONCURRENCY) and/or a
configured worker count and raise a RuntimeError if they indicate >1 (or log and
exit), and update the module docstring to state the single-worker assumption and
that a shared backing store (Redis) is required to support multiple workers in
the future.
In `@backend/app/utils/hardware_detect.py`:
- Around line 64-71: get_hardware_info currently calls detect_physical_gpu() to
populate gpu_names but then calls detect_hardware_tier() which re-probes GPUs;
change the flow so detect_hardware_tier reuses the already-computed gpu_names
instead of re-running detection: update detect_hardware_tier to accept an
optional gpu_names/list parameter (or add a new helper
detect_hardware_tier_from_list) and call it from get_hardware_info with the
gpu_names variable to compute recommended_tier, avoiding the duplicate
detect_physical_gpu() subprocess call and ensuring consistent results.
In `@backend/app/utils/model_downloader.py`:
- Around line 28-40: The logger calls in this module (e.g., inside ensure_model
and any sibling functions) currently use f-strings which eagerly format
messages; replace them with lazy %-style logging by changing statements like
logger.info(f"...{model_key}...") to logger.info("...%s...", model_key) (and
similarly for logger.debug/warning/error calls), ensuring all log calls in
model_downloader.py use format string plus arguments with the module-level
logger name (logger) rather than f-strings so message interpolation is deferred
when the log level is disabled.
In `@docs/backend/backend_python/openapi.json`:
- Around line 1559-1761: The /models/* endpoints currently emit empty response
schemas because the FastAPI handlers in backend/app/routes/models.py return raw
dicts without response_model; define Pydantic models (e.g., ModelStatusResponse,
HardwareInfoResponse, SetupResponse, DeleteModelResponse, DownloadStartResponse,
DownloadProgressResponse) in the existing schemas module (or create
backend/app/schemas/models.py) and update each route decorator/endpoint in
models.py to include response_model=<appropriate_model> so FastAPI generates
proper OpenAPI types and validation (match each operationId:
get_model_status_models_status_get,
get_hardware_recommendation_models_hardware_get, setup_models_models_setup_post,
delete_model_models__model_key__delete,
start_download_model_models_download__model_key__post,
download_progress_models_download__task_id__progress_get).
In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx`:
- Around line 85-110: The model-status check in useEffect (checkModelStatus) can
hang indefinitely; wrap the fetch with an AbortController and a timeout so the
request is cancelled on cleanup or after ~5–10s: create an AbortController, pass
its signal to fetch in checkModelStatus, start a timeout that calls
controller.abort() after the chosen duration, clear the timeout and call
controller.abort() in the useEffect cleanup, and ensure
setIsCheckingStatus(false) and dispatch(markCompleted(stepIndex)) still run
appropriately when aborted or timed out.
- Around line 144-156: Add an explicit response type for the /models/setup call
by creating a SetupResponse interface in frontend/src/types/models.ts (fields:
success: boolean, detail?: string, task_id?: string) and import it into
AIModelSetupStep.tsx; then type the parsed JSON as SetupResponse (e.g., const
data: SetupResponse = await response.json()) so accesses to data.success /
data.detail / data.task_id are statically typed and conform to the Action
contract.
- Around line 88-89: The fetch call uses await res.json() and casts it to
ModelStatusResponse (const data: ModelStatusResponse = await res.json()), which
is unsafe; add a runtime shape check for the /models/status (and similarly
/models/hardware) response before using data.data — e.g., define a lightweight
validator (Zod schema or a hand-written type predicate) that asserts the
top-level object has the expected fields (e.g., data: Array with required
properties), parse/validate the parsed JSON (instead of blind casting), and if
validation fails handle the error path (log/show an error and avoid iterating in
getInstalledModelTiers), ensuring BACKEND_URL fetch handling checks res.ok and
uses the validated value thereafter.
- Around line 113-136: The effect currently refetches on every viewMode change
and always overwrites selectedTier and also closes eventSourceRef on every dep
change; change it so the hardware fetch only runs once (e.g., move the fetch
into a useEffect with empty deps or guard with a fetchedRef) and store the
recommended tier in state via setRecommendedTier without unconditionally calling
setSelectedTier if the user has already changed the dropdown (use a userChanged
flag or only setSelectedTier when selectedTier is undefined/default). Also
extract the SSE cleanup into its own unmount-only useEffect that returns () => {
if (eventSourceRef.current) eventSourceRef.current.close(); } so eventSourceRef
is closed only on unmount and not on every viewMode change; keep references to
useEffect, viewMode, fetch(.../models/hardware), setRecommendedTier,
setSelectedTier, and eventSourceRef to locate the changes.
In `@frontend/src/types/models.ts`:
- Around line 61-71: Replace the hardcoded size strings in MODEL_TIER_LABELS by
deriving the displayed size from the backend /models/status ModelEntry payload
(sum of object + face detection ModelEntry.size_mb for each tier) at runtime:
update wherever MODEL_TIER_LABELS is consumed to compute label text using the
live sizes (e.g., "Nano (~{size} MB)") instead of fixed values, falling back to
the existing strings in MODEL_TIER_DESCRIPTIONS if the API is unavailable;
reference the frontend consumers of MODEL_TIER_LABELS and the backend ModelEntry
returned by /models/status to locate where to fetch and format the sizes and
ensure the formatted string replaces the current hardcoded entries.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e7009d00-6d7a-4689-9598-f4a537822b68
📒 Files selected for processing (31)
.github/workflows/build-and-release.yml.gitignorebackend/.gitignorebackend/PictoPy.specbackend/app/config/settings.pybackend/app/models/FaceNet.pybackend/app/models/ONNX_Exports/FaceNet_128D.onnxbackend/app/models/ONNX_Exports/YOLOv11_Medium.onnxbackend/app/models/ONNX_Exports/YOLOv11_Medium_Face.onnxbackend/app/models/ONNX_Exports/YOLOv11_Nano.onnxbackend/app/models/ONNX_Exports/YOLOv11_Nano_Face.onnxbackend/app/models/ONNX_Exports/YOLOv11_Small.onnxbackend/app/models/ONNX_Exports/YOLOv11_Small_Face.onnxbackend/app/models/YOLO.pybackend/app/models/model_registry.pybackend/app/routes/folders.pybackend/app/routes/models.pybackend/app/utils/hardware_detect.pybackend/app/utils/model_bootstrap.pybackend/app/utils/model_downloader.pybackend/main.pybackend/requirements.txtdocs/Manual_Setup_Guide.mddocs/Script_Setup_Guide.mddocs/backend/backend_python/openapi.jsonfrontend/src/components/OnboardingSteps/AIModelSetupStep.tsxfrontend/src/components/OnboardingSteps/OnboardingStep.tsxfrontend/src/constants/steps.tsfrontend/src/features/onboardingSlice.tsfrontend/src/pages/SettingsPage/components/UserPreferencesCard.tsxfrontend/src/types/models.ts
| > The setup script installs the CPU-only `onnxruntime` package for local development. The release workflow swaps in platform-specific GPU providers, and the onboarding recommendation uses hardware detection (rather than ONNX Runtime provider detection). | ||
|
|
There was a problem hiding this comment.
Fix blockquote formatting to avoid MD028 warning.
There is a blank line inside a blockquote sequence; markdownlint will flag this.
Suggested fix
- > The setup script installs the CPU-only `onnxruntime` package for local development. The release workflow swaps in platform-specific GPU providers, and the onboarding recommendation uses hardware detection (rather than ONNX Runtime provider detection).
-
+ > The setup script installs the CPU-only `onnxruntime` package for local development. The release workflow swaps in platform-specific GPU providers, and the onboarding recommendation uses hardware detection (rather than ONNX Runtime provider detection).🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 46-46: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/Script_Setup_Guide.md` around lines 45 - 46, Fix the Markdown blockquote
by removing the blank line inside the quoted paragraph that contains "The setup
script installs the CPU-only `onnxruntime` package for local development. The
release workflow swaps in platform-specific GPU providers, and the onboarding
recommendation uses hardware detection (rather than ONNX Runtime provider
detection)." — ensure the blockquote lines are contiguous (no empty line between
them) so markdownlint MD028 is not triggered.
- Add an explicit artifact check that fails if any .onnx file is still bundled.
- Use from __future__ import annotations for Python <3.9 compatibility.
- DELETE /{model_key} delete guard against active sessions.
- De-duplication for concurrent /setup or /download/{model_key} for the same model.
- SSE heartbeat interval.
- Nested HTTP streams on 416 raise_for_status() ordering.
- Fix divide-by-zero edge case.
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/models/FaceNet.py`:
- Around line 28-66: get_session() can return a session that later has its
input/output names cleared by close(); capture the session and its tensor names
while holding self._lock and return them together (e.g., return (session,
input_name, output_name)) or snapshot self.input_tensor_name and
self.output_tensor_name into locals before releasing the lock, then update
get_embedding() to call get_session() and use the returned/snapshotted names
instead of reading attributes after get_session() returns; reference
get_session, get_embedding, self._lock, self.input_tensor_name and
self.output_tensor_name when making this change.
In `@backend/app/routes/models.py`:
- Around line 104-306: Add async tests covering delete_model (conflict when
get_active_session_count > 0 and successful deletion), per-model download
serialization using _get_model_download_lock and ensure_model to assert
concurrent downloads for same model are serialized, SSE single-listener behavior
and reconnect/cleanup via download_progress and DownloadTaskEntry/download_tasks
ensuring only one active listener is allowed and tasks are removed after
completion; implement these tests in the async test suite (use
TestClient/AsyncClient) to start background tasks, push progress messages to the
queue, simulate timeouts/heartbeats, assert HTTP 409 for second listener and
that task entries are popped on normal completion and remain when cancelled.
- Around line 115-128: The delete path uses a snapshot via
get_active_session_count and then removes the file (await
asyncio.to_thread(os.remove, path)) which races with concurrent session creation
(e.g., FaceNet.get_session / YOLO.get_session); instead, acquire the same
per-model lifecycle lock used by session creation before checking existence and
deleting: locate the lock manager used by session creation, wrap the sequence in
its lock (get_model_path -> check os.path.exists -> remove) so no new
InferenceSession can be created while deletion proceeds, and release the lock
afterwards.
In `@backend/PictoPy.spec`:
- Around line 8-29: Remove the broad Python package copy from PyInstaller spec
by deleting or changing the datas definition and its application: remove the
datas entry that sets datas=[('app','app')] and stop assigning a.datas =
exclude_models(a.datas); instead, restrict datas to only explicit non-Python
runtime assets (e.g., config or data file tuples) and if needed keep a small
whitelist rather than copying the entire 'app' package; update or remove the
exclude_models(...) helper (function exclude_models and its call a.datas =
exclude_models(a.datas)) so PyInstaller no longer redundantly bundles the 'app'
package.
In `@frontend/src/features/onboardingSlice.ts`:
- Line 8: The code relies on STEP_NAMES[STEP_NAMES.length - 1] to derive
TERMINAL_STEP_NAME which implicitly depends on the insertion order of STEPS;
change this to use an explicit terminal identifier (e.g., add and reference a
named constant like STEPS.TERMINAL or STEPS.UPDATE_STEP) or add a runtime
assertion that STEP_NAMES[last] equals the expected terminal key and document
the ordering requirement; update all uses of TERMINAL_STEP_NAME accordingly and
reference the symbols TERMINAL_STEP_NAME, STEP_NAMES and STEPS when making the
change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f74877e3-54f8-4fe8-b742-fcbb5165b33d
📒 Files selected for processing (20)
.github/workflows/build-and-release.ymlbackend/PictoPy.specbackend/app/models/FaceNet.pybackend/app/models/ObjectClassifier.pybackend/app/models/YOLO.pybackend/app/models/model_registry.pybackend/app/models/session_registry.pybackend/app/routes/models.pybackend/app/schemas/images.pybackend/app/schemas/test.pybackend/app/utils/hardware_detect.pybackend/app/utils/images.pybackend/app/utils/model_downloader.pydocs/Manual_Setup_Guide.mddocs/Script_Setup_Guide.mdfrontend/src/components/OnboardingSteps/AIModelSetupStep.tsxfrontend/src/components/OnboardingSteps/OnboardingStep.tsxfrontend/src/features/onboardingSlice.tsfrontend/src/pages/SettingsPage/components/UserPreferencesCard.tsxfrontend/src/types/models.ts
✅ Files skipped from review due to trivial changes (2)
- backend/app/models/ObjectClassifier.py
- backend/app/utils/images.py
🚧 Files skipped from review as they are similar to previous changes (9)
- backend/app/models/model_registry.py
- frontend/src/types/models.ts
- backend/app/models/YOLO.py
- frontend/src/pages/SettingsPage/components/UserPreferencesCard.tsx
- backend/app/utils/hardware_detect.py
- backend/app/utils/model_downloader.py
- .github/workflows/build-and-release.yml
- frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx
- frontend/src/components/OnboardingSteps/OnboardingStep.tsx
- snapshot tensor names to avoid close() invalidating session.run() inputs. - lock model deletion to prevent delete-while-in-use races
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
backend/app/models/session_registry.py (3)
1-7: ⚡ Quick winAdd module docstring to document purpose and threading model.
The module lacks a docstring explaining its role in preventing model deletion races and its thread-safety guarantees. This would help future maintainers understand the concurrency model and the relationship between active sessions and pending deletions.
📝 Suggested module docstring
from __future__ import annotations +""" +Thread-safe registry for tracking active ONNX model sessions and pending deletions. + +Prevents race conditions where a model is deleted while sessions are still active. +All operations are guarded by a module-level lock to ensure atomicity. +""" + import threading🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/session_registry.py` around lines 1 - 7, Add a module-level docstring at the top of session_registry.py that states the module's purpose (tracking active sessions to prevent model deletion races), describes the threading model and concurrency guarantees, and documents the shared globals _active_sessions, _models_pending_deletion and the use of _registry_lock to protect them; keep it concise, mention that all access to _active_sessions and _models_pending_deletion must be performed while holding _registry_lock (or via provided helper functions) and note expected types (dict[str,int], set[str]) for maintainers.
10-44: ⚡ Quick winAdd docstrings to all public functions.
All five public functions lack docstrings. Given that this module provides a critical concurrency primitive, documenting the expected behavior, parameters, return values, and exceptions would significantly improve maintainability.
📝 Suggested function docstrings
def mark_model_session_active(model_key: str) -> None: + """ + Mark a model session as active, incrementing its reference count. + + Args: + model_key: Unique identifier for the model. + + Raises: + RuntimeError: If the model is marked for deletion. + """ with _registry_lock:def mark_model_session_inactive(model_key: str) -> None: + """ + Mark a model session as inactive, decrementing its reference count. + + Removes the model from the active sessions map when count reaches zero. + Safe to call even if count is already zero (no-op). + + Args: + model_key: Unique identifier for the model. + """ with _registry_lock:def get_active_session_count(model_key: str) -> int: + """ + Get the current count of active sessions for a model. + + Args: + model_key: Unique identifier for the model. + + Returns: + Number of active sessions, or 0 if none. + """ with _registry_lock:def try_mark_model_for_deletion(model_key: str) -> int | None: + """ + Attempt to mark a model for deletion. + + Args: + model_key: Unique identifier for the model. + + Returns: + None if successfully marked for deletion (no active sessions). + Active session count if deletion cannot proceed (sessions exist). + """ with _registry_lock:def release_model_deletion_mark(model_key: str) -> None: + """ + Remove the deletion mark from a model. + + Safe to call even if the model is not marked for deletion. + + Args: + model_key: Unique identifier for the model. + """ with _registry_lock:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/session_registry.py` around lines 10 - 44, Add comprehensive docstrings to each public function (mark_model_session_active, mark_model_session_inactive, get_active_session_count, try_mark_model_for_deletion, release_model_deletion_mark) describing purpose/behavior, parameters and their types, return values and types, raised exceptions (e.g., RuntimeError when attempting to start a session for a model pending deletion), and concurrency/thread-safety guarantees (that they are protected by _registry_lock). Keep them concise, follow project docstring style (short summary + details), and ensure they mention side effects on _active_sessions and _models_pending_deletion where applicable.
19-25: 💤 Low valueConsider documenting or validating the edge case when count is 0.
The function silently handles the case where
mark_model_session_inactiveis called when no sessions are active (count == 0). While this is safe due to the defensivepop(model_key, None), it may indicate a programming error in the calling code (mismatched active/inactive calls).Consider either:
- Adding a comment explaining that count <= 1 handles both "last session" (count==1) and "defensive no-op" (count==0) cases, or
- Adding validation that raises an exception if count is 0, which would help catch bugs in session lifecycle management.
The current implementation is safe but may hide bugs in FaceNet/YOLO close() logic if those are called multiple times.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/models/session_registry.py` around lines 19 - 25, mark_model_session_inactive currently treats count <= 1 the same and silently no-ops when count == 0; update the function (mark_model_session_inactive) to detect the count==0 case and raise a clear exception (e.g., ValueError("mark_model_session_inactive called with no active sessions for {model_key}")) to catch mismatched open/close calls, while keeping the existing lock (_registry_lock) and decrement/pop logic for count==1; also add a one-line comment above the check explaining that count==1 removes the last session and count==0 is considered a programming error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@backend/app/models/session_registry.py`:
- Around line 1-44: Add comprehensive unit tests for session_registry covering
mark_model_session_active, mark_model_session_inactive,
get_active_session_count, try_mark_model_for_deletion, and
release_model_deletion_mark: test basic increment/decrement semantics (multiple
mark_model_session_active increases count, mark_model_session_inactive decreases
and removes key when zero, extra inactive calls are no-ops), test that
try_mark_model_for_deletion returns active count when sessions exist and adds
model to _models_pending_deletion when none (and release_model_deletion_mark
removes that mark), and add concurrency tests using threading to spawn many
concurrent mark_model_session_active/mark_model_session_inactive calls to assert
final counts and that try_mark_model_for_deletion correctly detects active
sessions under race; include an explicit test that attempting to
mark_model_session_active while a model is pending deletion raises RuntimeError.
Ensure tests reset global registry state between cases.
---
Nitpick comments:
In `@backend/app/models/session_registry.py`:
- Around line 1-7: Add a module-level docstring at the top of
session_registry.py that states the module's purpose (tracking active sessions
to prevent model deletion races), describes the threading model and concurrency
guarantees, and documents the shared globals _active_sessions,
_models_pending_deletion and the use of _registry_lock to protect them; keep it
concise, mention that all access to _active_sessions and
_models_pending_deletion must be performed while holding _registry_lock (or via
provided helper functions) and note expected types (dict[str,int], set[str]) for
maintainers.
- Around line 10-44: Add comprehensive docstrings to each public function
(mark_model_session_active, mark_model_session_inactive,
get_active_session_count, try_mark_model_for_deletion,
release_model_deletion_mark) describing purpose/behavior, parameters and their
types, return values and types, raised exceptions (e.g., RuntimeError when
attempting to start a session for a model pending deletion), and
concurrency/thread-safety guarantees (that they are protected by
_registry_lock). Keep them concise, follow project docstring style (short
summary + details), and ensure they mention side effects on _active_sessions and
_models_pending_deletion where applicable.
- Around line 19-25: mark_model_session_inactive currently treats count <= 1 the
same and silently no-ops when count == 0; update the function
(mark_model_session_inactive) to detect the count==0 case and raise a clear
exception (e.g., ValueError("mark_model_session_inactive called with no active
sessions for {model_key}")) to catch mismatched open/close calls, while keeping
the existing lock (_registry_lock) and decrement/pop logic for count==1; also
add a one-line comment above the check explaining that count==1 removes the last
session and count==0 is considered a programming error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d8a26501-adcd-4eb2-a57a-380e45098f90
📒 Files selected for processing (6)
backend/app/models/FaceNet.pybackend/app/models/YOLO.pybackend/app/models/model_registry.pybackend/app/models/session_registry.pybackend/app/routes/models.pybackend/app/utils/model_bootstrap.py
🚧 Files skipped from review as they are similar to previous changes (5)
- backend/app/utils/model_bootstrap.py
- backend/app/models/FaceNet.py
- backend/app/models/model_registry.py
- backend/app/models/YOLO.py
- backend/app/routes/models.py
| from __future__ import annotations | ||
|
|
||
| import threading | ||
|
|
||
| _active_sessions: dict[str, int] = {} | ||
| _models_pending_deletion: set[str] = set() | ||
| _registry_lock = threading.Lock() | ||
|
|
||
|
|
||
| def mark_model_session_active(model_key: str) -> None: | ||
| with _registry_lock: | ||
| if model_key in _models_pending_deletion: | ||
| raise RuntimeError( | ||
| f"Model '{model_key}' is being deleted; cannot start a new session." | ||
| ) | ||
| _active_sessions[model_key] = _active_sessions.get(model_key, 0) + 1 | ||
|
|
||
|
|
||
| def mark_model_session_inactive(model_key: str) -> None: | ||
| with _registry_lock: | ||
| count = _active_sessions.get(model_key, 0) | ||
| if count <= 1: | ||
| _active_sessions.pop(model_key, None) | ||
| return | ||
| _active_sessions[model_key] = count - 1 | ||
|
|
||
|
|
||
| def get_active_session_count(model_key: str) -> int: | ||
| with _registry_lock: | ||
| return _active_sessions.get(model_key, 0) | ||
|
|
||
|
|
||
| def try_mark_model_for_deletion(model_key: str) -> int | None: | ||
| with _registry_lock: | ||
| count = _active_sessions.get(model_key, 0) | ||
| if count > 0: | ||
| return count | ||
| _models_pending_deletion.add(model_key) | ||
| return None | ||
|
|
||
|
|
||
| def release_model_deletion_mark(model_key: str) -> None: | ||
| with _registry_lock: | ||
| _models_pending_deletion.discard(model_key) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Search for test files covering session_registry.py
# Search for test files that import or reference session_registry
rg -l 'session_registry' --glob '*test*.py'
# Also check for any test files in backend/tests or backend/app/tests
fd -t f 'test.*\.py$' backend/Repository: AOSSIE-Org/PictoPy
Length of output: 293
🏁 Script executed:
# Search in test files for session_registry references
rg 'session_registry|_active_sessions|mark_model_session' backend/tests/ backend/test.py --type py
# Also check if session_registry is imported anywhere in the codebase
rg 'from.*session_registry|import.*session_registry' --type py
# Check what imports this module
rg 'mark_model_session_active|mark_model_session_inactive|get_active_session_count|try_mark_model_for_deletion|release_model_deletion_mark' --type pyRepository: AOSSIE-Org/PictoPy
Length of output: 1669
🏁 Script executed:
# Check if YOLO.py or FaceNet.py are tested
rg 'test.*yolo|test.*facenet|YOLO|FaceNet' backend/tests/ -i
# Check conftest.py and test files to see what they test
cat -n backend/tests/conftest.py
# List what each test file tests
head -20 backend/tests/test_albums.py
head -20 backend/tests/test_face_clusters.py
head -20 backend/tests/test_folders.py
head -20 backend/tests/test_user_preferences.pyRepository: AOSSIE-Org/PictoPy
Length of output: 7301
Add comprehensive tests for session_registry concurrency logic.
This module implements critical race condition prevention for model lifecycle management and is actively used in production (imported in YOLO.py, FaceNet.py, and routes/models.py). Currently, it has no test coverage. Tests should verify:
- Correct increment/decrement behavior of session counts
- Proper rejection of sessions when marked for deletion
- Thread-safety under concurrent access
- Edge cases (e.g., multiple inactive calls, deletion attempts with active sessions)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/models/session_registry.py` around lines 1 - 44, Add
comprehensive unit tests for session_registry covering
mark_model_session_active, mark_model_session_inactive,
get_active_session_count, try_mark_model_for_deletion, and
release_model_deletion_mark: test basic increment/decrement semantics (multiple
mark_model_session_active increases count, mark_model_session_inactive decreases
and removes key when zero, extra inactive calls are no-ops), test that
try_mark_model_for_deletion returns active count when sessions exist and adds
model to _models_pending_deletion when none (and release_model_deletion_mark
removes that mark), and add concurrency tests using threading to spawn many
concurrent mark_model_session_active/mark_model_session_inactive calls to assert
final counts and that try_mark_model_for_deletion correctly detects active
sessions under race; include an explicit test that attempting to
mark_model_session_active while a model is pending deletion raises RuntimeError.
Ensure tests reset global registry state between cases.
Addressed Issue:
Fixes #1260
This pull request introduces a robust system for managing AI model files in PictoPy, ensuring models are only loaded if present and guiding users to install missing models. It adds a centralized model registry, improves the reliability of ONNX model loading, and updates the build process to exclude model files from the distribution. The main themes are model management, runtime safety, and build process improvements.
Model management and runtime safety:
MODEL_REGISTRYinbackend/app/models/model_registry.pyto define all supported AI models, their download URLs, hashes, and metadata. This registry is now the single source of truth for model management.FaceNetandYOLOmodel classes to lazily and safely load ONNX models using thread-safe logic. If a model file is missing, a clear error is raised instructing the user to install the model from the settings page. [1] [2]settings.pyto use a platform-appropriate user data directory when running as a packaged app, ensuring models are stored in the correct location in both development and production.Build process improvements:
PictoPy.specPyInstaller spec file that excludes.onnxmodel files from the build, keeping the distribution lightweight. Updated the GitHub Actions workflow and.gitignoreto use this spec file and prevent model files from being bundled. [1] [2] [3] [4] [5]These changes make model management more reliable, reduce the risk of runtime errors due to missing files, and streamline the build and deployment process.
Screenshots/Recordings:
AI Usage Disclosure:
We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.
Check one of the checkboxes below:
I have used the following AI models and tools: Claude, Copilot, Gemini
Summary by CodeRabbit
New Features
Improvements
Documentation