Skip to content

AI Models Based App Size Optimization#1263

Merged
rahulharpal1603 merged 7 commits into
AOSSIE-Org:mainfrom
rohan-pandeyy:feat/models-based-app-size-optimization
May 11, 2026
Merged

AI Models Based App Size Optimization#1263
rahulharpal1603 merged 7 commits into
AOSSIE-Org:mainfrom
rohan-pandeyy:feat/models-based-app-size-optimization

Conversation

@rohan-pandeyy
Copy link
Copy Markdown
Contributor

@rohan-pandeyy rohan-pandeyy commented May 7, 2026

Addressed Issue:

Fixes #1260

This pull request introduces a robust system for managing AI model files in PictoPy, ensuring models are only loaded if present and guiding users to install missing models. It adds a centralized model registry, improves the reliability of ONNX model loading, and updates the build process to exclude model files from the distribution. The main themes are model management, runtime safety, and build process improvements.

Model management and runtime safety:

  • Introduced a centralized MODEL_REGISTRY in backend/app/models/model_registry.py to define all supported AI models, their download URLs, hashes, and metadata. This registry is now the single source of truth for model management.
  • Refactored FaceNet and YOLO model classes to lazily and safely load ONNX models using thread-safe logic. If a model file is missing, a clear error is raised instructing the user to install the model from the settings page. [1] [2]
  • Updated the AI tagging sequence to ensure required models are present before processing images, improving robustness on first run or after updates. [1] [2]
  • Changed the model export path logic in settings.py to use a platform-appropriate user data directory when running as a packaged app, ensuring models are stored in the correct location in both development and production.

Build process improvements:

  • Added a custom PictoPy.spec PyInstaller spec file that excludes .onnx model files from the build, keeping the distribution lightweight. Updated the GitHub Actions workflow and .gitignore to use this spec file and prevent model files from being bundled. [1] [2] [3] [4] [5]

These changes make model management more reliable, reduce the risk of runtime errors due to missing files, and streamline the build and deployment process.

Screenshots/Recordings:

image

AI Usage Disclosure:

We encourage contributors to use AI tools responsibly when creating Pull Requests. While AI can be a valuable aid, it is essential to ensure that your contributions meet the task requirements, build successfully, include relevant tests, and pass all linters. Submissions that do not meet these standards may be closed without warning to maintain the quality and integrity of the project. Please take the time to understand the changes you are proposing and their impact. AI slop is strongly discouraged and may lead to banning and blocking. Do not spam our repos with AI slop.

Check one of the checkboxes below:

  • This PR does not contain AI-generated code at all.
  • This PR contains AI-generated code. I have read the AI Usage Policy and this PR complies with this policy. I have tested the code locally and I am responsible for it.

I have used the following AI models and tools: Claude, Copilot, Gemini

Summary by CodeRabbit

  • New Features

    • Onboarding step to install local AI vision models with hardware-based tier recommendation, selectable tiers, and streaming download progress.
    • Model-management API: status, hardware recommendation, tiered setup, single-model download (with progress), and delete.
  • Improvements

    • Resumable, verified model downloads with retries and integrity checks; runtime session tracking to prevent unsafe deletions.
    • Settings UI shows installed tiers and exposes GPU acceleration toggle.
    • Release build now fails if ONNX artifacts are bundled.
  • Documentation

    • Setup guides clarified for local vs. release model/runtime behavior.

Review Change Stack

@rohan-pandeyy rohan-pandeyy self-assigned this May 7, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 7, 2026

Walkthrough

Removes bundled ONNX files from the installer and implements a modular system: model registry, resumable SHA‑256‑verified downloads, hardware detection, lazy ONNX session initialization, backend APIs (including SSE progress), frontend onboarding/settings integration, and a PyInstaller spec that excludes .onnx artifacts.

Changes

Modular AI Model Management System

Layer / File(s) Summary
Build & CI / Git ignore
.github/workflows/build-and-release.yml, .gitignore, backend/.gitignore
Server builds now use pyinstaller PictoPy.spec, manual app-copy steps removed, CI checks fail if *.onnx appear in backend/dist/PictoPy_Server; .gitignore ignores ONNX_Exports/ and ensures PictoPy.spec is tracked.
PyInstaller Spec
backend/PictoPy.spec
PyInstaller spec added: bundles app, lists hiddenimports, and filters out app/models/*.onnx from bundled data.
Runtime Paths
backend/app/config/settings.py
MODEL_EXPORTS_PATH computed dynamically: user-data directory when frozen, relative ../models/ONNX_Exports in development; model filename constants updated to use it.
Model Registry & Metadata
backend/app/models/model_registry.py
Adds ModelSpec TypedDict, MODEL_REGISTRY metadata entries, TIER_MODELS, and helpers to ensure/resolve exports directory and paths.
Model Download & Verification
backend/app/utils/model_downloader.py
verify_sha256() computes file SHA-256 in a worker thread; ensure_model() performs resumable HTTP downloads with httpx, handles 416, verifies SHA-256, and retries with exponential backoff.
Hardware Detection
backend/app/utils/hardware_detect.py
Detects physical NVIDIA GPUs via nvidia-smi, computes RAM-based tiers, and returns aggregated hardware info plus recommended tier.
Model Bootstrap Utility
backend/app/utils/model_bootstrap.py
ensure_ai_tagging_models() reads preferred YOLO tier from metadata, ensures required/tier models are present by invoking ensure_model().
Session Registry
backend/app/models/session_registry.py
Thread-safe active session counters plus pending-deletion tracking; APIs to mark active/inactive, try mark for deletion, and release deletion mark.
Lazy Model Session Loading
backend/app/models/YOLO.py, backend/app/models/FaceNet.py
Refactors eager ONNX session creation into lazy, thread-safe get_session() methods; inference()/get_embedding() use cached sessions; close()/__del__ clean up and update session registry.
Models Management API
backend/app/routes/models.py, backend/main.py
New FastAPI router mounted at /models exposing GET /status, GET /hardware, DELETE /{model_key}, POST /setup, POST /download/{model_key}, and GET /download/{task_id}/progress (SSE). Supports per-task queues, per-model locks, and stale-task cleanup integrated into lifespan.
Application Integration
backend/app/routes/folders.py, backend/main.py
post_AI_tagging_enabled_sequence now calls ensure_ai_tagging_models() before processing untagged images; models router registered and lifespan schedules cleanup task.
Dependencies
backend/requirements.txt
Updates httpx to 0.28.1, pins psutil==7.2.2, and removes APScheduler line.
Frontend Model Types
frontend/src/types/models.ts
Adds typed ModelTier, ModelFeature, model/hardware response interfaces, discriminated progress messages, and helper functions (isModelTier, normalizeModelTier, getInstalledModelTiers).
Onboarding Step Infrastructure
frontend/src/constants/steps.ts, frontend/src/features/onboardingSlice.ts, frontend/src/components/OnboardingSteps/OnboardingStep.tsx
Adds MODEL_SETUP_STEP, introduces OnboardingStepName typing, and wires the step renderer to include AIModelSetupStep.
AI Model Setup Onboarding Component
frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx
Component added: fetches hardware recommendation, starts tier setup, streams SSE progress, handles complete/error states, and updates preferences.
Settings Model Preferences Panel
frontend/src/pages/SettingsPage/components/UserPreferencesCard.tsx
Fetches model status on mount, computes installed tiers, renders dynamic YOLO tier dropdown with disabled states, and uses toggleGpuAcceleration() for GPU toggle.
OpenAPI & Docs
docs/backend/backend_python/openapi.json, docs/Manual_Setup_Guide.md, docs/Script_Setup_Guide.md
OpenAPI now documents Models endpoints and SetupRequest schema; setup guides updated with Miniconda verification and CPU-only local onnxruntime note.
Postponed Type Annotations
multiple backend modules
Adds from __future__ import annotations in several modules to defer type evaluation (selected schema/utils/model files).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Python, TypeScript/JavaScript, Documentation

Suggested reviewers

  • rahulharpal1603

Poem

🐰 I hop through registry and wire,
I skip the heavy model mire.
Hardware whispers which tier to fetch,
SHA checks pass, no corrupted sketch.
Light installer — users leap higher! ✨

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed All major coding requirements from issue #1260 have been implemented: MODEL_REGISTRY with metadata, lazy ONNX loading for YOLO/FaceNet, hardware detection, models_router with complete endpoints, PyInstaller spec excluding ONNX files, and frontend tier selector with SSE progress streaming.
Out of Scope Changes check ✅ Passed All changes are directly aligned with the app size optimization objectives. Minor refactoring (e.g., from __future__ import annotations) and documentation updates are supporting changes. No unrelated modifications detected.
Title check ✅ Passed The title 'AI Models Based App Size Optimization' directly and accurately summarizes the main objective of the PR—optimizing app size by implementing centralized model management to exclude bundled ONNX files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/routes/folders.py (1)

235-245: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do model preflight synchronously before returning success.

enable_ai_tagging() returns success immediately after scheduling background work, but missing-model failures happen later and are only logged. This can report AI tagging as enabled when processing cannot actually start.

Suggested fix
 def enable_ai_tagging(request: UpdateAITaggingRequest, app_state=Depends(get_state)):
     """Enable AI tagging for multiple folders."""
     try:
         if not request.folder_ids:
             raise ValueError("No folder IDs provided")
 
+        # Validate model prerequisites before mutating state/responding success.
+        ensure_ai_tagging_models()
+
         updated_count = db_enable_ai_tagging_batch(request.folder_ids)
 
         executor: ProcessPoolExecutor = app_state.executor
         executor.submit(post_AI_tagging_enabled_sequence)
As per coding guidelines, "Confirm that the code meets the project's requirements and objectives".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/routes/folders.py` around lines 235 - 245, enable_ai_tagging
currently updates DB and schedules post_AI_tagging_enabled_sequence in the
background but doesn't verify models are available; perform the model preflight
synchronously before returning success by calling the model-check/preflight
routine (or invoking post_AI_tagging_enabled_sequence synchronously or a
dedicated ensure_models_available() helper) and raise/return an error if the
preflight fails, then only call db_enable_ai_tagging_batch and schedule the
background executor (app_state.executor.submit) after the preflight passes;
reference enable_ai_tagging, db_enable_ai_tagging_batch,
post_AI_tagging_enabled_sequence, and app_state.executor when making the change.
🧹 Nitpick comments (17)
backend/app/utils/hardware_detect.py (1)

64-71: ⚡ Quick win

Avoid duplicate GPU probing in one response path.

get_hardware_info() already computes gpu_names, but recommended_tier triggers another detect_physical_gpu() call through detect_hardware_tier(). Reuse the same snapshot to avoid extra subprocess overhead and inconsistent results.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/utils/hardware_detect.py` around lines 64 - 71, get_hardware_info
currently calls detect_physical_gpu() to populate gpu_names but then calls
detect_hardware_tier() which re-probes GPUs; change the flow so
detect_hardware_tier reuses the already-computed gpu_names instead of re-running
detection: update detect_hardware_tier to accept an optional gpu_names/list
parameter (or add a new helper detect_hardware_tier_from_list) and call it from
get_hardware_info with the gpu_names variable to compute recommended_tier,
avoiding the duplicate detect_physical_gpu() subprocess call and ensuring
consistent results.
backend/app/config/settings.py (1)

6-11: ⚡ Quick win

Keep model export path resolution in one place.

This frozen-vs-dev path selection now exists here and again in backend/app/models/model_registry.py. If those branches drift, model downloads and model loads will start targeting different directories. Please derive this from one shared helper/constant instead of duplicating the logic.

As per coding guidelines, "Look for code duplication."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/config/settings.py` around lines 6 - 11, Consolidate the
frozen-vs-dev model path logic into a single helper in settings by exposing a
single computed symbol (e.g., MODEL_EXPORTS_PATH or a function
get_model_exports_path()) that encapsulates the getattr(sys, "frozen", False)
branch; then remove the duplicate branch in backend/app/models/model_registry.py
and import and use that settings symbol instead so both model download and load
paths reference the same source of truth (replace any local path computation in
model_registry.py with MODEL_EXPORTS_PATH/get_model_exports_path()).
backend/app/models/YOLO.py (4)

22-24: ⚡ Quick win

Move threading import to module top-level.

import threading is repeated on every __init__ call. Hoist it to the module imports (alongside onnxruntime, etc.) for PEP 8 compliance and a tiny perf win.

Proposed diff
 import onnxruntime
 import time
+import threading
 import cv2
 import numpy as np
@@
     def __init__(self, path, conf_threshold=0.7, iou_threshold=0.5):
         self.model_path = path
         self.conf_threshold = conf_threshold
         self.iou_threshold = iou_threshold
         self._session = None
-        import threading
         self._lock = threading.Lock()

As per coding guidelines: "Check for PEP 8 violations and Python best practices."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/YOLO.py` around lines 22 - 24, The class __init__
currently performs a local import ("import threading") and then uses
threading.Lock() to set self._lock, causing repeated imports; move the import of
threading to the module top-level (alongside existing imports like onnxruntime)
and remove the in-method import so __init__ only does self._lock =
threading.Lock(), updating YOLO.__init__ accordingly.

33-46: ⚡ Quick win

Hoist imports and tighten model-key matching.

Two issues here:

  1. import os and from app.models.model_registry import MODEL_REGISTRY are inside get_session(); they should be at module top-level (PEP 8).
  2. spec["filename"] in self.model_path does a substring match against the full path. If a registry filename ever becomes a substring of another (e.g., a future yolo.onnx vs yolov8n.onnx), the wrong key wins. Compare against os.path.basename(self.model_path) with equality instead.
Proposed diff
-        with self._lock:
-            if self._session is None:
-                import os
-                from app.models.model_registry import MODEL_REGISTRY
-
-                if not os.path.exists(self.model_path):
-                    model_key = None
-                    for key, spec in MODEL_REGISTRY.items():
-                        if spec["filename"] in self.model_path:
-                            model_key = key
-                            break
-                    model_name = model_key if model_key else os.path.basename(self.model_path)
+        with self._lock:
+            if self._session is None:
+                if not os.path.exists(self.model_path):
+                    basename = os.path.basename(self.model_path)
+                    model_key = next(
+                        (k for k, spec in MODEL_REGISTRY.items() if spec["filename"] == basename),
+                        None,
+                    )
+                    model_name = model_key if model_key else basename

As per coding guidelines: "Check for PEP 8 violations and Python best practices."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/YOLO.py` around lines 33 - 46, Move the imports "import
os" and "from app.models.model_registry import MODEL_REGISTRY" out of
get_session() to the module top-level, and update the model lookup in
get_session() to compare os.path.basename(self.model_path) for equality against
spec["filename"] (instead of using the substring check spec["filename"] in
self.model_path) so MODEL_REGISTRY key selection (model_key) is exact; keep
raising the same RuntimeError with model_name derived from the matched key or
basename when not found.

84-93: 💤 Low value

Lock-free read of self._session in detail getters.

get_input_details() and get_output_details() read self._session directly. Today they're only called from inside get_session() while the lock is held and self._session is non-None, so this is fine. If anyone later calls these from outside get_session() after a close(), they'll hit AttributeError: 'NoneType'. Worth either (a) adding a docstring marking them as private/internal, or (b) routing them through get_session() so they remain robust.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/YOLO.py` around lines 84 - 93, The getters
get_input_details() and get_output_details() read self._session directly and can
raise AttributeError if called after close(); change them to obtain the session
via get_session() (or at minimum assert/check it's not None) before accessing
inputs/outputs: call sess = self.get_session() (or if you prefer an explicit
guard, raise a clear runtime error when self._session is None) and then use
sess.get_inputs()/sess.get_outputs(); update references to self._session in
get_input_details and get_output_details and add a short docstring noting they
rely on an active session.

60-64: 💤 Low value

close() is inconsistent with FaceNet.close().

FaceNet.close() also resets input_tensor_name/output_tensor_name under the lock, but here self.input_names/self.output_names/self.input_shape are left dangling. After close(), a subsequent get_session() will repopulate them via get_input_details()/get_output_details(), so this isn't a correctness bug, but the asymmetry across the two model classes is worth aligning for consistency and easier reasoning.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/YOLO.py` around lines 60 - 64, The YOLO.close() method
currently only clears self._session under the lock, leaving self.input_names,
self.output_names, and self.input_shape set; update YOLO.close() to mirror
FaceNet.close() by resetting self.input_names, self.output_names, and
self.input_shape (inside the same with self._lock block) when clearing the
session so the internal state is consistent and no dangling tensor metadata
remains.
frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx (4)

85-110: ⚡ Quick win

Status check has no abort/timeout; can wedge the spinner indefinitely.

If the backend at ${BACKEND_URL}/models/status is unreachable, the UI sits on "Verifying AI models…" until the OS-level fetch eventually times out. Two recommendations:

  1. Use an AbortController in the cleanup so an unmount/remount cancels the in-flight request.
  2. Add a reasonable timeout (e.g., 5–10s) so the user is moved to the recommendation view even if the backend is slow.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines 85
- 110, The model-status check in useEffect (checkModelStatus) can hang
indefinitely; wrap the fetch with an AbortController and a timeout so the
request is cancelled on cleanup or after ~5–10s: create an AbortController, pass
its signal to fetch in checkModelStatus, start a timeout that calls
controller.abort() after the chosen duration, clear the timeout and call
controller.abort() in the useEffect cleanup, and ensure
setIsCheckingStatus(false) and dispatch(markCompleted(stepIndex)) still run
appropriately when aborted or timed out.

144-156: ⚡ Quick win

Untyped await response.json() for /models/setup.

data is implicitly any here, then data.success / data.detail / data.task_id are accessed without a type. Define a small response interface in frontend/src/types/models.ts (e.g., SetupResponse) and type the parsed JSON to keep the Action's contract explicit.

Proposed diff
-      const response = await fetch(`${BACKEND_URL}/models/setup`, {
+      const response = await fetch(`${BACKEND_URL}/models/setup`, {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
         body: JSON.stringify({ tier: selectedTier }),
       });
-
-      const data = await response.json();
+      const data: SetupResponse = await response.json();

As per coding guidelines: "Avoid 'any', use explicit types."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines
144 - 156, Add an explicit response type for the /models/setup call by creating
a SetupResponse interface in frontend/src/types/models.ts (fields: success:
boolean, detail?: string, task_id?: string) and import it into
AIModelSetupStep.tsx; then type the parsed JSON as SetupResponse (e.g., const
data: SetupResponse = await response.json()) so accesses to data.success /
data.detail / data.task_id are statically typed and conform to the Action
contract.

88-89: ⚖️ Poor tradeoff

Untyped JSON parse for /models/status and /models/hardware.

await res.json() returns any and is then cast positionally via const data: ModelStatusResponse = await res.json(). The cast is silently unsafe — if the backend ever drifts from the contract, runtime access on data.data won't be caught at compile time. Consider adding a lightweight runtime guard (e.g., zod, or a hand-written type predicate) to validate the shape before consumption, especially since getInstalledModelTiers iterates blindly over data.data.

As per coding guidelines: "Avoid 'any', use explicit types."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines 88
- 89, The fetch call uses await res.json() and casts it to ModelStatusResponse
(const data: ModelStatusResponse = await res.json()), which is unsafe; add a
runtime shape check for the /models/status (and similarly /models/hardware)
response before using data.data — e.g., define a lightweight validator (Zod
schema or a hand-written type predicate) that asserts the top-level object has
the expected fields (e.g., data: Array with required properties), parse/validate
the parsed JSON (instead of blind casting), and if validation fails handle the
error path (log/show an error and avoid iterating in getInstalledModelTiers),
ensuring BACKEND_URL fetch handling checks res.ok and uses the validated value
thereafter.

113-136: 💤 Low value

Hardware-recommendation effect re-fires on every transition back to recommendation.

Because viewMode is in the deps and the body branches on viewMode === 'recommendation', this effect re-fetches /models/hardware every time the user clicks "Try Again" from the error view. That's network-cheap, but it also overwrites selectedTier even if the user had already adjusted the dropdown. Consider either (a) only fetching once on mount and storing the result, or (b) preserving the user's last-selected tier when re-fetching.

Also, the cleanup eventSourceRef.current?.close() is shared with the download flow but lives inside this effect's cleanup, which means it runs on every viewMode change rather than only on unmount. Functionally fine (it just runs more often), but moving the SSE cleanup to its own unmount-only effect would make the lifecycle clearer.

As per coding guidelines: "Check for proper React hook dependencies and prevent unnecessary re-renders."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx` around lines
113 - 136, The effect currently refetches on every viewMode change and always
overwrites selectedTier and also closes eventSourceRef on every dep change;
change it so the hardware fetch only runs once (e.g., move the fetch into a
useEffect with empty deps or guard with a fetchedRef) and store the recommended
tier in state via setRecommendedTier without unconditionally calling
setSelectedTier if the user has already changed the dropdown (use a userChanged
flag or only setSelectedTier when selectedTier is undefined/default). Also
extract the SSE cleanup into its own unmount-only useEffect that returns () => {
if (eventSourceRef.current) eventSourceRef.current.close(); } so eventSourceRef
is closed only on unmount and not on every viewMode change; keep references to
useEffect, viewMode, fetch(.../models/hardware), setRecommendedTier,
setSelectedTier, and eventSourceRef to locate the changes.
backend/app/routes/models.py (3)

5-5: 💤 Low value

Unused import Optional.

Optional is imported from typing but never referenced.

Proposed diff
-from typing import Dict, Optional
+from typing import Dict
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/routes/models.py` at line 5, The import list in models.py
includes an unused symbol Optional from typing; remove Optional from the import
statement so only used types remain (e.g., change "from typing import Dict,
Optional" to "from typing import Dict") and run linters/tests to confirm no
other references to Optional exist in functions or classes in this module.

49-68: 💤 Low value

Sync handler does N stat calls on the event loop.

def get_model_status (no async) runs on the threadpool — fine — but if you switch it to async def later, the os.path.exists calls would block the loop. Either keep it sync (current state is OK) or, if changing to async, dispatch the loop via asyncio.to_thread. No change needed today; just be aware as the registry grows.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/routes/models.py` around lines 49 - 68, The current
get_model_status function performs blocking os.path.exists calls in a
synchronous handler; if you ever convert get_model_status to async, avoid
blocking the event loop by running filesystem checks in a thread using
asyncio.to_thread (e.g. wrap get_model_path/key existence checks), or simply
keep the function synchronous; update calls that reference MODEL_REGISTRY and
get_model_path accordingly so all os.path.exists checks are dispatched via
asyncio.to_thread when converting to async.

22-44: ⚖️ Poor tradeoff

download_tasks is shared mutable state without synchronization.

download_tasks is a module-global dict mutated from multiple coroutines (setup_models, start_download_model, event_generator's finally, _cleanup_stale_tasks). FastAPI under a single-worker event loop is OK because dict ops complete in one bytecode step, but if the deployment ever uses uvicorn with --workers > 1, each worker has its own dict and SSE progress for a task_id started on worker A is unfindable from worker B.

If multi-worker deployment is in scope, you'll need a shared backing store (Redis, etc.). For now, document the single-worker assumption and assert it on startup.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/routes/models.py` around lines 22 - 44, The module-global
download_tasks dict (used by download_tasks, _cleanup_stale_tasks, setup_models,
start_download_model, and event_generator) is unsafe for multi-worker
deployments; add a clear runtime assertion and docstring that we require a
single worker: on application startup check common env vars (e.g.
UVICORN_WORKERS, WEB_CONCURRENCY) and/or a configured worker count and raise a
RuntimeError if they indicate >1 (or log and exit), and update the module
docstring to state the single-worker assumption and that a shared backing store
(Redis) is required to support multiple workers in the future.
backend/app/models/FaceNet.py (1)

12-53: ⚖️ Poor tradeoff

Duplicated lazy-session-with-registry-lookup logic across FaceNet and YOLO.

The "fast-path read → lock → existence check → registry filename lookup → InferenceSession build" sequence here is essentially the same code as YOLO.get_session(). Consider extracting a small mixin/helper (e.g., _build_onnx_session(model_path) that raises the friendly RuntimeError and returns the session) so both classes share one implementation. Same comment about substring matching (spec["filename"] in self.model_path) applies — prefer equality against os.path.basename(self.model_path).

As per coding guidelines: "Look for code duplication."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/FaceNet.py` around lines 12 - 53, FaceNet.get_session
duplicates the lazy-session-with-registry-lookup logic also present in
YOLO.get_session; extract a shared helper (e.g., a module-level function or
mixin named _build_onnx_session(model_path)) that encapsulates the "check file
exists → lookup MODEL_REGISTRY → raise friendly RuntimeError → create
onnxruntime.InferenceSession and return it" logic, then have FaceNet.get_session
and YOLO.get_session call that helper and set
input_tensor_name/output_tensor_name from the returned session; also change the
registry lookup from substring matching (spec["filename"] in self.model_path) to
exact equality against os.path.basename(self.model_path) to avoid false matches.
docs/backend/backend_python/openapi.json (1)

1559-1761: ⚖️ Poor tradeoff

/models/* endpoints emit empty response schemas.

All six new /models endpoints have "schema": {} for the 200 body, because the corresponding FastAPI handlers in backend/app/routes/models.py return raw dicts without a response_model=. This degrades the generated OpenAPI quality (no client-side typing, no validation). Worth adding Pydantic response models (e.g., ModelStatusResponse, HardwareInfoResponse, SetupResponse, DeleteModelResponse, DownloadStartResponse) and wiring them with response_model=... so the spec — and downstream tooling — gets full types.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/backend/backend_python/openapi.json` around lines 1559 - 1761, The
/models/* endpoints currently emit empty response schemas because the FastAPI
handlers in backend/app/routes/models.py return raw dicts without
response_model; define Pydantic models (e.g., ModelStatusResponse,
HardwareInfoResponse, SetupResponse, DeleteModelResponse, DownloadStartResponse,
DownloadProgressResponse) in the existing schemas module (or create
backend/app/schemas/models.py) and update each route decorator/endpoint in
models.py to include response_model=<appropriate_model> so FastAPI generates
proper OpenAPI types and validation (match each operationId:
get_model_status_models_status_get,
get_hardware_recommendation_models_hardware_get, setup_models_models_setup_post,
delete_model_models__model_key__delete,
start_download_model_models_download__model_key__post,
download_progress_models_download__task_id__progress_get).
backend/app/utils/model_downloader.py (1)

28-40: 💤 Low value

Use lazy %-style logging, not f-strings.

logger.info(f"Model {model_key} ...") (and several siblings) eagerly formats even when the level is filtered out. Switch to logger.info("Model %s ...", model_key). This is a Python best-practice nit but applies uniformly throughout this new file.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/utils/model_downloader.py` around lines 28 - 40, The logger calls
in this module (e.g., inside ensure_model and any sibling functions) currently
use f-strings which eagerly format messages; replace them with lazy %-style
logging by changing statements like logger.info(f"...{model_key}...") to
logger.info("...%s...", model_key) (and similarly for logger.debug/warning/error
calls), ensuring all log calls in model_downloader.py use format string plus
arguments with the module-level logger name (logger) rather than f-strings so
message interpolation is deferred when the log level is disabled.
frontend/src/types/models.ts (1)

61-71: 💤 Low value

Hardcoded tier sizes can drift from the backend registry.

The ~21 MB / ~75.7 MB / ~161 MB numbers are baked into MODEL_TIER_LABELS here, but the authoritative size_mb lives in backend/app/models/model_registry.py and is also returned via /models/status per ModelEntry. If the backend bumps a tier's model size, this label will silently be wrong. Consider deriving the displayed size from the live /models/status payload (sum of object + face detection model size_mb for the tier), or at minimum adding a comment cross-referencing the registry as the source of truth.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@frontend/src/types/models.ts` around lines 61 - 71, Replace the hardcoded
size strings in MODEL_TIER_LABELS by deriving the displayed size from the
backend /models/status ModelEntry payload (sum of object + face detection
ModelEntry.size_mb for each tier) at runtime: update wherever MODEL_TIER_LABELS
is consumed to compute label text using the live sizes (e.g., "Nano (~{size}
MB)") instead of fixed values, falling back to the existing strings in
MODEL_TIER_DESCRIPTIONS if the API is unavailable; reference the frontend
consumers of MODEL_TIER_LABELS and the backend ModelEntry returned by
/models/status to locate where to fetch and format the sizes and ensure the
formatted string replaces the current hardcoded entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/build-and-release.yml:
- Around line 32-35: After the PyInstaller step that runs "cd backend &&
pyinstaller PictoPy.spec", add a post-build verification step that scans the
built artifact directory backend/dist/PictoPy_Server for any files matching
*.onnx and fails the job if any are found; implement this check as a separate
job step (named similarly to "Verify no .onnx artifacts") that returns
non‑zero/aborts on a match so the workflow fails, and repeat the same
verification step for the equivalent post-build locations referenced around the
other build stages (the sections corresponding to lines ~67-70 and ~100-103) to
ensure no .onnx files are bundled anywhere.

In `@backend/app/models/FaceNet.py`:
- Line 15: The PEP 604 union annotation on self._session
(onnxruntime.InferenceSession | None) requires Python 3.10+; either declare the
minimum Python version in packaging or make annotations forward-compatible: add
requires-python = ">=3.10" to backend/pyproject.toml, or add from __future__
import annotations at the top of backend/app/models/FaceNet.py (and any other
files using the `|` type syntax) so the annotations are treated as strings and
work on older interpreters.

In `@backend/app/routes/models.py`:
- Around line 230-255: The SSE event_generator uses
asyncio.wait_for(entry.queue.get(), timeout=60.0) which makes heartbeats only
every 60s and risks reverse-proxy timeouts; change the heartbeat timeout to
15–20s (e.g., 15s) in event_generator to emit frequent heartbeats and avoid
proxy disconnects, and prevent multiple consumers from draining the same
entry.queue by either rejecting new connections when an active listener exists
(check a flag like entry.listener_active or entry.listener_id before accepting a
second /download/{task_id}/progress request) or create a per-listener queue when
a new client connects and fan out messages from the single producer to each
listener queue; also ensure to clear the listener flag/queue in the finally
block where download_tasks.pop(task_id, None) is executed.
- Around line 88-111: The delete_model handler currently unconditionally removes
the model file; change it to first check a live-session registry (e.g., a global
dict mapping model_key -> list/instance of InferenceSession used by
FaceNet/YOLO) and if any active sessions exist return HTTP 409 CONFLICT with a
clear message (or, if desired, call .close() on those InferenceSession objects
before proceeding); use the MODEL_REGISTRY and get_model_path symbols to locate
the model, and replace the blocking os.remove(path) with await
asyncio.to_thread(os.remove, path) to avoid blocking the event loop; ensure the
error handling still returns 500 on unexpected failures and that the response
messages reference model_key.
- Around line 132-168: Concurrent downloads can race and corrupt the same
model_path because background_setup/ensure_model may run in parallel for the
same model_key; add deduplication by tracking in-flight downloads per model_key
and reusing or serializing them. Modify the logic that iterates
models_to_download / calls ensure_model so it first checks a new in_flight dict
(map model_key -> DownloadTaskEntry or asyncio.Lock) and if an entry exists
returns the existing task_id (or awaits the existing task) instead of spawning a
duplicate task; alternatively create a per-key asyncio.Lock (e.g.,
in_flight_locks: Dict[str, asyncio.Lock]) and wrap the ensure_model call with
async with in_flight_locks[model_key]: to serialize access, and ensure you clean
up the in_flight entry/lock on completion or error; update download_tasks and
DownloadTaskEntry usage so task lookup/cleanup is consistent.

In `@backend/app/utils/model_downloader.py`:
- Around line 86-98: The progress percent calculation can exceed 100 if the
server misreports Content-Length; update the two places where percent is
computed (inside the async download loops that check if progress_callback and
total_size > 0) to clamp the value before calling progress_callback (e.g.,
compute percent = (downloaded / total_size) * 100 then set percent =
min(percent, 100.0)); apply this change to both progress sites in the async
chunk loops that use downloaded, total_size, progress_callback, and response.
- Around line 64-98: The 416 branch currently nests a fresh client.stream inside
an active outer client.stream and also calls response.raise_for_status() after
deriving mode/total_size/downloaded; update the logic in model_downloader.py so
you first call response.raise_for_status() (or branch on response.status_code)
immediately after acquiring the outer response, then if response.status_code ==
416 close/exit the outer stream before starting a new client.stream("GET", url)
for a full download, remove the partial file (model_path) and perform the fresh
download into "wb" while reporting progress; ensure you still compute
total_size/downloaded only after opening the appropriate stream and not while
the outer response remains open.

In `@docs/Manual_Setup_Guide.md`:
- Around line 49-59: Add language identifiers ("bash") to the three fenced code
blocks containing the commands "cd frontend", "npm install", and "npm run tauri
dev" so the markdown shows as ```bash ... ``` instead of ``` ... ```; update
those fenced blocks in Manual_Setup_Guide.md to silence MD040 warnings and
ensure proper syntax highlighting.

In `@docs/Script_Setup_Guide.md`:
- Around line 45-46: Fix the Markdown blockquote by removing the blank line
inside the quoted paragraph that contains "The setup script installs the
CPU-only `onnxruntime` package for local development. The release workflow swaps
in platform-specific GPU providers, and the onboarding recommendation uses
hardware detection (rather than ONNX Runtime provider detection)." — ensure the
blockquote lines are contiguous (no empty line between them) so markdownlint
MD028 is not triggered.

In `@frontend/src/features/onboardingSlice.ts`:
- Around line 47-48: Compute the index once from state.stepStatus
(state.currentStepIndex = state.stepStatus.findIndex(s => !s)); then if that
index is -1 set state.currentStepName to the explicit completed terminal step
(e.g. STEPS.COMPLETED or whatever terminal value your STEPS enum uses) instead
of falling back to STEPS.SERVER_CHECK; otherwise set state.currentStepName =
STEP_NAMES[state.currentStepIndex]. This ensures the fully-completed case is
represented as completed rather than remapped to SERVER_CHECK; update the logic
around state.currentStepIndex and state.currentStepName accordingly.

---

Outside diff comments:
In `@backend/app/routes/folders.py`:
- Around line 235-245: enable_ai_tagging currently updates DB and schedules
post_AI_tagging_enabled_sequence in the background but doesn't verify models are
available; perform the model preflight synchronously before returning success by
calling the model-check/preflight routine (or invoking
post_AI_tagging_enabled_sequence synchronously or a dedicated
ensure_models_available() helper) and raise/return an error if the preflight
fails, then only call db_enable_ai_tagging_batch and schedule the background
executor (app_state.executor.submit) after the preflight passes; reference
enable_ai_tagging, db_enable_ai_tagging_batch, post_AI_tagging_enabled_sequence,
and app_state.executor when making the change.

---

Nitpick comments:
In `@backend/app/config/settings.py`:
- Around line 6-11: Consolidate the frozen-vs-dev model path logic into a single
helper in settings by exposing a single computed symbol (e.g.,
MODEL_EXPORTS_PATH or a function get_model_exports_path()) that encapsulates the
getattr(sys, "frozen", False) branch; then remove the duplicate branch in
backend/app/models/model_registry.py and import and use that settings symbol
instead so both model download and load paths reference the same source of truth
(replace any local path computation in model_registry.py with
MODEL_EXPORTS_PATH/get_model_exports_path()).

In `@backend/app/models/FaceNet.py`:
- Around line 12-53: FaceNet.get_session duplicates the
lazy-session-with-registry-lookup logic also present in YOLO.get_session;
extract a shared helper (e.g., a module-level function or mixin named
_build_onnx_session(model_path)) that encapsulates the "check file exists →
lookup MODEL_REGISTRY → raise friendly RuntimeError → create
onnxruntime.InferenceSession and return it" logic, then have FaceNet.get_session
and YOLO.get_session call that helper and set
input_tensor_name/output_tensor_name from the returned session; also change the
registry lookup from substring matching (spec["filename"] in self.model_path) to
exact equality against os.path.basename(self.model_path) to avoid false matches.

In `@backend/app/models/YOLO.py`:
- Around line 22-24: The class __init__ currently performs a local import
("import threading") and then uses threading.Lock() to set self._lock, causing
repeated imports; move the import of threading to the module top-level
(alongside existing imports like onnxruntime) and remove the in-method import so
__init__ only does self._lock = threading.Lock(), updating YOLO.__init__
accordingly.
- Around line 33-46: Move the imports "import os" and "from
app.models.model_registry import MODEL_REGISTRY" out of get_session() to the
module top-level, and update the model lookup in get_session() to compare
os.path.basename(self.model_path) for equality against spec["filename"] (instead
of using the substring check spec["filename"] in self.model_path) so
MODEL_REGISTRY key selection (model_key) is exact; keep raising the same
RuntimeError with model_name derived from the matched key or basename when not
found.
- Around line 84-93: The getters get_input_details() and get_output_details()
read self._session directly and can raise AttributeError if called after
close(); change them to obtain the session via get_session() (or at minimum
assert/check it's not None) before accessing inputs/outputs: call sess =
self.get_session() (or if you prefer an explicit guard, raise a clear runtime
error when self._session is None) and then use
sess.get_inputs()/sess.get_outputs(); update references to self._session in
get_input_details and get_output_details and add a short docstring noting they
rely on an active session.
- Around line 60-64: The YOLO.close() method currently only clears self._session
under the lock, leaving self.input_names, self.output_names, and
self.input_shape set; update YOLO.close() to mirror FaceNet.close() by resetting
self.input_names, self.output_names, and self.input_shape (inside the same with
self._lock block) when clearing the session so the internal state is consistent
and no dangling tensor metadata remains.

In `@backend/app/routes/models.py`:
- Line 5: The import list in models.py includes an unused symbol Optional from
typing; remove Optional from the import statement so only used types remain
(e.g., change "from typing import Dict, Optional" to "from typing import Dict")
and run linters/tests to confirm no other references to Optional exist in
functions or classes in this module.
- Around line 49-68: The current get_model_status function performs blocking
os.path.exists calls in a synchronous handler; if you ever convert
get_model_status to async, avoid blocking the event loop by running filesystem
checks in a thread using asyncio.to_thread (e.g. wrap get_model_path/key
existence checks), or simply keep the function synchronous; update calls that
reference MODEL_REGISTRY and get_model_path accordingly so all os.path.exists
checks are dispatched via asyncio.to_thread when converting to async.
- Around line 22-44: The module-global download_tasks dict (used by
download_tasks, _cleanup_stale_tasks, setup_models, start_download_model, and
event_generator) is unsafe for multi-worker deployments; add a clear runtime
assertion and docstring that we require a single worker: on application startup
check common env vars (e.g. UVICORN_WORKERS, WEB_CONCURRENCY) and/or a
configured worker count and raise a RuntimeError if they indicate >1 (or log and
exit), and update the module docstring to state the single-worker assumption and
that a shared backing store (Redis) is required to support multiple workers in
the future.

In `@backend/app/utils/hardware_detect.py`:
- Around line 64-71: get_hardware_info currently calls detect_physical_gpu() to
populate gpu_names but then calls detect_hardware_tier() which re-probes GPUs;
change the flow so detect_hardware_tier reuses the already-computed gpu_names
instead of re-running detection: update detect_hardware_tier to accept an
optional gpu_names/list parameter (or add a new helper
detect_hardware_tier_from_list) and call it from get_hardware_info with the
gpu_names variable to compute recommended_tier, avoiding the duplicate
detect_physical_gpu() subprocess call and ensuring consistent results.

In `@backend/app/utils/model_downloader.py`:
- Around line 28-40: The logger calls in this module (e.g., inside ensure_model
and any sibling functions) currently use f-strings which eagerly format
messages; replace them with lazy %-style logging by changing statements like
logger.info(f"...{model_key}...") to logger.info("...%s...", model_key) (and
similarly for logger.debug/warning/error calls), ensuring all log calls in
model_downloader.py use format string plus arguments with the module-level
logger name (logger) rather than f-strings so message interpolation is deferred
when the log level is disabled.

In `@docs/backend/backend_python/openapi.json`:
- Around line 1559-1761: The /models/* endpoints currently emit empty response
schemas because the FastAPI handlers in backend/app/routes/models.py return raw
dicts without response_model; define Pydantic models (e.g., ModelStatusResponse,
HardwareInfoResponse, SetupResponse, DeleteModelResponse, DownloadStartResponse,
DownloadProgressResponse) in the existing schemas module (or create
backend/app/schemas/models.py) and update each route decorator/endpoint in
models.py to include response_model=<appropriate_model> so FastAPI generates
proper OpenAPI types and validation (match each operationId:
get_model_status_models_status_get,
get_hardware_recommendation_models_hardware_get, setup_models_models_setup_post,
delete_model_models__model_key__delete,
start_download_model_models_download__model_key__post,
download_progress_models_download__task_id__progress_get).

In `@frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx`:
- Around line 85-110: The model-status check in useEffect (checkModelStatus) can
hang indefinitely; wrap the fetch with an AbortController and a timeout so the
request is cancelled on cleanup or after ~5–10s: create an AbortController, pass
its signal to fetch in checkModelStatus, start a timeout that calls
controller.abort() after the chosen duration, clear the timeout and call
controller.abort() in the useEffect cleanup, and ensure
setIsCheckingStatus(false) and dispatch(markCompleted(stepIndex)) still run
appropriately when aborted or timed out.
- Around line 144-156: Add an explicit response type for the /models/setup call
by creating a SetupResponse interface in frontend/src/types/models.ts (fields:
success: boolean, detail?: string, task_id?: string) and import it into
AIModelSetupStep.tsx; then type the parsed JSON as SetupResponse (e.g., const
data: SetupResponse = await response.json()) so accesses to data.success /
data.detail / data.task_id are statically typed and conform to the Action
contract.
- Around line 88-89: The fetch call uses await res.json() and casts it to
ModelStatusResponse (const data: ModelStatusResponse = await res.json()), which
is unsafe; add a runtime shape check for the /models/status (and similarly
/models/hardware) response before using data.data — e.g., define a lightweight
validator (Zod schema or a hand-written type predicate) that asserts the
top-level object has the expected fields (e.g., data: Array with required
properties), parse/validate the parsed JSON (instead of blind casting), and if
validation fails handle the error path (log/show an error and avoid iterating in
getInstalledModelTiers), ensuring BACKEND_URL fetch handling checks res.ok and
uses the validated value thereafter.
- Around line 113-136: The effect currently refetches on every viewMode change
and always overwrites selectedTier and also closes eventSourceRef on every dep
change; change it so the hardware fetch only runs once (e.g., move the fetch
into a useEffect with empty deps or guard with a fetchedRef) and store the
recommended tier in state via setRecommendedTier without unconditionally calling
setSelectedTier if the user has already changed the dropdown (use a userChanged
flag or only setSelectedTier when selectedTier is undefined/default). Also
extract the SSE cleanup into its own unmount-only useEffect that returns () => {
if (eventSourceRef.current) eventSourceRef.current.close(); } so eventSourceRef
is closed only on unmount and not on every viewMode change; keep references to
useEffect, viewMode, fetch(.../models/hardware), setRecommendedTier,
setSelectedTier, and eventSourceRef to locate the changes.

In `@frontend/src/types/models.ts`:
- Around line 61-71: Replace the hardcoded size strings in MODEL_TIER_LABELS by
deriving the displayed size from the backend /models/status ModelEntry payload
(sum of object + face detection ModelEntry.size_mb for each tier) at runtime:
update wherever MODEL_TIER_LABELS is consumed to compute label text using the
live sizes (e.g., "Nano (~{size} MB)") instead of fixed values, falling back to
the existing strings in MODEL_TIER_DESCRIPTIONS if the API is unavailable;
reference the frontend consumers of MODEL_TIER_LABELS and the backend ModelEntry
returned by /models/status to locate where to fetch and format the sizes and
ensure the formatted string replaces the current hardcoded entries.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e7009d00-6d7a-4689-9598-f4a537822b68

📥 Commits

Reviewing files that changed from the base of the PR and between 0da2300 and 2cfd343.

📒 Files selected for processing (31)
  • .github/workflows/build-and-release.yml
  • .gitignore
  • backend/.gitignore
  • backend/PictoPy.spec
  • backend/app/config/settings.py
  • backend/app/models/FaceNet.py
  • backend/app/models/ONNX_Exports/FaceNet_128D.onnx
  • backend/app/models/ONNX_Exports/YOLOv11_Medium.onnx
  • backend/app/models/ONNX_Exports/YOLOv11_Medium_Face.onnx
  • backend/app/models/ONNX_Exports/YOLOv11_Nano.onnx
  • backend/app/models/ONNX_Exports/YOLOv11_Nano_Face.onnx
  • backend/app/models/ONNX_Exports/YOLOv11_Small.onnx
  • backend/app/models/ONNX_Exports/YOLOv11_Small_Face.onnx
  • backend/app/models/YOLO.py
  • backend/app/models/model_registry.py
  • backend/app/routes/folders.py
  • backend/app/routes/models.py
  • backend/app/utils/hardware_detect.py
  • backend/app/utils/model_bootstrap.py
  • backend/app/utils/model_downloader.py
  • backend/main.py
  • backend/requirements.txt
  • docs/Manual_Setup_Guide.md
  • docs/Script_Setup_Guide.md
  • docs/backend/backend_python/openapi.json
  • frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx
  • frontend/src/components/OnboardingSteps/OnboardingStep.tsx
  • frontend/src/constants/steps.ts
  • frontend/src/features/onboardingSlice.ts
  • frontend/src/pages/SettingsPage/components/UserPreferencesCard.tsx
  • frontend/src/types/models.ts

Comment thread .github/workflows/build-and-release.yml
Comment thread backend/app/models/FaceNet.py
Comment thread backend/app/routes/models.py Outdated
Comment thread backend/app/routes/models.py
Comment thread backend/app/routes/models.py
Comment thread backend/app/utils/model_downloader.py
Comment thread backend/app/utils/model_downloader.py Outdated
Comment thread docs/Manual_Setup_Guide.md
Comment on lines +45 to +46
> The setup script installs the CPU-only `onnxruntime` package for local development. The release workflow swaps in platform-specific GPU providers, and the onboarding recommendation uses hardware detection (rather than ONNX Runtime provider detection).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix blockquote formatting to avoid MD028 warning.

There is a blank line inside a blockquote sequence; markdownlint will flag this.

Suggested fix
-   > The setup script installs the CPU-only `onnxruntime` package for local development. The release workflow swaps in platform-specific GPU providers, and the onboarding recommendation uses hardware detection (rather than ONNX Runtime provider detection).
-
+   > The setup script installs the CPU-only `onnxruntime` package for local development. The release workflow swaps in platform-specific GPU providers, and the onboarding recommendation uses hardware detection (rather than ONNX Runtime provider detection).
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 46-46: Blank line inside blockquote

(MD028, no-blanks-blockquote)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/Script_Setup_Guide.md` around lines 45 - 46, Fix the Markdown blockquote
by removing the blank line inside the quoted paragraph that contains "The setup
script installs the CPU-only `onnxruntime` package for local development. The
release workflow swaps in platform-specific GPU providers, and the onboarding
recommendation uses hardware detection (rather than ONNX Runtime provider
detection)." — ensure the blockquote lines are contiguous (no empty line between
them) so markdownlint MD028 is not triggered.

Comment thread frontend/src/features/onboardingSlice.ts Outdated
- Add an explicit artifact check that fails if any .onnx file is still bundled.
- Use from __future__ import annotations for Python <3.9 compatibility.
- DELETE /{model_key} delete guard against active sessions.
- De-duplication for concurrent /setup or /download/{model_key} for the same model.
- SSE heartbeat interval.
- Nested HTTP streams on 416 raise_for_status() ordering.
- Fix divide-by-zero edge case.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/models/FaceNet.py`:
- Around line 28-66: get_session() can return a session that later has its
input/output names cleared by close(); capture the session and its tensor names
while holding self._lock and return them together (e.g., return (session,
input_name, output_name)) or snapshot self.input_tensor_name and
self.output_tensor_name into locals before releasing the lock, then update
get_embedding() to call get_session() and use the returned/snapshotted names
instead of reading attributes after get_session() returns; reference
get_session, get_embedding, self._lock, self.input_tensor_name and
self.output_tensor_name when making this change.

In `@backend/app/routes/models.py`:
- Around line 104-306: Add async tests covering delete_model (conflict when
get_active_session_count > 0 and successful deletion), per-model download
serialization using _get_model_download_lock and ensure_model to assert
concurrent downloads for same model are serialized, SSE single-listener behavior
and reconnect/cleanup via download_progress and DownloadTaskEntry/download_tasks
ensuring only one active listener is allowed and tasks are removed after
completion; implement these tests in the async test suite (use
TestClient/AsyncClient) to start background tasks, push progress messages to the
queue, simulate timeouts/heartbeats, assert HTTP 409 for second listener and
that task entries are popped on normal completion and remain when cancelled.
- Around line 115-128: The delete path uses a snapshot via
get_active_session_count and then removes the file (await
asyncio.to_thread(os.remove, path)) which races with concurrent session creation
(e.g., FaceNet.get_session / YOLO.get_session); instead, acquire the same
per-model lifecycle lock used by session creation before checking existence and
deleting: locate the lock manager used by session creation, wrap the sequence in
its lock (get_model_path -> check os.path.exists -> remove) so no new
InferenceSession can be created while deletion proceeds, and release the lock
afterwards.

In `@backend/PictoPy.spec`:
- Around line 8-29: Remove the broad Python package copy from PyInstaller spec
by deleting or changing the datas definition and its application: remove the
datas entry that sets datas=[('app','app')] and stop assigning a.datas =
exclude_models(a.datas); instead, restrict datas to only explicit non-Python
runtime assets (e.g., config or data file tuples) and if needed keep a small
whitelist rather than copying the entire 'app' package; update or remove the
exclude_models(...) helper (function exclude_models and its call a.datas =
exclude_models(a.datas)) so PyInstaller no longer redundantly bundles the 'app'
package.

In `@frontend/src/features/onboardingSlice.ts`:
- Line 8: The code relies on STEP_NAMES[STEP_NAMES.length - 1] to derive
TERMINAL_STEP_NAME which implicitly depends on the insertion order of STEPS;
change this to use an explicit terminal identifier (e.g., add and reference a
named constant like STEPS.TERMINAL or STEPS.UPDATE_STEP) or add a runtime
assertion that STEP_NAMES[last] equals the expected terminal key and document
the ordering requirement; update all uses of TERMINAL_STEP_NAME accordingly and
reference the symbols TERMINAL_STEP_NAME, STEP_NAMES and STEPS when making the
change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f74877e3-54f8-4fe8-b742-fcbb5165b33d

📥 Commits

Reviewing files that changed from the base of the PR and between 2cfd343 and b011a44.

📒 Files selected for processing (20)
  • .github/workflows/build-and-release.yml
  • backend/PictoPy.spec
  • backend/app/models/FaceNet.py
  • backend/app/models/ObjectClassifier.py
  • backend/app/models/YOLO.py
  • backend/app/models/model_registry.py
  • backend/app/models/session_registry.py
  • backend/app/routes/models.py
  • backend/app/schemas/images.py
  • backend/app/schemas/test.py
  • backend/app/utils/hardware_detect.py
  • backend/app/utils/images.py
  • backend/app/utils/model_downloader.py
  • docs/Manual_Setup_Guide.md
  • docs/Script_Setup_Guide.md
  • frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx
  • frontend/src/components/OnboardingSteps/OnboardingStep.tsx
  • frontend/src/features/onboardingSlice.ts
  • frontend/src/pages/SettingsPage/components/UserPreferencesCard.tsx
  • frontend/src/types/models.ts
✅ Files skipped from review due to trivial changes (2)
  • backend/app/models/ObjectClassifier.py
  • backend/app/utils/images.py
🚧 Files skipped from review as they are similar to previous changes (9)
  • backend/app/models/model_registry.py
  • frontend/src/types/models.ts
  • backend/app/models/YOLO.py
  • frontend/src/pages/SettingsPage/components/UserPreferencesCard.tsx
  • backend/app/utils/hardware_detect.py
  • backend/app/utils/model_downloader.py
  • .github/workflows/build-and-release.yml
  • frontend/src/components/OnboardingSteps/AIModelSetupStep.tsx
  • frontend/src/components/OnboardingSteps/OnboardingStep.tsx

Comment thread backend/app/models/FaceNet.py Outdated
Comment thread backend/app/routes/models.py
Comment thread backend/app/routes/models.py Outdated
Comment thread backend/PictoPy.spec
Comment thread frontend/src/features/onboardingSlice.ts
- snapshot tensor names to avoid close() invalidating session.run() inputs.
- lock model deletion to prevent delete-while-in-use races
@rohan-pandeyy rohan-pandeyy changed the title Feat/models based app size optimization AI Models Based App Size Optimization May 11, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
backend/app/models/session_registry.py (3)

1-7: ⚡ Quick win

Add module docstring to document purpose and threading model.

The module lacks a docstring explaining its role in preventing model deletion races and its thread-safety guarantees. This would help future maintainers understand the concurrency model and the relationship between active sessions and pending deletions.

📝 Suggested module docstring
 from __future__ import annotations
 
+"""
+Thread-safe registry for tracking active ONNX model sessions and pending deletions.
+
+Prevents race conditions where a model is deleted while sessions are still active.
+All operations are guarded by a module-level lock to ensure atomicity.
+"""
+
 import threading
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/session_registry.py` around lines 1 - 7, Add a
module-level docstring at the top of session_registry.py that states the
module's purpose (tracking active sessions to prevent model deletion races),
describes the threading model and concurrency guarantees, and documents the
shared globals _active_sessions, _models_pending_deletion and the use of
_registry_lock to protect them; keep it concise, mention that all access to
_active_sessions and _models_pending_deletion must be performed while holding
_registry_lock (or via provided helper functions) and note expected types
(dict[str,int], set[str]) for maintainers.

10-44: ⚡ Quick win

Add docstrings to all public functions.

All five public functions lack docstrings. Given that this module provides a critical concurrency primitive, documenting the expected behavior, parameters, return values, and exceptions would significantly improve maintainability.

📝 Suggested function docstrings
 def mark_model_session_active(model_key: str) -> None:
+    """
+    Mark a model session as active, incrementing its reference count.
+    
+    Args:
+        model_key: Unique identifier for the model.
+        
+    Raises:
+        RuntimeError: If the model is marked for deletion.
+    """
     with _registry_lock:
 def mark_model_session_inactive(model_key: str) -> None:
+    """
+    Mark a model session as inactive, decrementing its reference count.
+    
+    Removes the model from the active sessions map when count reaches zero.
+    Safe to call even if count is already zero (no-op).
+    
+    Args:
+        model_key: Unique identifier for the model.
+    """
     with _registry_lock:
 def get_active_session_count(model_key: str) -> int:
+    """
+    Get the current count of active sessions for a model.
+    
+    Args:
+        model_key: Unique identifier for the model.
+        
+    Returns:
+        Number of active sessions, or 0 if none.
+    """
     with _registry_lock:
 def try_mark_model_for_deletion(model_key: str) -> int | None:
+    """
+    Attempt to mark a model for deletion.
+    
+    Args:
+        model_key: Unique identifier for the model.
+        
+    Returns:
+        None if successfully marked for deletion (no active sessions).
+        Active session count if deletion cannot proceed (sessions exist).
+    """
     with _registry_lock:
 def release_model_deletion_mark(model_key: str) -> None:
+    """
+    Remove the deletion mark from a model.
+    
+    Safe to call even if the model is not marked for deletion.
+    
+    Args:
+        model_key: Unique identifier for the model.
+    """
     with _registry_lock:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/session_registry.py` around lines 10 - 44, Add
comprehensive docstrings to each public function (mark_model_session_active,
mark_model_session_inactive, get_active_session_count,
try_mark_model_for_deletion, release_model_deletion_mark) describing
purpose/behavior, parameters and their types, return values and types, raised
exceptions (e.g., RuntimeError when attempting to start a session for a model
pending deletion), and concurrency/thread-safety guarantees (that they are
protected by _registry_lock). Keep them concise, follow project docstring style
(short summary + details), and ensure they mention side effects on
_active_sessions and _models_pending_deletion where applicable.

19-25: 💤 Low value

Consider documenting or validating the edge case when count is 0.

The function silently handles the case where mark_model_session_inactive is called when no sessions are active (count == 0). While this is safe due to the defensive pop(model_key, None), it may indicate a programming error in the calling code (mismatched active/inactive calls).

Consider either:

  1. Adding a comment explaining that count <= 1 handles both "last session" (count==1) and "defensive no-op" (count==0) cases, or
  2. Adding validation that raises an exception if count is 0, which would help catch bugs in session lifecycle management.

The current implementation is safe but may hide bugs in FaceNet/YOLO close() logic if those are called multiple times.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/session_registry.py` around lines 19 - 25,
mark_model_session_inactive currently treats count <= 1 the same and silently
no-ops when count == 0; update the function (mark_model_session_inactive) to
detect the count==0 case and raise a clear exception (e.g.,
ValueError("mark_model_session_inactive called with no active sessions for
{model_key}")) to catch mismatched open/close calls, while keeping the existing
lock (_registry_lock) and decrement/pop logic for count==1; also add a one-line
comment above the check explaining that count==1 removes the last session and
count==0 is considered a programming error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/app/models/session_registry.py`:
- Around line 1-44: Add comprehensive unit tests for session_registry covering
mark_model_session_active, mark_model_session_inactive,
get_active_session_count, try_mark_model_for_deletion, and
release_model_deletion_mark: test basic increment/decrement semantics (multiple
mark_model_session_active increases count, mark_model_session_inactive decreases
and removes key when zero, extra inactive calls are no-ops), test that
try_mark_model_for_deletion returns active count when sessions exist and adds
model to _models_pending_deletion when none (and release_model_deletion_mark
removes that mark), and add concurrency tests using threading to spawn many
concurrent mark_model_session_active/mark_model_session_inactive calls to assert
final counts and that try_mark_model_for_deletion correctly detects active
sessions under race; include an explicit test that attempting to
mark_model_session_active while a model is pending deletion raises RuntimeError.
Ensure tests reset global registry state between cases.

---

Nitpick comments:
In `@backend/app/models/session_registry.py`:
- Around line 1-7: Add a module-level docstring at the top of
session_registry.py that states the module's purpose (tracking active sessions
to prevent model deletion races), describes the threading model and concurrency
guarantees, and documents the shared globals _active_sessions,
_models_pending_deletion and the use of _registry_lock to protect them; keep it
concise, mention that all access to _active_sessions and
_models_pending_deletion must be performed while holding _registry_lock (or via
provided helper functions) and note expected types (dict[str,int], set[str]) for
maintainers.
- Around line 10-44: Add comprehensive docstrings to each public function
(mark_model_session_active, mark_model_session_inactive,
get_active_session_count, try_mark_model_for_deletion,
release_model_deletion_mark) describing purpose/behavior, parameters and their
types, return values and types, raised exceptions (e.g., RuntimeError when
attempting to start a session for a model pending deletion), and
concurrency/thread-safety guarantees (that they are protected by
_registry_lock). Keep them concise, follow project docstring style (short
summary + details), and ensure they mention side effects on _active_sessions and
_models_pending_deletion where applicable.
- Around line 19-25: mark_model_session_inactive currently treats count <= 1 the
same and silently no-ops when count == 0; update the function
(mark_model_session_inactive) to detect the count==0 case and raise a clear
exception (e.g., ValueError("mark_model_session_inactive called with no active
sessions for {model_key}")) to catch mismatched open/close calls, while keeping
the existing lock (_registry_lock) and decrement/pop logic for count==1; also
add a one-line comment above the check explaining that count==1 removes the last
session and count==0 is considered a programming error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d8a26501-adcd-4eb2-a57a-380e45098f90

📥 Commits

Reviewing files that changed from the base of the PR and between b011a44 and e091507.

📒 Files selected for processing (6)
  • backend/app/models/FaceNet.py
  • backend/app/models/YOLO.py
  • backend/app/models/model_registry.py
  • backend/app/models/session_registry.py
  • backend/app/routes/models.py
  • backend/app/utils/model_bootstrap.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • backend/app/utils/model_bootstrap.py
  • backend/app/models/FaceNet.py
  • backend/app/models/model_registry.py
  • backend/app/models/YOLO.py
  • backend/app/routes/models.py

Comment on lines +1 to +44
from __future__ import annotations

import threading

_active_sessions: dict[str, int] = {}
_models_pending_deletion: set[str] = set()
_registry_lock = threading.Lock()


def mark_model_session_active(model_key: str) -> None:
with _registry_lock:
if model_key in _models_pending_deletion:
raise RuntimeError(
f"Model '{model_key}' is being deleted; cannot start a new session."
)
_active_sessions[model_key] = _active_sessions.get(model_key, 0) + 1


def mark_model_session_inactive(model_key: str) -> None:
with _registry_lock:
count = _active_sessions.get(model_key, 0)
if count <= 1:
_active_sessions.pop(model_key, None)
return
_active_sessions[model_key] = count - 1


def get_active_session_count(model_key: str) -> int:
with _registry_lock:
return _active_sessions.get(model_key, 0)


def try_mark_model_for_deletion(model_key: str) -> int | None:
with _registry_lock:
count = _active_sessions.get(model_key, 0)
if count > 0:
return count
_models_pending_deletion.add(model_key)
return None


def release_model_deletion_mark(model_key: str) -> None:
with _registry_lock:
_models_pending_deletion.discard(model_key)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for test files covering session_registry.py

# Search for test files that import or reference session_registry
rg -l 'session_registry' --glob '*test*.py'

# Also check for any test files in backend/tests or backend/app/tests
fd -t f 'test.*\.py$' backend/

Repository: AOSSIE-Org/PictoPy

Length of output: 293


🏁 Script executed:

# Search in test files for session_registry references
rg 'session_registry|_active_sessions|mark_model_session' backend/tests/ backend/test.py --type py

# Also check if session_registry is imported anywhere in the codebase
rg 'from.*session_registry|import.*session_registry' --type py

# Check what imports this module
rg 'mark_model_session_active|mark_model_session_inactive|get_active_session_count|try_mark_model_for_deletion|release_model_deletion_mark' --type py

Repository: AOSSIE-Org/PictoPy

Length of output: 1669


🏁 Script executed:

# Check if YOLO.py or FaceNet.py are tested
rg 'test.*yolo|test.*facenet|YOLO|FaceNet' backend/tests/ -i

# Check conftest.py and test files to see what they test
cat -n backend/tests/conftest.py

# List what each test file tests
head -20 backend/tests/test_albums.py
head -20 backend/tests/test_face_clusters.py
head -20 backend/tests/test_folders.py
head -20 backend/tests/test_user_preferences.py

Repository: AOSSIE-Org/PictoPy

Length of output: 7301


Add comprehensive tests for session_registry concurrency logic.

This module implements critical race condition prevention for model lifecycle management and is actively used in production (imported in YOLO.py, FaceNet.py, and routes/models.py). Currently, it has no test coverage. Tests should verify:

  • Correct increment/decrement behavior of session counts
  • Proper rejection of sessions when marked for deletion
  • Thread-safety under concurrent access
  • Edge cases (e.g., multiple inactive calls, deletion attempts with active sessions)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/app/models/session_registry.py` around lines 1 - 44, Add
comprehensive unit tests for session_registry covering
mark_model_session_active, mark_model_session_inactive,
get_active_session_count, try_mark_model_for_deletion, and
release_model_deletion_mark: test basic increment/decrement semantics (multiple
mark_model_session_active increases count, mark_model_session_inactive decreases
and removes key when zero, extra inactive calls are no-ops), test that
try_mark_model_for_deletion returns active count when sessions exist and adds
model to _models_pending_deletion when none (and release_model_deletion_mark
removes that mark), and add concurrency tests using threading to spawn many
concurrent mark_model_session_active/mark_model_session_inactive calls to assert
final counts and that try_mark_model_for_deletion correctly detects active
sessions under race; include an explicit test that attempting to
mark_model_session_active while a model is pending deletion raises RuntimeError.
Ensure tests reset global registry state between cases.

Copy link
Copy Markdown
Contributor

@rahulharpal1603 rahulharpal1603 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rahulharpal1603 rahulharpal1603 merged commit f8ec976 into AOSSIE-Org:main May 11, 2026
11 checks passed
@rohan-pandeyy rohan-pandeyy deleted the feat/models-based-app-size-optimization branch May 12, 2026 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Application Size Optimization (Modular AI Model Management)

2 participants