Cu-86d2e1ka4: simplify inference module for image embedding generation only #262
Conversation
📝 Walkthrough
Removed cloud storage and model-inference infrastructure; streamlined the app to image-embedding only, added a batch embeddings endpoint and a shared base64 decoding helper, removed the image-clarity/preprocessing/model-repo services, and updated DI/startup to preload the image-embedding singleton.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client as Client
participant API as FastAPI App
participant Ctrl as InferenceController
participant Embed as ImageEmbedding
participant DI as InferenceAppContainer
Note over DI,Embed: App startup
DI->>Embed: preload image_embedding()
Note over DI,Embed: models/embedders initialized
Client->>API: POST /v1/query/embeddings/batch (image_batch)
API->>Ctrl: forward request payload
Ctrl->>Ctrl: extract_decoded_image_data on each entry
Ctrl->>Embed: query_embed_batch(list[bytes])
Embed->>Embed: decode images, batch-process, normalize
Embed-->>Ctrl: batched embeddings
Ctrl-->>API: 200 OK with embeddings
API-->>Client: response
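For orientation, a client call to the new batch endpoint might look roughly like the sketch below; the `image_batch` field name, host, and response shape are assumptions read off the diagram above, not confirmed against the request schema.

# Hypothetical client sketch; endpoint host, field names, and response shape are assumptions.
import base64
import requests

def to_base64(path: str) -> str:
    with open(path, 'rb') as f:
        return base64.b64encode(f.read()).decode('ascii')

payload = {'image_batch': [to_base64('a.jpg'), to_base64('b.jpg')]}
resp = requests.post('http://localhost:8000/v1/query/embeddings/batch', json=payload)
resp.raise_for_status()
embeddings = resp.json()  # batched embeddings as wrapped by the response formatter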
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
wavefront/server/apps/inference_app/inference_app/service/image_embedding.py (1)
17-26: ⚠️ Potential issue | 🔴 Critical
AttributeError: class references non-existent attributes `self.CLIP_MODEL_NAME` and `self.DINO_MODEL_NAME`.
The model name constants were moved to module level (lines 8-9), but the `__init__` method still references them via `self.CLIP_MODEL_NAME` and `self.DINO_MODEL_NAME`. This will raise `AttributeError` at runtime.
🐛 Proposed fix
-        self.clip_processor = CLIPProcessor.from_pretrained(self.CLIP_MODEL_NAME)
-        self.clip_model = CLIPModel.from_pretrained(self.CLIP_MODEL_NAME).to(
+        self.clip_processor = CLIPProcessor.from_pretrained(CLIP_MODEL_NAME)
+        self.clip_model = CLIPModel.from_pretrained(CLIP_MODEL_NAME).to(
             self.device
         )
         self.clip_model.eval()
-        self.dino_processor = AutoImageProcessor.from_pretrained(self.DINO_MODEL_NAME)
+        self.dino_processor = AutoImageProcessor.from_pretrained(DINO_MODEL_NAME)
         self.dino_model = AutoModel.from_pretrained(
-            self.DINO_MODEL_NAME, trust_remote_code=True
+            DINO_MODEL_NAME, trust_remote_code=True
         ).to(self.device)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@wavefront/server/apps/inference_app/inference_app/service/image_embedding.py` around lines 17 - 26, The __init__ references non-existent instance attributes self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME; update the CLIP and DINO model construction to use the module-level constants CLIP_MODEL_NAME and DINO_MODEL_NAME instead. Specifically, change the calls that create self.clip_processor (CLIPProcessor.from_pretrained), self.clip_model (CLIPModel.from_pretrained), self.dino_processor (AutoImageProcessor.from_pretrained) and self.dino_model (AutoModel.from_pretrained) to pass the module-level names CLIP_MODEL_NAME and DINO_MODEL_NAME rather than self.CLIP_MODEL_NAME/self.DINO_MODEL_NAME so the models initialize without raising AttributeError.
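For context, the module-level constants the fix relies on (lines 8-9, not shown in this excerpt) would look something like the following; the model identifiers here are placeholders, not values taken from the repository.

CLIP_MODEL_NAME = "openai/clip-vit-base-patch32"  # placeholder identifier; actual value not shown in the diff
DINO_MODEL_NAME = "facebook/dinov2-base"          # placeholder identifier; actual value not shown in the diff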
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`:
- Around line 72-75: The helper extract_decoded_image_data currently calls
base64.b64decode which can raise binascii.Error on malformed input; wrap the
decode call in a try/except that catches binascii.Error (import binascii) and
re-raise a clear ValueError or custom exception (e.g., "Invalid base64 image
data") so caller endpoints can return a 4xx response instead of an unhandled
500; update any call sites (the single-image and multi-image endpoints) to catch
that ValueError and convert it to an appropriate HTTP error response.
- Around line 51-69: The batch endpoint image_embedding_batch currently always
returns HTTP 200 and doesn't handle decode errors; update image_embedding_batch
to (1) catch binascii.Error (raised by extract_decoded_image_data) and return an
HTTP 400 using response_formatter.buildErrorResponse or an appropriate error
response, and (2) after calling image_embedding_service.query_embed_batch, check
if embeddings is empty or falsy and return HTTP 400 (mirroring the single-image
flow) instead of always returning 200; reference extract_decoded_image_data,
query_embed_batch, image_embedding_batch, and
response_formatter.buildSuccessResponse to locate where to add the try/except
and the empty-result conditional.
In
`@wavefront/server/apps/inference_app/inference_app/inference_app_container.py`:
- Around line 6-9: The failure is caused by server.py calling
InferenceAppContainer(cache_manager=None) but the container no longer declares a
cache_manager provider; restore compatibility by adding a dependency provider
named cache_manager to InferenceAppContainer (e.g., add a line cache_manager =
providers.Dependency() in the class) so the existing instantiation with
cache_manager=None succeeds, or alternatively remove the cache_manager argument
from the call site in server.py; update the symbol InferenceAppContainer to
include cache_manager = providers.Dependency() if you choose the former.
In
`@wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`:
- Around line 92-100: The current loop that opens images (the block that calls
Image.open on items from image_batch and appends to images) silently returns []
on any decode error, making failures indistinguishable from an empty input;
change this to raise a descriptive exception instead: when an exception occurs
while opening an image (use the same except block that catches Exception as e),
log the error and then raise a ValueError (or a custom DecodeError) that
includes the failing index (idx) and the original exception (e) so callers can
distinguish a decode failure from an empty batch and handle it explicitly.
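As a minimal sketch of that change, assuming `image_batch` is a `list[bytes]` and PIL is used for decoding (the helper name and method boundaries here are illustrative, not the actual code):

# Illustrative sketch only; assumes list[bytes] input and PIL decoding.
import io
import logging

from PIL import Image

logger = logging.getLogger(__name__)

def open_images(image_batch: list[bytes]) -> list[Image.Image]:
    images = []
    for idx, data in enumerate(image_batch):
        try:
            images.append(Image.open(io.BytesIO(data)).convert('RGB'))
        except Exception as e:
            # Raise instead of returning [] so a decode failure is
            # distinguishable from an empty batch.
            logger.error('Failed to decode image at index %d: %s', idx, e)
            raise ValueError(f'Failed to decode image at index {idx}') from e
    return images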
---
Outside diff comments:
In
`@wavefront/server/apps/inference_app/inference_app/service/image_embedding.py`:
- Around line 17-26: The __init__ references non-existent instance attributes
self.CLIP_MODEL_NAME and self.DINO_MODEL_NAME; update the CLIP and DINO model
construction to use the module-level constants CLIP_MODEL_NAME and
DINO_MODEL_NAME instead. Specifically, change the calls that create
self.clip_processor (CLIPProcessor.from_pretrained), self.clip_model
(CLIPModel.from_pretrained), self.dino_processor
(AutoImageProcessor.from_pretrained) and self.dino_model
(AutoModel.from_pretrained) to pass the module-level names CLIP_MODEL_NAME and
DINO_MODEL_NAME rather than self.CLIP_MODEL_NAME/self.DINO_MODEL_NAME so the
models initialize without raising AttributeError.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3329984f-e9a3-4146-8a0d-c1c121a392b4
📒 Files selected for processing (7)
- wavefront/server/apps/inference_app/inference_app/config.ini
- wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
- wavefront/server/apps/inference_app/inference_app/inference_app_container.py
- wavefront/server/apps/inference_app/inference_app/service/image_analyser.py
- wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
- wavefront/server/apps/inference_app/inference_app/service/model_inference.py
- wavefront/server/apps/inference_app/inference_app/service/model_repository.py
💤 Files with no reviewable changes (4)
- wavefront/server/apps/inference_app/inference_app/service/image_analyser.py
- wavefront/server/apps/inference_app/inference_app/config.ini
- wavefront/server/apps/inference_app/inference_app/service/model_repository.py
- wavefront/server/apps/inference_app/inference_app/service/model_inference.py
def extract_decoded_image_data(image_data: str) -> bytes:
    parts = image_data.split(',')
    base64_data = parts[1] if len(parts) == 2 else parts[0]
    return base64.b64decode(base64_data)
extract_decoded_image_data can raise unhandled binascii.Error on malformed base64.
base64.b64decode raises binascii.Error for invalid base64 input. This function is called directly without try/except in both endpoints. For the single-image endpoint (line 36), this would cause an HTTP 500. Consider adding error handling here or at call sites.
💡 Option: Add validation with a clearer error
+import binascii
+
+class InvalidImageDataError(Exception):
+    pass
+
 def extract_decoded_image_data(image_data: str) -> bytes:
     parts = image_data.split(',')
     base64_data = parts[1] if len(parts) == 2 else parts[0]
-    return base64.b64decode(base64_data)
+    try:
+        return base64.b64decode(base64_data)
+    except binascii.Error as e:
+        raise InvalidImageDataError('Invalid base64 encoded image data') from e

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
import binascii


class InvalidImageDataError(Exception):
    pass


def extract_decoded_image_data(image_data: str) -> bytes:
    parts = image_data.split(',')
    base64_data = parts[1] if len(parts) == 2 else parts[0]
    try:
        return base64.b64decode(base64_data)
    except binascii.Error as e:
        raise InvalidImageDataError('Invalid base64 encoded image data') from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`
around lines 72 - 75, The helper extract_decoded_image_data currently calls
base64.b64decode which can raise binascii.Error on malformed input; wrap the
decode call in a try/except that catches binascii.Error (import binascii) and
re-raise a clear ValueError or custom exception (e.g., "Invalid base64 image
data") so caller endpoints can return a 4xx response instead of an unhandled
500; update any call sites (the single-image and multi-image endpoints) to catch
that ValueError and convert it to an appropriate HTTP error response.
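If the `InvalidImageDataError` variant above is adopted, a call site could translate it into a 400 along these lines; the dependency-injection decorators are omitted and the formatter method signatures are assumed from the names used elsewhere in this review, so treat this as a sketch rather than the actual endpoint:

# Sketch of one call site; signatures simplified, formatter methods assumed.
from fastapi import status
from fastapi.responses import JSONResponse

async def image_embedding(payload, response_formatter, image_embedding_service):
    try:
        image_data = extract_decoded_image_data(payload.image_data)
    except InvalidImageDataError as e:
        return JSONResponse(
            status_code=status.HTTP_400_BAD_REQUEST,
            content=response_formatter.buildErrorResponse(str(e)),
        )
    embeddings = image_embedding_service.query_embed(image_data)
    return response_formatter.buildSuccessResponse(embeddings)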
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`:
- Around line 63-74: The batch endpoint currently only catches binascii.Error
when decoding images but not ValueError raised by
image_embedding_service.query_embed_batch (per image_embedding.py) when PIL
fails to decode a corrupted image; update the try/except around creating and
embedding the image batch to also catch ValueError (in addition to
binascii.Error) and return the same JSONResponse with status
HTTP_400_BAD_REQUEST and the 'Invalid base64 image data in batch' error via
response_formatter.buildErrorResponse so corrupted-but-base64 images produce 400
instead of 500; you should reference extract_decoded_image_data,
payload.image_batch, and image_embedding_service.query_embed_batch when locating
the code to change; see the sketch after this list.
- Around line 36-38: The single-image endpoint calls
extract_decoded_image_data(payload.image_data) and then
image_embedding_service.query_embed(image_data) without handling malformed
base64; wrap the call to extract_decoded_image_data in a try/except that catches
binascii.Error (same as the batch endpoint) and convert it into a user-facing
400 response (e.g., raise HTTPException(status_code=400, detail="Malformed
base64 image data") or return the same error object used by the batch path) so
malformed inputs return 400 instead of propagating to the global 500 handler.
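A rough sketch of the batch endpoint with both failure modes from the first comment above handled; DI decorators are omitted and the helper/formatter names are taken from this review rather than verified against the code:

# Sketch only; assumes extract_decoded_image_data and the formatter methods named in this review.
import binascii

from fastapi import status
from fastapi.responses import JSONResponse

async def image_embedding_batch(payload, response_formatter, image_embedding_service):
    try:
        image_batch = [extract_decoded_image_data(item) for item in payload.image_batch]
        embeddings = image_embedding_service.query_embed_batch(image_batch)
    except (binascii.Error, ValueError):
        # Covers malformed base64 as well as base64-valid but undecodable images.
        return JSONResponse(
            status_code=status.HTTP_400_BAD_REQUEST,
            content=response_formatter.buildErrorResponse('Invalid base64 image data in batch'),
        )
    if embeddings is None or len(embeddings) == 0:
        # Mirror the single-image flow: an empty result is a 400, not a silent 200.
        return JSONResponse(
            status_code=status.HTTP_400_BAD_REQUEST,
            content=response_formatter.buildErrorResponse('No embeddings generated'),
        )
    return response_formatter.buildSuccessResponse(embeddings)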
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7cd5d164-49ab-450d-8fc8-9c2d0b8b4232
⛔ Files ignored due to path filters (1)
wavefront/server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
- wavefront/server/apps/floconsole/pyproject.toml
- wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py
- wavefront/server/apps/inference_app/inference_app/server.py
- wavefront/server/apps/inference_app/inference_app/service/image_embedding.py
- wavefront/server/apps/inference_app/pyproject.toml
- wavefront/server/modules/insights_module/pyproject.toml
✅ Files skipped from review due to trivial changes (3)
- wavefront/server/modules/insights_module/pyproject.toml
- wavefront/server/apps/floconsole/pyproject.toml
- wavefront/server/apps/inference_app/pyproject.toml
# 1. Decode Base64 string
image_data = extract_decoded_image_data(payload.image_data)
embeddings = image_embedding_service.query_embed(image_data)
Single endpoint missing binascii.Error handling.
The batch endpoint (lines 63-73) wraps the decode in a try/except for binascii.Error, but the single endpoint calls extract_decoded_image_data without similar protection. Malformed base64 input will propagate to the global exception handler and return HTTP 500 instead of a user-friendly 400.
🛡️ Proposed fix for consistent error handling
@inference_app_router.post('/v1/query/embeddings')
@inject
async def image_embedding(
    payload: ImagePayload,
    response_formatter: ResponseFormatter = Depends(
        Provide[CommonContainer.response_formatter]
    ),
    image_embedding_service: ImageEmbedding = Depends(
        Provide[InferenceAppContainer.image_embedding]
    ),
):
-    # 1. Decode Base64 string
-    image_data = extract_decoded_image_data(payload.image_data)
+    try:
+        image_data = extract_decoded_image_data(payload.image_data)
+    except binascii.Error:
+        return JSONResponse(
+            status_code=status.HTTP_400_BAD_REQUEST,
+            content=response_formatter.buildErrorResponse(
+                'Invalid base64 image data'
+            ),
+        )
     embeddings = image_embedding_service.query_embed(image_data)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@wavefront/server/apps/inference_app/inference_app/controllers/inference_controller.py`
around lines 36 - 38, The single-image endpoint calls
extract_decoded_image_data(payload.image_data) and then
image_embedding_service.query_embed(image_data) without handling malformed
base64; wrap the call to extract_decoded_image_data in a try/except that catches
binascii.Error (same as the batch endpoint) and convert it into a user-facing
400 response (e.g., raise HTTPException(status_code=400, detail="Malformed
base64 image data") or return the same error object used by the batch path) so
malformed inputs return 400 instead of propagating to the global 500 handler.
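Once 400 handling along those lines is in place, a regression test could pin the behaviour. This is a hypothetical sketch: the app import path, payload field name, and the adoption of the fix are all assumptions.

# Hypothetical test; app import path and expected behaviour are assumptions.
from fastapi.testclient import TestClient

from inference_app.server import app  # assumed module path

def test_single_endpoint_rejects_malformed_base64():
    client = TestClient(app)
    resp = client.post('/v1/query/embeddings', json={'image_data': 'not-valid-base64!!'})
    assert resp.status_code == 400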
…n only (#262)
* Simply have image embedding generation only
* Batch api for image embeddings
* fix for simplified controller
Summary by CodeRabbit
New Features
Removed
Chores