[FEAT] minio presigned url for pdf #124

Merged: JonnyTran merged 6 commits into develop from feat/minio-presigned-url-for-pdf on Aug 13, 2025
@JonnyTran
Member

This pull request introduces several improvements and fixes across the backend and frontend, with a focus on enhancing document and file handling, S3/MinIO integration, and user experience. The most significant changes include the addition of presigned URL generation for document access, improved S3/MinIO bucket management during user creation, and better compatibility for S3 metadata. There are also updates to API schemas, Docker/MinIO configuration, and error handling in the frontend.

Document and File Handling Improvements

  • Added presigned URL generation for document access in the get_document endpoint, ensuring returned document URLs are valid and time-limited for secure access.
  • Updated file handling endpoints to support both MinIO and local file storage, with appropriate type annotations and conditional authorization logic.
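
A minimal sketch of how a handler could attach a time-limited URL to a document before returning it. The `Document` shape, the object key layout, the one-hour lifetime, and the `FakeStore` stand-in are all assumptions for illustration and are not taken from this PR; the only real API referenced is the MinIO Python client's `presigned_get_object(bucket_name, object_name, expires=...)`, whose signature the stand-in mirrors.

```python
from dataclasses import dataclass, replace
from datetime import timedelta
from typing import Optional, Protocol


class ObjectStore(Protocol):
    """The one MinIO client method this sketch relies on."""

    def presigned_get_object(
        self, bucket_name: str, object_name: str, expires: timedelta
    ) -> str: ...


@dataclass(frozen=True)
class Document:
    # Hypothetical shape; the real schema is not shown on this page.
    id: str
    workspace: str  # assumed to double as the bucket name
    file_name: str
    url: Optional[str] = None


def with_presigned_url(
    doc: Document, store: ObjectStore, ttl: timedelta = timedelta(hours=1)
) -> Document:
    """Return a copy of `doc` whose `url` is a presigned, expiring GET link."""
    object_name = f"{doc.id}/{doc.file_name}"  # assumed key layout
    url = store.presigned_get_object(doc.workspace, object_name, expires=ttl)
    return replace(doc, url=url)


class FakeStore:
    """Stand-in so the sketch runs without a MinIO server."""

    def presigned_get_object(self, bucket_name, object_name, expires):
        seconds = int(expires.total_seconds())
        return f"http://localhost:9001/{bucket_name}/{object_name}?X-Amz-Expires={seconds}"


doc = with_presigned_url(Document("d1", "ws", "paper.pdf"), FakeStore())
print(doc.url)
```

With a real `minio.Minio` instance passed as `store`, the returned URL would carry a signature and stop working after `ttl` elapses.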

S3/MinIO Integration and Compatibility

  • Enabled automatic MinIO/S3 bucket creation for each workspace when creating a user via the CLI, improving onboarding and reducing manual setup.
  • Added a utility to sanitize metadata for S3 compatibility, ensuring all string values are ASCII-safe, and integrated it into the PDF metadata model.
  • Updated Docker Compose and environment configuration to publish MinIO on host port 9001, mapped to container port 9000 (`9001:9000`), and enabled the proper network for the MinIO service.
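
One way such a metadata sanitizer could work, shown only as a sketch: the function names and the choice to drop characters that have no ASCII decomposition are assumptions, not the PR's implementation. The motivation is that S3-style user metadata travels in HTTP headers, which are expected to be ASCII.

```python
import unicodedata
from typing import Any


def to_ascii(value: str) -> str:
    """Strip accents via NFKD decomposition, then drop any non-ASCII remainder."""
    decomposed = unicodedata.normalize("NFKD", value)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def sanitize_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `metadata` with every string value made ASCII-safe.

    Non-string values (numbers, None, ...) are passed through unchanged.
    """
    return {
        key: to_ascii(value) if isinstance(value, str) else value
        for key, value in metadata.items()
    }


print(sanitize_metadata({"title": "Étude sur le café", "pages": 12}))
# → {'title': 'Etude sur le cafe', 'pages': 12}
```

Note that this approach is lossy: text in scripts without ASCII equivalents is removed entirely, so a real implementation might prefer percent-encoding or transliteration instead.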

API and Schema Updates

  • Enhanced the GET /documents endpoint to support multiple identifiers and return multiple documents in order, with improved error messages.
  • Updated the DocumentUpdate and DocumentListItem schemas to support a URL field and ensure returned URLs are always present and valid.
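
The "multiple identifiers, returned in order" behavior can be sketched as follows. The function name, the dict-based document shape, and the use of `LookupError` are illustrative assumptions; the actual handler and its error type are not shown on this page.

```python
from typing import Any


def order_documents(
    requested_ids: list[str], found: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Return `found` documents in the same order as `requested_ids`.

    Raises LookupError naming every requested identifier that was not found,
    so the caller gets one complete error message instead of the first miss.
    """
    by_id = {doc["id"]: doc for doc in found}
    missing = [doc_id for doc_id in requested_ids if doc_id not in by_id]
    if missing:
        raise LookupError(f"Documents not found: {', '.join(missing)}")
    return [by_id[doc_id] for doc_id in requested_ids]


# A database query may return rows in any order; the response follows the request.
rows = [{"id": "a", "name": "first.pdf"}, {"id": "b", "name": "second.pdf"}]
print([doc["id"] for doc in order_documents(["b", "a"], rows)])
# → ['b', 'a']
```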

Frontend and Developer Experience

  • Improved error handling and logging in the dataset deletion view model, making debugging easier.
  • Added a null check to the avatar getter in the User entity to prevent runtime errors.
  • Added debug logging to the document retrieval use case for better traceability.

Changelog and Documentation

  • Updated the changelog to document all major changes, including new features and dependency updates.

These changes collectively improve the robustness, security, and usability of the system, particularly around document access and storage.

- Added console logging to `GetDocumentByIdUseCase` for improved debugging of document retrieval.
- Introduced a `limit` parameter in the `add_document` and `find_existing_documents` functions to control the number of results returned.
- Updated the `delete_documents` function to include a type-ignore annotation for compatibility.
- Refactored the `list_documents` function to streamline document validation and retrieval.
…ue with dockerd

- Updated `.env.dev` to reflect the new S3 endpoint at port 9001, ensuring compatibility with the local MinIO setup.
- Added error logging in `useDeleteDatasetViewModel` for better debugging during dataset deletion failures.
- Updated the `User` entity to safely handle an optional `userName` when generating the avatar.
- Introduced presigned URL generation for document access in the `get_document` function, ensuring valid file URLs.
- Refactored document retrieval logic to support presigned URLs and improved metadata handling in document schemas.
- Enhanced file handling in the context to ensure compatibility with S3 storage requirements.
- Replaced the existing MinIO client dependency with a singleton across various document and file handling endpoints.
- Updated the `CHANGELOG.md` to reflect the changes in endpoint behavior and client handling.
- Enhanced document retrieval logic to support multiple identifiers in a single request, improving API usability.
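
For readers unfamiliar with the singleton client pattern mentioned in the commits above, here is a minimal sketch. `StorageClient`, `get_storage_client`, and the `S3_ENDPOINT` variable name are hypothetical stand-ins; the PR's actual client factory is not shown on this page.

```python
import os
from functools import lru_cache


class StorageClient:
    """Stand-in for a real object-storage client such as `minio.Minio`."""

    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint


@lru_cache(maxsize=1)
def get_storage_client() -> StorageClient:
    """Build the client on first call and reuse that instance afterwards."""
    # Hypothetical environment variable; the default matches the port in `.env.dev`.
    endpoint = os.environ.get("S3_ENDPOINT", "localhost:9001")
    return StorageClient(endpoint)


# Every caller receives the same object, so connection setup happens once.
print(get_storage_client() is get_storage_client())
# → True
```

Caching the factory rather than using a module-level global keeps construction lazy, and `get_storage_client.cache_clear()` lets tests swap in a fresh instance.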
@JonnyTran requested review from a team as code owners on August 13, 2025 01:43
@codecov-commenter
codecov-commenter commented Aug 13, 2025

Codecov Report

❌ Patch coverage is 54.62963% with 49 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ralit-server/src/extralit_server/contexts/files.py | 39.13% | 28 Missing ⚠️ |
| ...it_server/api/schemas/v1/document/preprocessing.py | 28.57% | 10 Missing ⚠️ |
| ...lit-server/src/extralit_server/contexts/imports.py | 71.42% | 4 Missing ⚠️ |
| ...r/src/extralit_server/cli/database/users/create.py | 75.00% | 3 Missing ⚠️ |
| ...tralit_server/cli/database/users/create_default.py | 75.00% | 3 Missing ⚠️ |
| ...erver/src/extralit_server/api/handlers/v1/files.py | 75.00% | 1 Missing ⚠️ |

| Flag | Coverage Δ | |
|---|---|---|
| extralit | 66.38% <ø> (ø) | |
| extralit-server | 81.00% <54.62%> (-0.10%) | ⬇️ |
| frontend | 10.59% <ø> (ø) | |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ...r/src/extralit_server/api/handlers/v1/documents.py | 53.92% <100.00%> (+5.77%) | ⬆️ |
| ...er/src/extralit_server/api/schemas/v1/documents.py | 100.00% <100.00%> (ø) | |
| extralit-server/src/extralit_server/database.py | 70.00% <ø> (ø) | |
| ...t-server/src/extralit_server/jobs/document_jobs.py | 30.76% <ø> (ø) | |
| ...erver/src/extralit_server/api/handlers/v1/files.py | 65.51% <75.00%> (+0.70%) | ⬆️ |
| ...r/src/extralit_server/cli/database/users/create.py | 85.71% <75.00%> (-3.18%) | ⬇️ |
| ...tralit_server/cli/database/users/create_default.py | 91.17% <75.00%> (-8.83%) | ⬇️ |
| ...lit-server/src/extralit_server/contexts/imports.py | 60.82% <71.42%> (+0.29%) | ⬆️ |
| ...it_server/api/schemas/v1/document/preprocessing.py | 54.54% <28.57%> (-45.46%) | ⬇️ |
| ...ralit-server/src/extralit_server/contexts/files.py | 24.11% <39.13%> (+1.58%) | ⬆️ |

... and 3 files with indirect coverage changes

| Components | Coverage Δ | |
|---|---|---|
| extralit | 66.38% <ø> (ø) | |
| extralit-server | 81.00% <54.62%> (-0.10%) | ⬇️ |
| extralit-frontend | 10.59% <ø> (ø) | |

@JonnyTran merged commit 005fb91 into develop on Aug 13, 2025
1 of 8 checks passed