
609 implement native blob storage for azure gcp and aws #674

Merged
ryanweiler92 merged 29 commits into dev from 609-implement-native-blob-storage-for-azure-gcp-and-aws
May 12, 2025

Conversation

@shubhammahure (Contributor) commented Apr 10, 2025

Google Cloud, Azure, and AWS Native Storages - Implementation and Testing

Description

This implementation provides methods for managing files and directories in Google Cloud Storage (GCS), Azure Blob Storage, and AWS S3 with additional functionalities such as retry operations, rollback capabilities, and synchronization between local storage and cloud storage.

Methods:

  • list()
    Lists all objects in the configured storage bucket (GCS, Azure, or AWS S3).
  • listDetails()
    Lists all objects along with their metadata (size, creation date, etc.) from the configured storage.
  • syncLocalToStorage
    Synchronizes the contents of a local directory to a specified path in cloud storage, including:
    - Removing empty directories locally.
    - Deleting extra objects from cloud storage that are not present locally.
    - Uploading files and syncing.
  • syncStorageToLocal
    Synchronizes the contents of a cloud storage path to a specified local directory, including:
    - Deleting empty objects from cloud storage.
    - Syncing files to local storage.
    - Deleting local files that are not present in cloud storage.
    - Removing empty directories locally.
  • copyToStorage
    Copies a specific local file to the configured cloud storage:
    - Removes empty local directories before upload.
    - Uploads the file to the cloud storage.
    - Removes empty objects from the cloud storage after upload.
  • copyToLocal
    Copies a specific file from the cloud storage to the local file system:
    - Deletes empty objects (zero-byte files) from the cloud storage.
    - Downloads the file to local storage.
    - Deletes empty local directories after download.
  • deleteFromStorage(String storagePath)
    Deletes a file or object from the configured cloud storage (GCS, Azure, or AWS S3).
  • deleteFromStorage(String storagePath, boolean leaveFolderStructure)
    Deletes objects from the specified path but retains the folder structure if leaveFolderStructure is true.
  • deleteFolderFromStorage(String storageFolderPath)
    Recursively deletes all contents from the specified folder in the configured cloud storage (GCS, Azure, or AWS S3).
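The storage paths passed to the methods above are normalized before use (later refactored in this PR to use normalizePath from the Utility class). A minimal, hypothetical sketch of that normalization, modeled on the regex chain visible in the diff snippets below, could look like this; the class and method names are illustrative, not the engine's actual API:

```java
// Hypothetical sketch of storage-path normalization, modeled on the regex
// chain shown in the PR diff; not the actual Utility.normalizePath implementation.
public class PathNormalizer {

    public static String normalize(String path) {
        return path
                .replace("\\", "/")      // unify Windows separators to forward slashes
                .replaceAll("/+", "/")   // collapse repeated slashes
                .replaceFirst("^/", "")  // drop a leading slash
                .replaceAll("/+$", "");  // drop trailing slashes
    }

    public static void main(String[] args) {
        // e.g. a Windows-style path with redundant separators
        System.out.println(normalize("\\myFolder//sub/")); // prints "myFolder/sub"
    }
}
```

Normalizing once up front helps avoid the empty or malformed prefixes flagged in the review suggestions further down.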

Additional Functionalities:

  • Retry Operations: Automatically retries failed upload, download, or delete operations up to 3 times.
  • Rollback Mechanism:
    - rollbackUpload: Reverts uploaded files if the operation fails.
    - rollbackDownload: Reverts downloaded files if the operation fails.
    - retryDelete: Retries deletion if it fails, up to 3 attempts.

Tracking:

  • Successfully processed files are added to the uploadedFiles, downloadedFiles, and deletedFiles lists.
  • Failed operations are tracked in failedFiles.
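The retry behavior described above (up to 3 attempts before an operation is recorded as failed) might be sketched roughly as follows; the interface and method names here are illustrative assumptions, not the engine's actual retryOperation signature:

```java
// Illustrative sketch of a retry wrapper that attempts an operation up to 3
// times before giving up; names are assumptions, not the engine's actual API.
public class RetryUtil {

    @FunctionalInterface
    public interface Operation {
        void run() throws Exception;
    }

    public static void retryOperation(Operation op, String description) throws Exception {
        final int maxAttempts = 3;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                op.run();
                return; // success, no further attempts
            } catch (Exception e) {
                last = e;
                System.err.println(description + " failed on attempt " + attempt + ": " + e.getMessage());
            }
        }
        throw last; // all attempts exhausted; the caller records the failure
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        retryOperation(() -> {
            attempts[0]++;
            if (attempts[0] < 3) throw new RuntimeException("transient failure");
        }, "demo upload");
        System.out.println("Succeeded after " + attempts[0] + " attempts"); // prints "Succeeded after 3 attempts"
    }
}
```

On success the caller would add the file to uploadedFiles, downloadedFiles, or deletedFiles; on a final failure it would add it to failedFiles and, for uploads or downloads, trigger the corresponding rollback.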


How to Test (Using App Terminal)

Step 1: Initialize the Cloud Storage Engine

1) Google Cloud Storage
Use the following pixel code format to create a storage catalog and connect with Google Cloud Storage:

CreateStorageEngine( storage=["Any_Name"], storageDetails=[{"STORAGE_TYPE":"GOOGLE_CLOUD_NATIVE_STORAGE","NAME":"Any_Name","GCS_SERVICE_ACCOUNT_FILE":"Your_Account_Key","GCS_BUCKET":"Your_Bucket_Name","GCS_PROJECT_ID":"Your_Project_Id"}] )

2) Azure Blob Storage
Use the following pixel code format to create a storage catalog and connect with Azure Blob Storage:
CreateStorageEngine( storage=["Any_Name"], storageDetails=[{"STORAGE_TYPE":"MICROSOFT_AZURE_NATIVE_BLOB_STORAGE","NAME":"Any_Name","AZ_CONN_STRING":"Your_Account_Connection_String"}] )

3) AWS S3 Storage
Use the following pixel code format to create a storage catalog and connect with AWS S3 Storage:
CreateStorageEngine( storage=["Any_Name"], storageDetails=[{"STORAGE_TYPE":"AMAZON_S3_NATIVE","NAME":"Any_Name","S3_REGION":"Your_Account_Region","S3_BUCKET":"Your_Bucket_Name","S3_ACCESS":"Your_Account_AccessKey","S3_SECRET":"Your_Account_SecretKey"}] )

Step 2: Navigate to Storage Catalog

After successfully creating the storage engine, open the Storage Catalog and go to the Usage Tab to interact with the cloud storage.

Step 3: Test Each Method

Use pixel code in the Usage Tab to test each method, providing the required paths as specified in the method signatures.

@shubhammahure shubhammahure linked an issue Apr 10, 2025 that may be closed by this pull request
@github-actions

@CodiumAI-Agent /describe

@github-actions

@CodiumAI-Agent /review

@QodoAI-Agent

Title

609 implement native blob storage for azure gcp and aws


User description

Description

implement native blob storage for azure gcp and aws


PR Type

Enhancement


Description

  • Add AWS native blob storage engine.

  • Add Azure native blob storage engine.

  • Add Google Cloud native blob storage engine.

  • Update StorageTypeEnum with new storage type constants.

  • Add utility methods for path parsing in AbstractStorageEngine.

  • Update pom.xml with new dependencies.


Changes walkthrough 📝

Relevant files (Enhancement):

StorageTypeEnum.java: Update StorageTypeEnum with native storage types.
src/prerna/engine/api/StorageTypeEnum.java
  • Imported new storage engine classes.
  • Added enum constants for AMAZON_S3_NATIVE, GOOGLE_CLOUD_NATIVE_STORAGE, and MICROSOFT_AZURE_NATIVE_BLOB_STORAGE.
  • +6/-0

AWSNativeBlogStorageEngine.java: Add AWS native blob storage engine.
src/prerna/engine/impl/storage/AWSNativeBlogStorageEngine.java
  • New file implementing AWS native blob storage.
  • Provides methods for open, list, sync, copy, and delete with retry logic.
  • Implements rollback for failed operations.
  • +776/-0

AbstractStorageEngine.java: Add path parsing utilities.
src/prerna/engine/impl/storage/AbstractStorageEngine.java
  • Added the parseLocalPaths method.
  • Added the parseStorageObjectPaths method.
  • Updated utility functions for path handling.
  • +35/-0

AzureNativeBlobStorageEngine.java: Add Azure native blob storage engine.
src/prerna/engine/impl/storage/AzureNativeBlobStorageEngine.java
  • New file implementing Azure native blob storage.
  • Provides sync, copy, list, and delete operations.
  • Integrates with the Azure Blob Storage API.
  • +742/-0

GoogleCloudNativeBlobStorageEngine.java: Add Google Cloud native blob storage engine.
src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java
  • New file implementing Google Cloud native blob storage.
  • Implements file synchronization, copy, list, and delete functions.
  • Uses the Google Cloud Storage API and handles metadata.
  • +723/-0

pom.xml: Update pom.xml dependencies.
pom.xml
  • Added a dependency for google-cloud-storage.
  • Added a dependency for azure-storage-blob.
  • Added a dependency for the AWS S3 SDK.
  • +15/-0

    Need help?
  • Type /help how to ... in the comments thread for any questions about PR-Agent usage.
  • Check out the documentation for more information.
@github-actions

    @CodiumAI-Agent /improve

@QodoAI-Agent

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Naming Inconsistency

    The new enum entry uses AWSNativeBlogStorageEngine which appears to be a typo. Consider renaming it to AWSNativeBlobStorageEngine to match the intended functionality.

    AMAZON_S3_NATIVE("AMAZON_S3_NATIVE", AWSNativeBlogStorageEngine.class.getName()),
    Comment Mismatch

    A comment in the syncLocalToStorage method refers to deleting extra blobs from azure storage. This seems to be a leftover from copy‐paste and should be updated for Google Cloud storage.

    syncStorageDeletion(storage, storagePath, localBasePath);

@QodoAI-Agent commented Apr 10, 2025

    PR Code Suggestions ✨

    Latest suggestions up to 697b13f

    Possible issue
    Add pagination to list

    The AWS S3 list operation currently only retrieves the first page of results. Use a
    loop with continuation tokens to page through all objects when the response is
    truncated.

    src/prerna/engine/impl/storage/AWSNativeBlogStorageEngine.java [98-112]

     @Override
     public List<String> list(String path) throws Exception {
    -    List<String> fileList = new ArrayList<String>();
    -    path = path.replace("\\", "/").replaceAll("/+", "/").replaceFirst("^/", "").replaceAll("/+$", "");
    -    try {
    -        ListObjectsV2Response listObjectsV2Response = s3ListObjectResponse(path);
    -        for (S3Object object : listObjectsV2Response.contents()) {
    -            fileList.add(object.key());
    +    List<String> fileList = new ArrayList<>();
    +    String prefix = path.replace("\\", "/").replaceAll("/+", "/").replaceFirst("^/", "").replaceAll("/+$", "");
    +    String token = null;
    +    do {
    +        ListObjectsV2Request.Builder req = ListObjectsV2Request.builder().bucket(bucket).prefix(prefix);
    +        if (token != null) req.continuationToken(token);
    +        ListObjectsV2Response res = client.listObjectsV2(req.build());
    +        for (S3Object obj : res.contents()) {
    +            fileList.add(obj.key());
             }
    -    } catch (S3Exception e) {
    -        classLogger.error(Constants.STACKTRACE, e);
    -    }
    +        token = res.nextContinuationToken();
    +    } while (token != null);
         return fileList;
     }
    Suggestion importance[1-10]: 9


    Why: The current list method only retrieves the first page of S3 objects; iterating with continuationToken is essential to handle buckets with more than 1000 keys and avoid incomplete listings.

    High
    Roll back on partial upload failure

    If some uploads fail, the method proceeds without rolling back partial successes.
    After the walk completes, detect any failures, rollback all uploaded files, and fail
    fast.

    src/prerna/engine/impl/storage/AWSNativeBlogStorageEngine.java [164-172]

     Files.walk(localFilePath).filter(Files::isRegularFile).forEach(file -> {
         try {
             uploadedFiles.add(uploadFileToS3(storagePath, file, localBasePath, metadata));
         } catch (Exception e) {
             failedFiles.add(file.toString());
             classLogger.error("Failed to upload file:" + file, e);
         }
     });
     found = true;
    +if (!failedFiles.isEmpty()) {
    +    rollbackUploads(client, uploadedFiles.stream().filter(Objects::nonNull).collect(Collectors.toList()));
    +    throw new RuntimeException("Sync failed for files: " + failedFiles);
    +}
    Suggestion importance[1-10]: 7


    Why: Introducing a check on failedFiles and calling rollbackUploads ensures the sync is atomic and avoids leaving partially uploaded files, improving error handling without overhauling the method.

    Medium
    Throw on file read failure

    Throw an exception instead of returning null when reading file properties fails, so
    that the error propagates and triggers a rollback instead of introducing null
    entries in the upload list.

    src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java [517-520]

     } catch (IOException e) {
         classLogger.error("Failed to read file properties: " + file, e);
    -    return null;
    +    throw new RuntimeException("Failed to read file properties: " + file, e);
     }
    Suggestion importance[1-10]: 7


    Why: Returning null on an I/O error in uploadingFileToGCS can introduce null entries into the upload list and mask failures; throwing an exception ensures proper rollback.

    Medium
    Normalize storage prefix before deletion

    Normalize storagePath before passing it to syncStorageDeletion to prevent accidental
    deletion of unintended blobs when the prefix is empty or malformed.

    src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java [145]

    -syncStorageDeletion(storage, storagePath, localBasePath);
    +String normalizedStoragePath = storagePath.replace("\\", "/")
    +        .replaceAll("/+", "/")
    +        .replaceFirst("^/", "")
    +        .replaceAll("/+$", "");
    +syncStorageDeletion(storage, normalizedStoragePath, localBasePath);
    Suggestion importance[1-10]: 6


    Why: Normalizing storagePath before calling syncStorageDeletion prevents unintended blob deletions when the prefix is empty or malformed, improving safety in syncLocalToStorage.

    Low
    General
    Fix typo in class name

    There's a typo in the class name "AWSNativeBlogStorageEngine" which should be
    "AWSNativeBlobStorageEngine" to match its purpose and avoid confusion. Rename the
    class and update all references accordingly.

    src/prerna/engine/api/StorageTypeEnum.java [3-18]

    -import prerna.engine.impl.storage.AWSNativeBlogStorageEngine;
    +import prerna.engine.impl.storage.AWSNativeBlobStorageEngine;
     ...
    -AMAZON_S3_NATIVE("AMAZON_S3_NATIVE", AWSNativeBlogStorageEngine.class.getName()),
    +AMAZON_S3_NATIVE("AMAZON_S3_NATIVE", AWSNativeBlobStorageEngine.class.getName()),
    Suggestion importance[1-10]: 8


    Why: The enum and import reference AWSNativeBlogStorageEngine but the implementation is about a S3 blob, so renaming to AWSNativeBlobStorageEngine fixes a confusing typo across the API and implementation.

    Medium
    Restrict zero-byte deletions to folders

    Only delete zero-byte blobs that represent folder placeholders (names ending with
    “/”) to avoid removing legitimate empty files.

    src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java [705-707]

    -if (blob.getSize() == 0) { // Check if the blob is empty (zero-byte file)
    +if (blob.getSize() == 0 && blob.getName().endsWith("/")) {
         storage.delete(blob.getBlobId());
         classLogger.info("Deleted empty blob folder: " + blob.getName());
     }
    Suggestion importance[1-10]: 6


    Why: Deleting all zero-byte blobs may remove legitimate empty files; adding a check for names ending with / ensures only folder placeholders are removed.

    Low
    Use safe update time getter

    Use the safer blob.getUpdateTime() method and guard against null to compare
    modification times without risking a NullPointerException.

    src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java [195]

    -long cloudModifiedTime = blob.getUpdateTimeOffsetDateTime().toInstant().toEpochMilli();
    +Long cloudModifiedTime = blob.getUpdateTime();
    +if (cloudModifiedTime == null) {
    +    cloudModifiedTime = 0L;
    +}
    Suggestion importance[1-10]: 4


    Why: Switching from getUpdateTimeOffsetDateTime() to getUpdateTime() with a null guard reduces the chance of NullPointerException, though it's a minor safety improvement.

    Low

    Previous suggestions

    Suggestions up to commit 697b13f
    Possible issue
    Trigger retry on deletion failure

    Ensure the lambda throws an exception if deletion fails to trigger retry logic.

    src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java [554-563]

     try {
         retryOperation(() -> {
             boolean deleted = blob.delete();
    -        if (deleted) {
    -            classLogger.info("Deleted file: " + blobName);
    -            deletedFiles.add(blobName);
    +        if (!deleted) {
    +            throw new RuntimeException("Deletion failed for: " + blob.getName());
             }
    -    }, "Deleting file: " + blobName);
    +        classLogger.info("Deleted file: " + blob.getName());
    +        deletedFiles.add(blob.getName());
    +    }, "Deleting file: " + blob.getName());
     } catch (Exception e) {
    -    failedFiles.add(blobName);
    -    classLogger.error("Failed to delete file: " + blobName, e);
    +    failedFiles.add(blob.getName());
    +    classLogger.error("Failed to delete file: " + blob.getName(), e);
     }
    Suggestion importance[1-10]: 8


    Why: This suggestion forces the lambda to throw an exception when deletion returns false, ensuring the retry mechanism is properly triggered and improving the robustness of the deletion logic.

    Medium
    Allow exceptions to trigger retries

    Remove the internal try-catch to let exceptions propagate and trigger retry
    mechanism.

    src/prerna/engine/impl/storage/GoogleCloudNativeBlobStorageEngine.java [689-696]

     retryOperation(() -> {
    -    try {
    -        storage.create(blobInfoBuilder.build(), Files.readAllBytes(file));
    -        classLogger.info("Uploaded file to GCS: " + blobName);
    -    } catch (IOException e) {
    -        classLogger.error("Failed to upload file to GCS: " + blobName, e);
    -    }
    +    storage.create(blobInfoBuilder.build(), Files.readAllBytes(file));
    +    classLogger.info("Uploaded file to GCS: " + blobName);
     }, "Uploading: " + blobName);
    Suggestion importance[1-10]: 8


    Why: By removing the internal try-catch, exceptions will propagate as intended to the retryOperation handler, thereby ensuring that failures in file upload trigger retries, which is a significant improvement for error handling.

    Medium

    @ryanweiler92 ryanweiler92 removed the request for review from themaherkhalil May 7, 2025 14:50
@ryanweiler92 (Collaborator)

    @shubhammahure Can you please add a description of your changes and a section on how to test your changes to the PR description?

@shubhammahure (Contributor, Author)

The description has been updated to include support for GCS, Azure, and AWS; it applies to all storage-related PRs.

@ppatel9703 (Contributor)

    @shubhammahure Can we move most of the reactors to utilize the normalizePath or would that not work?

I've tested the S3 portion of this and it works. Will mark as LGTM once the above comment is resolved.

@shubhammahure (Contributor, Author)

    @ppatel9703, I have completed the refactoring of the path normalization logic as per the review.

    @ryanweiler92 ryanweiler92 merged commit 08e2ac8 into dev May 12, 2025
    3 checks passed
    @ryanweiler92 ryanweiler92 deleted the 609-implement-native-blob-storage-for-azure-gcp-and-aws branch May 12, 2025 16:32
    @github-actions

    @CodiumAI-Agent /update_changelog

    @QodoAI-Agent

    Changelog updates: 🔄

2025-05-12

    Added

    • Native blob storage engines for AWS S3, Azure Blob Storage, and Google Cloud Storage
    • File sync, listing, copy and delete operations with retry and rollback support

To commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

    This was referenced May 13, 2025
    manamittal added a commit that referenced this pull request May 20, 2025
    * fix(python): handle eval when it is a single line execution but there is string input with space (#756)
    
    * Update Dockerfile.tomcat (#757)
    
    * fix: tomcat builder setting env var
    
    * fix: updating tomcat to 9.0.104
    
    * Update Dockerfile.ubuntu22.04
    
    * Update Dockerfile.ubuntu22.04
    
    * Update Dockerfile.ubuntu22.04
    
    * feat: creating KubernetesModelScaler class (#763)
    
    * Update Dockerfile.ubuntu22.04
    
    * feat: adding ability to attach a file to a vector db source (#736)
    
    * Added AttachSourceToVectorDbReactor for uploading pdf file to an existing csv file and modified VectorFileDownloadReactor
    
    * fix: proper return for the download and matching the reactor name
    
    * fix: error for downloading single file vs multiple; error for copyToDirectory instead of copyFile
    
    * chore: renaming so reactor matches VectorFileDownload
    
    ---------
    
    Co-authored-by: Maher Khalil <themaherkhalil@gmail.com>
    
    * Update Dockerfile.ubuntu22.04
    
    * Update ubuntu2204.yml
    
    * Update ubuntu2204.yml
    
    * Update ubuntu2204_cuda.yml
    
    * Update Dockerfile.nvidia.cuda.12.5.1.ubuntu22.04
    
    * Update ubuntu2204_cuda.yml
    
    * Update ubuntu2204.yml
    
    * feat: exposing tools calling through models (#764)
    
    * 587 unit test for prernadsutil (#654)
    
    * test(unit): unit tests for the prerna.util.ds package
    
    * test(unit): unit tests for the prerna.util.ds.flatfile package
    
    * test(unit): removed reflections, added paraquet tests
    
    * test(unit): unit tests for the prerna.util.ds package
    
    * test(unit): unit tests for the prerna.util.ds.flatfile package
    
    * test(unit): removed reflections, added paraquet tests
    
    * Update ubuntu2204.yml
    
    * Update ubuntu2204.yml
    
    * Update ubuntu2204.yml
    
    * fix: update pipeline docker buildx version
    
    * fix: ignore buildx
    
    * fix: adjusting pipeline for cuda
    
    * feat: switching dynamic sas to default false (#766)
    
    * fix: changes to account for version 2.0.0 of pyjarowinkler (#769)
    
    * chore: using 'Py' instead of 'py' to be consistent (#770)
    
    * feat: full ast parsing of code to return evaluation of the last expression (#771)
    
    * Python Deterministic Token Trimming for Message Truncation (#765)
    
    * feat: deterministic-token-trimming
    
    * feat: modifying logic such that system prompt is second to last message for truncation
    
    ---------
    
    Co-authored-by: Maher Khalil <themaherkhalil@gmail.com>
    
    * fix: added date added column to enginepermission table (#768)
    
    * fix: add docker-in-docker container to run on sef-hosted runner (#773)
    
    Co-authored-by: Raul Esquivel <resmas.work@gmail.com>
    
    * fix: properly passing in the parameters from kwargs/smss into model limits calculation (#774)
    
    * fix: removing legacy param from arguments (#777)
    
    * fix: Fix docker cache build issue (#778)
    
    * adding no cache
    
    * adding no cache
    
    * feat: Adding Semantic Text Splitting & Token Text Splitting (#720)
    
    * [696] - build - Add chonky semantic text splitting - Added the function for chonky semantic text splitting and integrated with existing flow.
    
    * [696] - build - Add chonky semantic text splitting - Updated the code
    
    * [696] - build - Add chonky semantic text splitting - Updated the code comments
    
    * feat: adding reactor support through java
    
    * feat: updating pyproject.toml with chonky package
    
    * feat: check for default chunking method in smss
    
    * [696] - feat - Add chonku semantic text splitting - Resolved the conflicts
    
    * [696] - feat - Add chonky semantic text splitting - Organized the code.
    
    * feat: adding chunking by tokens and setting as default
    
    * updating comments on chunking strategies
    
    ---------
    
    Co-authored-by: Weiler, Ryan <ryanweiler92@gmail.com>
    Co-authored-by: kunal0137 <kunal0137@gmail.com>
    
    * feat: allowing for tools message in full prompt (#780)
    
    * UPDATE ::: Add docker in docker Dockerfiler (#784)
    
    * add docker in docker Dockerfile
    
    * Update Dockerfile.dind
    
    Remove python and tomcat arguments from Dockerfile
    
    * fix: remove-paddle-ocr (#786)
    
    * [#595] test(unit): adds unit test for prerna.engine.impl.model.kserve
    
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * feat: Tag semoss image (#789)
    
    * adding changes for non-release docker build
    
    * adding non-release build logic to cuda-semoss builder
    
    * updating push branches
    
    * fix: branch names on docker builds
    
    * fix: branch names on docker builds cuda
    
    * fix: adding push condition - change to pyproject toml file; adding event input vars to env vars (#790)
    
    * fix: python builder toml file change (#792)
    
    * fix: Catch errors when calling pixels from Python (#787)
    
    Co-authored-by: Weiler, Ryan <ryanweiler92@gmail.com>
    
    * Creating db links between engines and default apps (#693)
    
    * create db links between engine and default app
    
    * Rename column APPID to TOOL_APP
    
    * feat: add database_tool_app to getUserEngineList
    
    ---------
    
    Co-authored-by: Weiler, Ryan <ryanweiler92@gmail.com>
    
    * Adding sort options to the myengines reactor (#479)
    
    * added sort feature to MyEnginesReactor and genericized reactor imports
    
    * formatting
    
    * overloading method
    
    * validate sortList
    
    ---------
    
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * feat: cleaning up unused imports in MyEngine reactor (#793)
    
    * feat: Create Enum projectTemplate and update CreateAppFromTemplateReactor to accept existing appID for cloning applications (#621)
    
    Co-authored-by: kunal0137 <kunal0137@gmail.com>
    
    * Update GetEngineUsageReactor.java (#417)
    
    Co-authored-by: Maher Khalil <themaherkhalil@gmail.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * Issue 596: Adds Unit Tests for prerna/engine/impl/model/responses and workers (#727)
    
    * [#596] test(unit): adds unit tests
    
    * fix: implements ai-agents suggestions
    
    ---------
    
    Co-authored-by: Jeff Vitunac <jvitunac@gmail.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * 609 implement native blob storage for azure gcp and aws (#674)
    
    * Initial commit : implementation for azure blob storage
    
    * added dependency for azure in pom.xml
    
    * update logic to fetch the metadata from list details
    
    * changed functionality from listing containers to listing files within a selected container
    
    * initial commit for google cloud storage implementation
    
    * added field contant in enum class and removed unused method
    
    * add methods to parse comma-separated local and cloud paths
    
    * add methods to parse comma-separated local and cloud paths
    
    * implementation for aws s3 bucket
    
    * normalize container prefix path
    
    * merged all: implementation for azure, aws and gcp
    
    * refactor(storage): replace manual path normalization with normalizePath from Utility class
    
    ---------
    
    Co-authored-by: pvijayaraghavareddy <pvijayaraghavareddy@WORKSPA-6QV71G7.us.deloitte.com>
    Co-authored-by: Parth <parthpatel3@deloitte.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * Get Node Pool Information for Remote Models (#806)
    
    * 590 unit test for prernaengineimpl (#808)
    
    * test(unit): update to filesystems hijacking for testing files
    
    * test: start of unit tests for abstract database engine
    
    * test(unit): added unit test for prerna.engine.impl
    
    * test(unit): finsihed tests for prerna.engine.impl
    
    * test(unit): adding back unused assignment
    
    ---------
    
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * Creating WordCountTokenizer Class (#802)
    
    * feat: creating word count tokenizer class && falling back to word count tokenizer if tiktok fails
    
    * feat: updating comment
    
    * feat: setting default chunking method as recursive (#810)
    
    * Unit tests fixes and Unit test Class file location updates (#812)
    
    * test(unit): moved tests to correct packages
    
    * test(unit): fixed a couple of unit tests
    
    * VectorDatabaseQueryReactor: output divider value for word doc chunks always 1 (#804)
    
    * Code implementation for #733
    
    * feat: Added code to resolve Divider page issue
    
    * Console output replaced by LOGGERs as per review comments
    
    * feat: replaced Console with Loggers
    
    ---------
    
    Co-authored-by: Varaham <katchabi50@gmail.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * GetCurrentUserReactor (#818)
    
    Adding GetCurrentUserReactor to return user info including if user is an admin.
    
    * Python User Class (#819)
    
    * fix: trimming properties read from smss; fix: logging commands before executing (#821)
    
    * Updating getNodePoolsInfo() to parse and return zk info and models active actual (#822)
    
    * feat: update get node pool information for zk info and models active actual
    
    * feat: get remote model configs
    
    * Add unit tests for package prerna\engine\impl\vector (#728)
    
    * Create ChromaVectorDatabaseEngineUnitTests.java
    
    * completed tests for ChromaVectorDatabaseEngine class
    
    * [#604] test(unit): Created ChromaVectorDatabaseEngine unit tests
    
    * [604] tests(unit) : Completed test cases for ChromaVectorDatabaseEngine; update File operations to nio operations in ChromaVectorDatabaseEngine.java
    
    * [#604] tests(unit): added unit tests for all vector database engines and util classes in the prerna\engine\impl\vector package
    
    * [604] test(unit): replaced creating file paths with string literals with java.nio Paths.resolve/Paths.get methods
    
    ---------
    
    Co-authored-by: Maher Khalil <themaherkhalil@gmail.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * feat: adding to the return of getenginemetadata (#813)
    
    * feat: adding to the return of getenginemetadata
    
    * fix: removing throws
    
    ---------
    
    Co-authored-by: Arash Afghahi <48933336+AAfghahi@users.noreply.github.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    
    * 718 create a single reactor to search both engines and apps (#794)
    
    * feat(engineProject): Initial commit
    
    * chore: 718 create a single reactor to search both engines and apps
    
    * chore: 718 create a single reactor to search both engines and apps
    
    ---------
    
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    Co-authored-by: Vijayaraghavareddy <pvijayaraghavareddy@deloitte.com>
    
    * feat: update openai wrapper to handle multiple images (#832)
    
    * feat: adding user room map (#840)
    
    * feat: hiding side menu bar for non admins (#833)
    
    * Side menu changes
    
    * Review Comments fixed
    
    * Flag is renamed in  Constants.java
    
    * Review Comment fixed in Utility.java
    
    * fix: cleaning up defaults and comments
    
    ---------
    
    Co-authored-by: kunal0137 <kunal0137@gmail.com>
    
    ---------
    
    Co-authored-by: Maher Khalil <themaherkhalil@gmail.com>
    Co-authored-by: kunal0137 <kunal0137@gmail.com>
    Co-authored-by: Ryan Weiler <ryanweiler92@gmail.com>
    Co-authored-by: ManjariYadav2310 <manjayadav@deloitte.com>
    Co-authored-by: dpartika <dpartika@deloitte.com>
    Co-authored-by: Raul Esquivel <resmas.work@gmail.com>
    Co-authored-by: Pasupathi Muniyappan <pasupathi.muniyappan@kanini.com>
    Co-authored-by: resmas-tx <131498457+resmas-tx@users.noreply.github.com>
    Co-authored-by: AndrewRodddd <62724891+AndrewRodddd@users.noreply.github.com>
    Co-authored-by: radkalyan <107957324+radkalyan@users.noreply.github.com>
    Co-authored-by: samarthKharote <samarth.kharote@kanini.com>
    Co-authored-by: Shubham Mahure <shubham.mahure@kanini.com>
    Co-authored-by: rithvik-doshi <81876806+rithvik-doshi@users.noreply.github.com>
    Co-authored-by: Mogillapalli Manoj kumar <86736340+Khumar23@users.noreply.github.com>
    Co-authored-by: Jeff Vitunac <jvitunac@gmail.com>
    Co-authored-by: pvijayaraghavareddy <pvijayaraghavareddy@WORKSPA-6QV71G7.us.deloitte.com>
    Co-authored-by: Parth <parthpatel3@deloitte.com>
    Co-authored-by: KT Space <119169984+Varaham@users.noreply.github.com>
    Co-authored-by: Varaham <katchabi50@gmail.com>
    Co-authored-by: ericgonzal8 <ericgonzalez8@deloitte.com>
    Co-authored-by: Arash Afghahi <48933336+AAfghahi@users.noreply.github.com>
    Co-authored-by: Vijayaraghavareddy <pvijayaraghavareddy@deloitte.com>
    Co-authored-by: ammb-123 <ammb@deloitte.com>


    Development

    Successfully merging this pull request may close these issues.

    Implement Native Blob Storage for Azure, GCP, and AWS

    4 participants