[core] feat: add multi-cloud storage support for benchmark results by slin1237 · Pull Request #127 · ome-projects/ome

slin1237 · 2025-07-03T20:56:00Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds multi-cloud storage support to BenchmarkJob, enabling users to store benchmark results in their preferred cloud storage provider.
Previously, only OCI Object Storage was supported.

The implementation extends the storage package to support:

- AWS S3 (s3://bucket/prefix)
- Azure Blob Storage (az://account/container/blob)
- Google Cloud Storage (gs://bucket/object)
- GitHub Releases (github://owner/repo@tag)
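
As a rough illustration of how the URI schemes listed above can be told apart, here is a minimal sketch. `storageTypeOf` and the provider names it returns are hypothetical and are not the PR's `GetStorageType()`. OCI is left out because its URI format is not shown in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// storageTypeOf is a hypothetical helper (not the PR's GetStorageType) that
// maps a storage URI to a provider name based on its scheme.
func storageTypeOf(uri string) (string, error) {
	// Schemes taken from the list in the PR description; the provider
	// names on the right are illustrative only.
	providers := map[string]string{
		"s3":     "s3",
		"az":     "azure",
		"gs":     "gcs",
		"github": "github",
	}
	scheme, rest, found := strings.Cut(uri, "://")
	if !found || rest == "" {
		return "", fmt.Errorf("invalid storage URI %q: expected scheme://location", uri)
	}
	provider, ok := providers[scheme]
	if !ok {
		return "", fmt.Errorf("unsupported storage scheme %q", scheme)
	}
	return provider, nil
}

func main() {
	uris := []string{
		"s3://bucket/prefix",
		"az://account/container/blob",
		"gs://bucket/object",
		"github://owner/repo@tag",
	}
	for _, uri := range uris {
		provider, err := storageTypeOf(uri)
		fmt.Println(uri, provider, err)
	}
}
```

Rejecting unknown schemes up front keeps a typo in the URI from surfacing later as a confusing provider error.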

This aligns with genai-bench's new multi-cloud capabilities and allows users to integrate benchmark results storage with their existing cloud infrastructure.

Changes:

Storage Package (pkg/utils/storage/):
- Added parse functions for S3, Azure, GCS, and GitHub storage URIs
- Added validation functions for each storage type
- Updated GetStorageType() and ValidateStorageURI() to handle new types
Benchmark Utils (pkg/controller/v1beta1/benchmark/utils/):
- Extended BuildStorageArgs() to generate correct CLI arguments for genai-bench
- Added support for provider-specific authentication parameters
Documentation (site/content/en/docs/concepts/benchmark.md):
- Added examples for each storage provider
- Documented URI formats and authentication options
- Added best practices for multi-cloud storage
- Added link to official genai-bench documentation
Tests:
- Added comprehensive test coverage for all new storage types
- Updated existing tests to reflect new functionality
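
For readers unfamiliar with the parse-and-validate pattern mentioned in the storage package changes, the following sketch shows one possible shape for the GitHub case. `GitHubComponents` and `parseGitHubURI` are hypothetical names, and requiring a non-empty tag is an assumption based only on the `github://owner/repo@tag` format above. The PR's actual functions may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// GitHubComponents is a hypothetical result type for this sketch.
type GitHubComponents struct {
	Owner string
	Repo  string
	Tag   string
}

// parseGitHubURI splits github://owner/repo@tag into its parts and
// validates that none of them is empty. Assumption: the tag is mandatory.
func parseGitHubURI(uri string) (*GitHubComponents, error) {
	const prefix = "github://"
	if !strings.HasPrefix(uri, prefix) {
		return nil, fmt.Errorf("not a GitHub storage URI: %q", uri)
	}
	rest := strings.TrimPrefix(uri, prefix)

	repoPath, tag, hasTag := strings.Cut(rest, "@")
	if !hasTag || tag == "" {
		return nil, fmt.Errorf("missing release tag in %q", uri)
	}

	owner, repo, ok := strings.Cut(repoPath, "/")
	if !ok || owner == "" || repo == "" || strings.Contains(repo, "/") {
		return nil, fmt.Errorf("expected owner/repo in %q", uri)
	}
	return &GitHubComponents{Owner: owner, Repo: repo, Tag: tag}, nil
}

func main() {
	c, err := parseGitHubURI("github://owner/repo@tag")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("owner=%s repo=%s tag=%s\n", c.Owner, c.Repo, c.Tag)
}
```

Keeping parsing and validation in one function means callers never see a partially populated struct.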

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

- The implementation follows the URI patterns and CLI arguments documented in genai-bench's multi-cloud documentation.
- All existing OCI storage functionality remains unchanged for backward compatibility.
- The PR includes extensive test coverage with all tests passing.
- Documentation has been updated with clear examples for each storage provider.

Does this PR introduce a user-facing change?

Add multi-cloud storage support for BenchmarkJob results. Users can now store benchmark results in AWS S3, Azure Blob Storage, Google Cloud Storage, and GitHub Releases in addition to OCI Object Storage.

- Add support for AWS S3, Azure Blob, GCS, and GitHub storage providers
- Extend storage package with parse and validation functions for each provider
- Update BuildStorageArgs to generate appropriate CLI arguments for genai-bench
- Add comprehensive test coverage for all storage types
- Update benchmark documentation with multi-cloud examples and best practices

gemini-code-assist

Summary of Changes

Hello @slin1237, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the BenchmarkJob functionality by introducing robust multi-cloud storage support for benchmark results. It allows users to store their benchmark outputs in various popular cloud providers like AWS S3, Azure Blob Storage, Google Cloud Storage, and even GitHub Releases, in addition to the existing OCI Object Storage. The changes involve core logic for parsing and validating storage URIs, generating appropriate command-line arguments for the underlying genai-bench tool, and updating user-facing documentation with comprehensive examples and best practices.

Highlights

Multi-Cloud Storage Expansion: Expanded BenchmarkJob's result storage capabilities beyond OCI Object Storage to include AWS S3, Azure Blob Storage, Google Cloud Storage, and GitHub Releases, aligning with genai-bench's multi-cloud support.
Storage URI Parsing and Validation: Introduced new utility functions and data structures within the pkg/utils/storage package to parse and validate URIs for S3, Azure, GCS, and GitHub storage, ensuring correct format and component extraction.
Dynamic CLI Argument Generation: Enhanced the BuildStorageArgs function in pkg/controller/v1beta1/benchmark/utils to dynamically generate appropriate genai-bench CLI arguments, including provider-specific authentication parameters, based on the selected storage type.
Comprehensive Documentation Updates: Updated the site/content/en/docs/concepts/benchmark.md documentation with detailed examples, URI formats, and authentication options for all newly supported cloud storage providers, along with best practices for multi-cloud usage.
Improved Benchmark Job Status Logic: Adjusted the status reconciliation logic in pkg/controller/v1beta1/benchmark/controller.go to prioritize 'Failed' status over 'Completed' for BenchmarkJobs, ensuring accurate reporting when a job completes with errors.
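
The status change in the last highlight can be pictured with a small sketch. The types and names below are hypothetical stand-ins rather than the controller's real code, and the `"Running"` fallback is an assumption. The only point illustrated is that a failure signal wins when both signals are present.

```go
package main

import "fmt"

// jobCondition is a simplified, hypothetical stand-in for a job condition.
type jobCondition struct {
	Type   string // "Complete" or "Failed" in this sketch
	Active bool   // whether the condition currently holds
}

// benchmarkState reports "Failed" whenever a failure condition holds,
// even if a completion condition is also present.
func benchmarkState(conditions []jobCondition) string {
	failed, completed := false, false
	for _, c := range conditions {
		if !c.Active {
			continue
		}
		switch c.Type {
		case "Failed":
			failed = true
		case "Complete":
			completed = true
		}
	}
	switch {
	case failed: // checked first so it takes priority over completion
		return "Failed"
	case completed:
		return "Completed"
	default:
		return "Running" // assumed fallback for this sketch
	}
}

func main() {
	both := []jobCondition{
		{Type: "Complete", Active: true},
		{Type: "Failed", Active: true},
	}
	fmt.Println(benchmarkState(both))
}
```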

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces multi-cloud storage support for benchmark results, enhancing the capabilities of BenchmarkJob. The changes include updates to utility functions, tests, and documentation. I have provided suggestions to improve maintainability and documentation clarity.

gemini-code-assist · 2025-07-03T20:57:53Z

+		if storageSpec.Parameters != nil {
+			params := *storageSpec.Parameters
+			// Check if account name is provided in parameters
+			if accountName, ok := params["azure_account_name"]; ok {
+				args = append(args, "--storage-azure-account-name", accountName)
+			} else {
+				args = append(args, "--storage-azure-account-name", components.AccountName)
+			}
+			// Azure credentials
+			if accountKey, ok := params["azure_account_key"]; ok {
+				args = append(args, "--storage-azure-account-key", accountKey)
+			}
+			if connString, ok := params["azure_connection_string"]; ok {
+				args = append(args, "--storage-azure-connection-string", connString)
+			}
+			if sasToken, ok := params["azure_sas_token"]; ok {
+				args = append(args, "--storage-azure-sas-token", sasToken)
+			}
+		} else {
+			// Even without parameters, we need to add the account name
+			args = append(args, "--storage-azure-account-name", components.AccountName)
+		}


The logic for handling the Azure account name is a bit redundant. It can be simplified to improve readability and reduce code duplication. You can determine the account name to use first (from parameters or the URI) and then handle the other credential parameters.

// Handle storage parameters for Azure. The account name is required and will be
// taken from parameters if present, otherwise it's taken from the URI.
accountName := components.AccountName
if storageSpec.Parameters != nil {
	params := *storageSpec.Parameters
	if name, ok := params["azure_account_name"]; ok {
		accountName = name
	}
	// Azure credentials
	if accountKey, ok := params["azure_account_key"]; ok {
		args = append(args, "--storage-azure-account-key", accountKey)
	}
	if connString, ok := params["azure_connection_string"]; ok {
		args = append(args, "--storage-azure-connection-string", connString)
	}
	if sasToken, ok := params["azure_sas_token"]; ok {
		args = append(args, "--storage-azure-sas-token", sasToken)
	}
}
args = append(args, "--storage-azure-account-name", accountName)
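
To try the suggested refactor in isolation, a self-contained variant might look like the sketch below. `azureComponents` and `buildAzureArgs` are hypothetical wrappers introduced only for this example. The parameter keys and CLI flags are the ones from the diff above. The credential handling is folded into a small lookup table, which is a stylistic variation and not part of the suggestion itself.

```go
package main

import "fmt"

// azureComponents is a hypothetical stand-in for the parsed URI components.
type azureComponents struct {
	AccountName string
}

// buildAzureArgs resolves the account name once (parameters override the
// URI), then appends any credentials that were supplied.
func buildAzureArgs(components azureComponents, parameters *map[string]string) []string {
	var args []string
	accountName := components.AccountName

	if parameters != nil {
		params := *parameters
		if name, ok := params["azure_account_name"]; ok {
			accountName = name
		}
		// Parameter key to CLI flag, in a fixed order for stable output.
		credentials := []struct{ key, flag string }{
			{"azure_account_key", "--storage-azure-account-key"},
			{"azure_connection_string", "--storage-azure-connection-string"},
			{"azure_sas_token", "--storage-azure-sas-token"},
		}
		for _, c := range credentials {
			if value, ok := params[c.key]; ok {
				args = append(args, c.flag, value)
			}
		}
	}
	// The account name is always emitted, with or without parameters.
	return append(args, "--storage-azure-account-name", accountName)
}

func main() {
	params := map[string]string{"azure_sas_token": "example-token"}
	fmt.Println(buildAzureArgs(azureComponents{AccountName: "myaccount"}, &params))
	fmt.Println(buildAzureArgs(azureComponents{AccountName: "myaccount"}, nil))
}
```

Note that, as in the suggestion, the account name flag now comes after the credential flags instead of before them, which only matters if something depends on argument order.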

gemini-code-assist · 2025-07-03T20:57:53Z

+  storageUri: "az://myaccount/mycontainer/path/to/results"
+  # Or: "az://myaccount.blob.core.windows.net/mycontainer/path/to/results"
+  parameters:
+    azure_account_name: "myaccount"  # Optional: Storage account name


The comment for azure_account_name states that it's "Optional", which could be misleading for users. The storage account name is a required piece of information for Azure Blob Storage. While it's optional to specify in the parameters block, that's only because it can be parsed from the storageUri. To avoid confusion, clarify that the account name must be provided either in the URI or in the parameters.

Suggested change

-    azure_account_name: "myaccount"  # Optional: Storage account name
+    azure_account_name: "myaccount"  # Required: Azure storage account name
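
The "either in the URI or in the parameters" rule from the comment above could be enforced along these lines. This is a sketch under assumptions: `azureLocation` and `resolveAzureLocation` are hypothetical names, both URI forms come from the documentation snippet in the diff, and ignoring an empty `azure_account_name` parameter is a choice made here, not something the PR states.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// azureLocation is a hypothetical result type for this sketch.
type azureLocation struct {
	AccountName string
	Container   string
	BlobPath    string
}

// resolveAzureLocation parses az://account/container/path (or the
// az://account.blob.core.windows.net/container/path form) and requires
// the account name to come from the URI or from the parameters.
func resolveAzureLocation(uri string, params map[string]string) (azureLocation, error) {
	const prefix = "az://"
	if !strings.HasPrefix(uri, prefix) {
		return azureLocation{}, fmt.Errorf("not an Azure storage URI: %q", uri)
	}
	parts := strings.SplitN(strings.TrimPrefix(uri, prefix), "/", 3)
	if len(parts) < 2 || parts[1] == "" {
		return azureLocation{}, fmt.Errorf("missing container in %q", uri)
	}

	loc := azureLocation{
		// Accept both the short and the fully qualified account form.
		AccountName: strings.TrimSuffix(parts[0], ".blob.core.windows.net"),
		Container:   parts[1],
	}
	if len(parts) == 3 {
		loc.BlobPath = parts[2]
	}
	// A non-empty parameter overrides whatever the URI carried.
	if name, ok := params["azure_account_name"]; ok && name != "" {
		loc.AccountName = name
	}
	if loc.AccountName == "" {
		return azureLocation{}, errors.New("azure account name must be set in the URI or in parameters")
	}
	return loc, nil
}

func main() {
	loc, err := resolveAzureLocation("az://myaccount.blob.core.windows.net/mycontainer/path/to/results", nil)
	fmt.Println(loc, err)
}
```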


slin1237 requested a review from CatherineSue as a code owner July 3, 2025 20:56

github-actions Bot added the feature label Jul 3, 2025

gemini-code-assist Bot reviewed Jul 3, 2025

slin1237 merged commit dd04792 into main Jul 3, 2025
23 checks passed

zhyncs deleted the slin/benchmark-update branch July 4, 2025 02:45
