Implement authorization framework with caching and metrics support #9

peterxcli · 2025-10-08T08:02:45Z

What changes were proposed in this pull request?

Provide a one-liner summary of the changes in the PR Title field above.
It should be in the form of HDDS-1234. Short summary of the change.

Please describe your PR in detail:

What changes are proposed in the PR? and Why? It would be better if it is written from third person's
perspective not just for the reviewer.
Provide as much context and rationale for the pull request as possible. It could be copy-paste from
the Jira's description if the jira is well defined.
If it is complex code, describe the approach used to solve the issue. If possible attach design doc,
issue investigation, github discussion, etc.

Examples of well-written pull requests:

What is the link to the Apache JIRA

Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)

(Please replace this section with the link to the Apache JIRA)

How was this patch tested?

(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)

cursor · 2025-10-08T08:04:15Z

...ozone-manager/src/main/java/org/apache/hadoop/ozone/security/acl/OzoneAuthorizerFactory.java

+            authorizer.getClass().getName(), e);
+      }
+    }
+


Bug: Plugin Lifecycle Issues Cause Resource Leaks

The new plugin lifecycle management calls IAccessAuthorizerPlugin.start() but lacks a corresponding stop() during shutdown, leading to resource leaks. The start() call also happens prematurely, before OzoneManagerAuthorizer.configure() or SharedTmpDirAuthorizer wrapping. This can result in an uninitialized authorizer if configure() returns a new instance, or a started plugin that is wrapped and not properly stopped.

cursor · 2025-10-08T08:04:16Z

...op-ozone/common/src/main/java/org/apache/hadoop/ozone/security/acl/BaseAuthorizerPlugin.java

+    if (cacheEnabled && decisionCache != null) {
+      decisionCache.asMap().keySet().removeIf(
+          key -> key.getResourcePath().startsWith(resourcePath));
+    }


Bug: Cache TTL Ignored, Over-Invalidation Risk

The put method in the authorization decision cache doesn't honor the ttlMillis parameter and always applies the default cache TTL, so callers cannot set a per-entry expiration time. Additionally, invalidateResource uses startsWith for path matching, which can over-invalidate cache entries: invalidating /vol1/bucket1, for example, would also remove entries for /vol1/bucket10.
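
One way to avoid this kind of over-invalidation is to match on path-component boundaries rather than on raw string prefixes. The following is a minimal sketch, assuming resource paths are '/'-separated strings; `ResourcePaths` and `isSameOrDescendant` are hypothetical names and are not part of the PR.

```java
/** Hypothetical helper, not part of the PR. Assumes '/'-separated resource paths. */
final class ResourcePaths {
  private ResourcePaths() { }

  /**
   * Returns true if candidate equals root or lies underneath it.
   * "/vol1/bucket1" covers "/vol1/bucket1/key" but not "/vol1/bucket10".
   */
  static boolean isSameOrDescendant(String candidate, String root) {
    if (candidate.equals(root)) {
      return true;
    }
    // Make sure the prefix ends with a separator so that sibling names
    // sharing the same leading characters do not match.
    String prefix = root.endsWith("/") ? root : root + "/";
    return candidate.startsWith(prefix);
  }
}
```

With a helper like this, the predicate passed to removeIf could become `key -> ResourcePaths.isSameOrDescendant(key.getResourcePath(), resourcePath)`, which keeps hierarchical invalidation while leaving sibling resources alone.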

gemini-code-assist · 2025-10-08T08:04:34Z

Summary of Changes

Hello @peterxcli, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new, extensible authorization framework for Ozone. It includes interfaces and base classes for pluggable authorizers, a dedicated caching mechanism for authorization decisions, and a comprehensive metrics system to monitor authorization performance and activity. The framework is designed to support modern IAM features and external authorization systems, providing lifecycle management, health checks, and configurable caching with detailed statistics.

Highlights

Intent: This pull request introduces a comprehensive authorization framework for Ozone, designed to support pluggable external authorization systems. The primary goal is to enhance performance and observability by integrating caching mechanisms and detailed metrics collection into the authorization process.

Key Changes: The PR adds new interfaces and base classes for authorization plugins, a dedicated authorization decision cache, and a metrics system. It also modifies the authorizer factory to properly initialize these new plugin types.

New Components:
- "AuthorizationDecisionCache.java": Defines an interface for caching authorization decisions, including methods for getting, putting, invalidating by resource/subject/tenant, clearing, and retrieving statistics. It also includes nested classes for CacheKey (covering subject, resource, action, tenant, session token) and CacheStats.
- "AuthorizerMetrics.java": Introduces a new metrics source for tracking authorization operations. It collects metrics such as total checks, allowed/denied decisions, errors, latency, cache hits/misses, and per-action/per-resource check counts.
- "IAccessAuthorizerPlugin.java": An extended interface for pluggable authorization implementations, building upon IAccessAuthorizer. It adds lifecycle management (start, stop), health check capabilities (isHealthy, getStatus), metrics retrieval (getMetrics), and an optional policy refresh mechanism.
- "BaseAuthorizerPlugin.java": An abstract base class that implements IAccessAuthorizerPlugin and AuthorizationDecisionCache. It provides common functionality like configurable decision caching (using Guava Cache), metrics collection, and basic lifecycle management. Subclasses only need to implement the core doStart, doCheckAccess, and doStop logic.
- "OzoneAuthorizationContext.java": A new, extended authorization context class that wraps the existing RequestContext. It adds support for modern IAM features by including fields for tenantId, sessionToken, and a flexible map of custom attributes (e.g., OIDC claims).

Modified Components:
- "OzoneAuthorizerFactory.java": Modified to detect and properly initialize IAccessAuthorizerPlugin implementations by calling their start() method during authorizer creation, ensuring that plugins with lifecycle requirements are correctly set up.

Benefits: This framework provides a standardized way to integrate advanced authorization logic, improves performance by caching authorization decisions, and offers better visibility into authorization activity through comprehensive metrics. It also simplifies the development of custom authorizers by providing a robust base class with built-in features.
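
The division of labour described for BaseAuthorizerPlugin (the base class owns lifecycle and caching, subclasses supply doStart, doCheckAccess, and doStop) can be illustrated roughly as follows. This is a simplified sketch, not the PR's code: `SketchBasePlugin`, its string-based signatures, and the plain map standing in for the Guava cache are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified stand-in for BaseAuthorizerPlugin. */
abstract class SketchBasePlugin {
  // A plain map stands in for the real decision cache (no TTL or size bound here).
  private final Map<String, Boolean> cache = new ConcurrentHashMap<>();
  private final AtomicLong backendCalls = new AtomicLong();
  private volatile boolean started;

  final void start() {
    doStart();
    started = true;
  }

  final void stop() {
    started = false;
    cache.clear();
    doStop();
  }

  /** Serves repeated checks from the cache and delegates misses to the subclass. */
  final boolean checkAccess(String subject, String resource, String action) {
    if (!started) {
      throw new IllegalStateException("plugin not started");
    }
    String key = subject + "|" + resource + "|" + action;
    return cache.computeIfAbsent(key, k -> {
      backendCalls.incrementAndGet();
      return doCheckAccess(subject, resource, action);
    });
  }

  /** Number of times the subclass was actually consulted (cache misses). */
  final long backendCalls() {
    return backendCalls.get();
  }

  protected abstract void doStart();

  protected abstract boolean doCheckAccess(String subject, String resource, String action);

  protected abstract void doStop();
}
```

A subclass then only needs to provide the three `do*` methods; a second identical call to `checkAccess` is answered from the cache without reaching `doCheckAccess`.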


gemini-code-assist

Code Review

This pull request introduces a solid framework for pluggable authorization, including caching and metrics. The new interfaces and base classes are well-designed. I've identified a few areas for improvement, mainly concerning correctness in metrics collection, cache implementation details, and a potential resource leak. Please see my detailed comments below.

gemini-code-assist · 2025-10-08T08:07:13Z

...ozone-manager/src/main/java/org/apache/hadoop/ozone/security/acl/OzoneAuthorizerFactory.java

+    if (authorizer instanceof IAccessAuthorizerPlugin) {
+      try {
+        ((IAccessAuthorizerPlugin) authorizer).start(conf);
+        LOG.info("Started authorizer plugin: {}", authorizer.getClass().getName());
+      } catch (Exception e) {
+        LOG.error("Failed to start authorizer plugin", e);
+        throw new RuntimeException("Failed to start authorizer plugin: " +
+            authorizer.getClass().getName(), e);
+      }
+    }


This logic correctly handles the startup of IAccessAuthorizerPlugin instances by calling start(). However, there is no corresponding call to stop() (or close()) for these plugins when the OzoneManager shuts down.

The IAccessAuthorizerPlugin interface extends Closeable, which implies that implementations may hold resources (like network connections or thread pools) that must be released. Failing to call stop() will lead to resource leaks.

Please ensure that the OzoneManager's shutdown sequence is updated to call stop() on the authorizer if it is an instance of IAccessAuthorizerPlugin.
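
A minimal sketch of the requested shutdown handling is shown below. The real IAccessAuthorizer, IAccessAuthorizerPlugin, and OzoneManager types are not reproduced in this thread, so `SketchAuthorizer`, `SketchAuthorizerPlugin`, and `SketchManager` are hypothetical stand-ins, and the no-argument `start()` is a simplification of the `start(conf)` call in the diff.

```java
import java.io.Closeable;
import java.io.IOException;

/** Stand-in for IAccessAuthorizer; the real interface is not shown here. */
interface SketchAuthorizer { }

/** Stand-in for IAccessAuthorizerPlugin: adds a lifecycle and is Closeable. */
interface SketchAuthorizerPlugin extends SketchAuthorizer, Closeable {
  void start() throws IOException;

  void stop() throws IOException;

  @Override
  default void close() throws IOException {
    stop();
  }
}

/** Hypothetical owner of the authorizer, playing the role of OzoneManager. */
final class SketchManager {
  private final SketchAuthorizer authorizer;

  SketchManager(SketchAuthorizer authorizer) {
    this.authorizer = authorizer;
  }

  /** Called from the shutdown path; stops the plugin if it has a lifecycle. */
  void stop() {
    if (authorizer instanceof SketchAuthorizerPlugin) {
      try {
        ((SketchAuthorizerPlugin) authorizer).stop();
      } catch (IOException e) {
        // Shutdown should continue even if the plugin fails to stop cleanly.
        System.err.println("Failed to stop authorizer plugin: " + e);
      }
    }
  }
}
```

Authorizers that do not implement the plugin interface are left untouched, mirroring the instanceof check used for start() in the factory.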

gemini-code-assist · 2025-10-08T08:07:14Z

...op-ozone/common/src/main/java/org/apache/hadoop/ozone/security/acl/BaseAuthorizerPlugin.java

+  @Override
+  public void put(CacheKey key, boolean decision, long ttlMillis) {
+    if (cacheEnabled && decisionCache != null) {
+      decisionCache.put(key, decision);


The put method implementation ignores the ttlMillis parameter. The cache is configured with a single expireAfterWrite duration for all entries. This contradicts the design of the AuthorizationDecisionCache interface, which is intended to support per-entry TTLs. The interface's Javadoc even suggests using different TTLs for 'allow' and 'deny' decisions, which is a common and useful pattern.

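Guava's CacheBuilder only supports cache-wide expiration (expireAfterWrite or expireAfterAccess), so honoring a per-entry TTL would require either storing an expiry timestamp alongside each cached decision and checking it on read, or switching to a library such as Caffeine, whose builder provides expireAfter(Expiry). Since this implementation doesn't honor the interface contract, it could lead to unexpected caching behavior.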
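
One way to honor a per-entry TTL is to store the expiry time with each cached decision and check it on read. The sketch below illustrates that idea under simplifying assumptions: `TtlDecisionCache` is a hypothetical name, a ConcurrentHashMap stands in for the Guava cache, and the clock is injected so that expiry can be exercised deterministically.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.LongSupplier;

/** Hypothetical per-entry-TTL cache; a plain map stands in for the Guava cache. */
final class TtlDecisionCache<K> {

  /** A decision plus the absolute time (in ms) at which it stops being valid. */
  private static final class Entry {
    private final boolean decision;
    private final long expiresAtMillis;

    Entry(boolean decision, long expiresAtMillis) {
      this.decision = decision;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentMap<K, Entry> entries = new ConcurrentHashMap<>();
  private final LongSupplier clockMillis;

  TtlDecisionCache(LongSupplier clockMillis) {
    this.clockMillis = clockMillis;
  }

  /** Stores the decision with its own TTL instead of a cache-wide default. */
  void put(K key, boolean decision, long ttlMillis) {
    entries.put(key, new Entry(decision, clockMillis.getAsLong() + ttlMillis));
  }

  /** Returns the cached decision, or empty if it is absent or has expired. */
  Optional<Boolean> get(K key) {
    Entry e = entries.get(key);
    if (e == null) {
      return Optional.empty();
    }
    if (clockMillis.getAsLong() >= e.expiresAtMillis) {
      // Remove only if the mapping still points at this expired entry.
      entries.remove(key, e);
      return Optional.empty();
    }
    return Optional.of(e.decision);
  }
}
```

This allows, for instance, a longer TTL for 'allow' decisions than for 'deny' decisions. In production code the clock would typically be `System::currentTimeMillis`, and a size bound would still be needed.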

gemini-code-assist · 2025-10-08T08:07:14Z

...op-ozone/common/src/main/java/org/apache/hadoop/ozone/security/acl/BaseAuthorizerPlugin.java

+  private final AtomicLong cacheHits = new AtomicLong(0);
+  private final AtomicLong cacheMisses = new AtomicLong(0);
+  private final AtomicLong cacheEvictions = new AtomicLong(0);


These AtomicLong counters for cache statistics are redundant. The code is already using AuthorizerMetrics for the Hadoop metrics system, and the Guava cache is configured with .recordStats(), which provides its own, more efficient and reliable statistics via decisionCache.stats().

Using three different ways to track cache stats increases complexity and can lead to inconsistencies.

I recommend removing these AtomicLong fields and refactoring getMetrics() and getStats() to use decisionCache.stats() for hit, miss, and eviction counts. This would also require removing the manual increments in checkAccess().

gemini-code-assist · 2025-10-08T08:07:14Z

...op-ozone/common/src/main/java/org/apache/hadoop/ozone/security/acl/BaseAuthorizerPlugin.java

+          .maximumSize(cacheSize)
+          .expireAfterWrite(cacheTtlMs, TimeUnit.MILLISECONDS)
+          .recordStats()
+          .removalListener(notification -> cacheEvictions.incrementAndGet())


The current implementation of the removalListener incorrectly counts all removals as evictions. The cacheEvictions counter is incremented for any removal, including explicit invalidations (invalidate()/invalidateAll()), not just for evictions due to size, expiration, or garbage collection.

To accurately count only evictions, you should check notification.wasEvicted().

.removalListener(notification -> {
  if (notification.wasEvicted()) {
    cacheEvictions.incrementAndGet();
  }
})

gemini-code-assist · 2025-10-08T08:07:14Z

...op-ozone/common/src/main/java/org/apache/hadoop/ozone/security/acl/BaseAuthorizerPlugin.java

+      decisionCache.asMap().keySet().removeIf(
+          key -> key.getResourcePath().startsWith(resourcePath));


The cache invalidation logic in invalidateResource, invalidateSubject, and invalidateTenant involves iterating over the entire key set of the cache (decisionCache.asMap().keySet().removeIf(...)). This is an O(N) operation, where N is the cache size. For a large cache, this can be slow and may impact performance by blocking other operations, especially if policy or user updates are frequent.

For a more scalable solution, you might consider maintaining secondary indexes to allow for faster lookups of keys to invalidate.
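
The secondary-index idea can be sketched as follows, here for invalidation by subject only. `IndexedDecisionCache` and its string keys are hypothetical and chosen for brevity; a real implementation would also have to keep the index in sync when the underlying cache evicts entries (for example from a removal listener), and the two maps are not updated atomically.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache with a subject-to-keys index; names are not from the PR. */
final class IndexedDecisionCache {
  private final Map<String, Boolean> decisions = new ConcurrentHashMap<>();
  private final Map<String, Set<String>> keysBySubject = new ConcurrentHashMap<>();

  void put(String subject, String cacheKey, boolean decision) {
    decisions.put(cacheKey, decision);
    // Record which cache keys belong to this subject.
    keysBySubject
        .computeIfAbsent(subject, s -> ConcurrentHashMap.newKeySet())
        .add(cacheKey);
  }

  /** Returns the cached decision, or null if there is none. */
  Boolean get(String cacheKey) {
    return decisions.get(cacheKey);
  }

  /**
   * Removes only the given subject's entries and returns how many were removed.
   * The cost depends on that subject's entry count, not on the total cache size.
   */
  int invalidateSubject(String subject) {
    Set<String> keys = keysBySubject.remove(subject);
    if (keys == null) {
      return 0;
    }
    int removed = 0;
    for (String key : keys) {
      if (decisions.remove(key) != null) {
        removed++;
      }
    }
    return removed;
  }

  int size() {
    return decisions.size();
  }
}
```

The trade-off is extra memory and bookkeeping on every put in exchange for avoiding a full key-set scan on each invalidation.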

…fecycle of access authorizer plugins. Added closing logic for Closeable authorizers in OzoneManager and improved configuration handling in OzoneAuthorizerFactory.

…ry TTL for authorization decisions. Introduced CachedDecision class to manage expiration and updated cache retrieval logic to invalidate expired entries. Enhanced resource invalidation to handle hierarchical matches.

github-actions · 2025-11-14T01:09:52Z

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

Implement authorization framework with caching and metrics support

7caf81a

peterxcli marked this pull request as draft October 8, 2025 08:02

cursor bot reviewed Oct 8, 2025

gemini-code-assist bot reviewed Oct 8, 2025

peterxcli added 3 commits October 8, 2025 16:21

Enhance OzoneManager and OzoneAuthorizerFactory to properly manage li…

7455406

…fecycle of access authorizer plugins. Added closing logic for Closeable authorizers in OzoneManager and improved configuration handling in OzoneAuthorizerFactory.

commit all left over

404dd6d

github-actions bot added the stale label Nov 14, 2025

peterxcli closed this Nov 17, 2025
