-
Notifications
You must be signed in to change notification settings - Fork 0
Implement authorization framework with caching and metrics support #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
| authorizer.getClass().getName(), e); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Plugin Lifecycle Issues Cause Resource Leaks
The new plugin lifecycle management calls IAccessAuthorizerPlugin.start() but lacks a corresponding stop() during shutdown, leading to resource leaks. The start() call also happens prematurely, before OzoneManagerAuthorizer.configure() or SharedTmpDirAuthorizer wrapping. This can result in an uninitialized authorizer if configure() returns a new instance, or a started plugin that is wrapped and not properly stopped.
| if (cacheEnabled && decisionCache != null) { | ||
| decisionCache.asMap().keySet().removeIf( | ||
| key -> key.getResourcePath().startsWith(resourcePath)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Cache TTL Ignored, Over-Invalidation Risk
The put method in the authorization decision cache doesn't honor the ttlMillis parameter, always applying the default cache TTL, which limits setting specific expiration times. Additionally, invalidateResource uses startsWith for path matching, potentially causing unintended over-invalidation of cache entries.
Summary of ChangesHello @peterxcli, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new, extensible authorization framework for Ozone. It includes interfaces and base classes for pluggable authorizers, a dedicated caching mechanism for authorization decisions, and a comprehensive metrics system to monitor authorization performance and activity. The framework is designed to support modern IAM features and external authorization systems, providing lifecycle management, health checks, and configurable caching with detailed statistics. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a solid framework for pluggable authorization, including caching and metrics. The new interfaces and base classes are well-designed. I've identified a few areas for improvement, mainly concerning correctness in metrics collection, cache implementation details, and a potential resource leak. Please see my detailed comments below.
| if (authorizer instanceof IAccessAuthorizerPlugin) { | ||
| try { | ||
| ((IAccessAuthorizerPlugin) authorizer).start(conf); | ||
| LOG.info("Started authorizer plugin: {}", authorizer.getClass().getName()); | ||
| } catch (Exception e) { | ||
| LOG.error("Failed to start authorizer plugin", e); | ||
| throw new RuntimeException("Failed to start authorizer plugin: " + | ||
| authorizer.getClass().getName(), e); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic correctly handles the startup of IAccessAuthorizerPlugin instances by calling start(). However, there is no corresponding call to stop() (or close()) for these plugins when the OzoneManager shuts down.
The IAccessAuthorizerPlugin interface extends Closeable, which implies that implementations may hold resources (like network connections or thread pools) that must be released. Failing to call stop() will lead to resource leaks.
Please ensure that the OzoneManager's shutdown sequence is updated to call stop() on the authorizer if it is an instance of IAccessAuthorizerPlugin.
| @Override | ||
| public void put(CacheKey key, boolean decision, long ttlMillis) { | ||
| if (cacheEnabled && decisionCache != null) { | ||
| decisionCache.put(key, decision); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The put method implementation ignores the ttlMillis parameter. The cache is configured with a single expireAfterWrite duration for all entries. This contradicts the design of the AuthorizationDecisionCache interface, which is intended to support per-entry TTLs. The interface's Javadoc even suggests using different TTLs for 'allow' and 'deny' decisions, which is a common and useful pattern.
To properly implement per-entry TTL with Guava Cache, you would need to use CacheBuilder.expireAfter(Expiry). Since this implementation doesn't honor the interface contract, it could lead to unexpected caching behavior.
| private final AtomicLong cacheHits = new AtomicLong(0); | ||
| private final AtomicLong cacheMisses = new AtomicLong(0); | ||
| private final AtomicLong cacheEvictions = new AtomicLong(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These AtomicLong counters for cache statistics are redundant. The code is already using AuthorizerMetrics for the Hadoop metrics system, and the Guava cache is configured with .recordStats(), which provides its own, more efficient and reliable statistics via decisionCache.stats().
Using three different ways to track cache stats increases complexity and can lead to inconsistencies.
I recommend removing these AtomicLong fields and refactoring getMetrics() and getStats() to use decisionCache.stats() for hit, miss, and eviction counts. This would also require removing the manual increments in checkAccess().
| .maximumSize(cacheSize) | ||
| .expireAfterWrite(cacheTtlMs, TimeUnit.MILLISECONDS) | ||
| .recordStats() | ||
| .removalListener(notification -> cacheEvictions.incrementAndGet()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of the removalListener incorrectly counts all removals as evictions. The cacheEvictions counter is incremented for any removal, including explicit invalidations (invalidate()/invalidateAll()), not just for evictions due to size, expiration, or garbage collection.
To accurately count only evictions, you should check notification.wasEvicted().
.removalListener(notification -> {
if (notification.wasEvicted()) {
cacheEvictions.incrementAndGet();
}
})| decisionCache.asMap().keySet().removeIf( | ||
| key -> key.getResourcePath().startsWith(resourcePath)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache invalidation logic in invalidateResource, invalidateSubject, and invalidateTenant involves iterating over the entire key set of the cache (decisionCache.asMap().keySet().removeIf(...)). This is an O(N) operation, where N is the cache size. For a large cache, this can be slow and may impact performance by blocking other operations, especially if policy or user updates are frequent.
For a more scalable solution, you might consider maintaining secondary indexes to allow for faster lookups of keys to invalidate.
…fecycle of access authorizer plugins. Added closing logic for Closeable authorizers in OzoneManager and improved configuration handling in OzoneAuthorizerFactory.
…ry TTL for authorization decisions. Introduced CachedDecision class to manage expiration and updated cache retrieval logic to invalidate expired entries. Enhanced resource invalidation to handle hierarchical matches.
|
This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days. |
What changes were proposed in this pull request?
Provide a one-liner summary of the changes in the PR Title field above.
It should be in the form of
HDDS-1234. Short summary of the change.Please describe your PR in detail:
perspective not just for the reviewer.
the Jira's description if the jira is well defined.
issue investigation, github discussion, etc.
Examples of well-written pull requests:
What is the link to the Apache JIRA
Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull
request which starts with the corresponding JIRA issue number. (e.g. HDDS-XXXX. Fix a typo in YYY.)
(Please replace this section with the link to the Apache JIRA)
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)