Skip to content

Conversation

@dragonls
Copy link
Contributor

@dragonls dragonls commented Apr 11, 2023

Fixes #20066

Motivation

Fixes the bug that producers/consumers will all disconnect while using tenant admin to produce/consume.

The root cause is that the permission check logic is not the same bewteen org.apache.pulsar.broker.service.ServerCnx and org.apache.pulsar.broker.service.persistent.PersistentTopic#onPoliciesUpdate.

In org.apache.pulsar.broker.service.ServerCnx, while handleProducer and handleSubscribe, the permission check will go to isTopicOperationAllowed and finally be processed by org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync, which will validateTenantAdminAccess. Everything is fine.

In org.apache.pulsar.broker.service.persistent.PersistentTopic#onPoliciesUpdate, it checks the permission by producer.checkPermissionsAsync and consumer.checkPermissionsAsync. Let's take the processing logic of the producer as an example. It will be processed by org.apache.pulsar.broker.authorization.AuthorizationService#canProduceAsync and finally org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#canProduceAsync. The tenant admin can not pass the check.

Modifications

Update org.apache.pulsar.broker.authorization.AuthorizationProvider#canProduceAsync to org.apache.pulsar.broker.authorization.AuthorizationProvider#allowTopicOperationAsync in org.apache.pulsar.broker.authorization.AuthorizationService#canProduceAsync. So do org.apache.pulsar.broker.authorization.AuthorizationService#canConsumeAsync and org.apache.pulsar.broker.authorization.AuthorizationService#canLookupAsync

Verifying this change

  • Make sure that the change passes the CI checks.

This change added tests and can be verified as follows:

  • Added and fixed the test in org.apache.pulsar.broker.auth.AuthorizationTest#simple
  • Added and fixed the test in org.apache.pulsar.websocket.proxy.ProxyAuthorizationTest

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: dragonls#7

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 11, 2023
@dragonls dragonls force-pushed the master-fix-tenant-admin-auth branch from 093e668 to 78d265f Compare April 11, 2023 12:48
@dragonls
Copy link
Contributor Author

dragonls commented Apr 12, 2023

In fact, there are many places in the current code that check the permission of super user or tenant admin:

  1. org.apache.pulsar.broker.authorization.AuthorizationProvider#isSuperUser in org.apache.pulsar.broker.authorization.AuthorizationService, including functions canProduceAsync, canConsumeAsync, canLookupAsync. No tenant admin check here.
  2. org.apache.pulsar.broker.authorization.AuthorizationProvider#isSuperUserOrAdmin in org.apache.pulsar.broker.authorization.AuthorizationService, including function allowSinkOpsAsync, allowSourceOpsAsync, allowFunctionOpsAsync.
  3. After this PR, org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#validateTenantAdminAccess in org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider, including functions canProduceAsync, canConsumeAsync. No tenant admin check in allowTheSpecifiedActionOpsAsync.

Maybe we need to uniformly add/update permission check wherever we need. And also remove some reduplicated check, for example:

  1. validateTenantAdminAccess in org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync
  2. isSuperUser in org.apache.pulsar.broker.authorization.AuthorizationService, including functions canProduceAsync, canConsumeAsync, canLookupAsync.

I am just afaid that there are still some authorization bug not found.

What do you think? @Technoboy- @nodece

@nodece
Copy link
Member

nodece commented Apr 12, 2023

I know these cases, I want to deprecate the canProduceAsync, canConsumeAsync, and canLookupAsync, use the allowTopicOperationAsync instead.

@dragonls
Copy link
Contributor Author

dragonls commented Apr 12, 2023

I know these cases, I want to deprecate the canProduceAsync, canConsumeAsync, and canLookupAsync, use the allowTopicOperationAsync instead.

Sounds good. I can do this after this PR merged.

@Technoboy-
Copy link
Contributor

I know these cases, I want to deprecate the canProduceAsync, canConsumeAsync, and canLookupAsync, use the allowTopicOperationAsync instead.

Sounds good. I can do this after this PR merged.

Yes, this will keep the logic the same. But we'd better discuss it in the dev mail list first.

@dragonls
Copy link
Contributor Author

Make sense. I also find some function in org.apache.pulsar.broker.authorization.AuthorizationService marked deprecated.
image

Maybe we can do the same thing.

@nodece
Copy link
Member

nodece commented Apr 12, 2023

Thank @dragonls for your contribution, I made #19461 to that before.

@dragonls
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter
Copy link

codecov-commenter commented Apr 12, 2023

Codecov Report

Merging #20068 (44bf9fe) into master (dc1cdac) will increase coverage by 48.70%.
The diff coverage is 100.00%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #20068       +/-   ##
=============================================
+ Coverage     24.26%   72.96%   +48.69%     
- Complexity      294    31823    +31529     
=============================================
  Files          1609     1868      +259     
  Lines        125669   138332    +12663     
  Branches      13707    15220     +1513     
=============================================
+ Hits          30490   100930    +70440     
+ Misses        90689    29390    -61299     
- Partials       4490     8012     +3522     
Flag Coverage Δ
inttests 24.13% <0.00%> (-0.13%) ⬇️
systests 24.78% <0.00%> (?)
unittests 72.25% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...sar/broker/authorization/AuthorizationService.java 59.75% <100.00%> (+54.55%) ⬆️

... and 1590 files with indirect coverage changes

Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to change the canProduceAsync and canConsumeAsync methods. Instead, we can change the calling code. Let me know what you think.

Copy link
Member

@nodece nodece left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect the test to fail. You also need to modify the org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync to avoid a dead loop.

@dragonls, Sorry for providing an incorrect review.

@dragonls
Copy link
Contributor Author

I expect the test to fail. You also need to modify the org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync to avoid a dead loop.

I didn't find dead loop. Can you be more specific?

@nodece
Copy link
Member

nodece commented Apr 17, 2023

I expect the test to fail. You also need to modify the org.apache.pulsar.broker.authorization.PulsarAuthorizationProvider#allowTopicOperationAsync to avoid a dead loop.

I didn't find dead loop. Can you be more specific?

Sorry that I provide an incorrect review.

Copy link
Member

@nodece nodece left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nodece nodece requested a review from Technoboy- April 18, 2023 09:36
@nodece
Copy link
Member

nodece commented Apr 18, 2023

/pulsarbot rerun-failure-checks

@nodece
Copy link
Member

nodece commented Apr 19, 2023

Thank @dragonls for your contribution!

@nodece nodece merged commit fc17c1d into apache:master Apr 19, 2023
@nodece nodece added this to the 3.1.0 milestone Apr 19, 2023
michaeljmarshall added a commit to michaeljmarshall/pulsar that referenced this pull request Apr 19, 2023
@michaeljmarshall
Copy link
Member

@nodece @dragonls - I don't think we should have taken this direction with the PR. I propose an alternative here: #20142.

michaeljmarshall added a commit to michaeljmarshall/pulsar that referenced this pull request Apr 19, 2023
michaeljmarshall added a commit to michaeljmarshall/pulsar that referenced this pull request Apr 19, 2023
michaeljmarshall added a commit that referenced this pull request Apr 20, 2023
…20142)

Fixes: #20066

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, this change helps us move in the right direction.

### Modifications

* Update `Producer` and `Consumer` in broker to call the `AuthorizationService#allowTopicOperationAsync` method.

### Verifying this change

This change is trivial.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.
michaeljmarshall added a commit that referenced this pull request Apr 20, 2023
…#20143)

This reverts commit fc17c1d.

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think this approach could have unintended consequences. Instead, I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method. I propose an update to the `Consumer` and `Producer` logic here #20142.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, I think we should not change their implementations.

### Modifications

* Revert #20068

### Verifying this change

This change is trivial. It removes certain test changes that were only made to make the previous PR work.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.
@dragonls
Copy link
Contributor Author

dragonls commented Apr 20, 2023

@nodece @dragonls - I don't think we should have taken this direction with the PR. I propose an alternative here: #20142.

OK. I looked into all relative discussion, the right direction seems to be:

  1. All superUser/admin check should be moved into AuthorizationService, which keep AuthorizationProvider clean and more focused on the inner check.
  2. Outer callers, such as Producer/Consumer, shoud call AuthorizationService.allowTopicOperationAsync instead of canProduceAsync/canConsumerAsync/canLookupAsync. The canProduceAsync/canConsumerAsync/canLookupAsync in AuthorizationProvider might be deprecated in the future.

@dragonls dragonls deleted the master-fix-tenant-admin-auth branch April 20, 2023 08:56
@poorbarcode
Copy link
Contributor

poorbarcode commented May 7, 2023

poorbarcode pushed a commit that referenced this pull request May 7, 2023
Co-authored-by: druidliu <druidliu@tencent.com>
(cherry picked from commit fc17c1d)
poorbarcode pushed a commit that referenced this pull request May 7, 2023
…20142)

Fixes: #20066

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, this change helps us move in the right direction.

### Modifications

* Update `Producer` and `Consumer` in broker to call the `AuthorizationService#allowTopicOperationAsync` method.

### Verifying this change

This change is trivial.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.

(cherry picked from commit dc5e497)
poorbarcode pushed a commit that referenced this pull request May 7, 2023
…#20143)

This reverts commit fc17c1d.

### Motivation

In #20068 we changed the way that the `AuthorizationService` is implemented. I think this approach could have unintended consequences. Instead, I think we should change the `Consumer` and the `Producer` logic to call the correct `AuthorizationService` method. I propose an update to the `Consumer` and `Producer` logic here #20142.

Given that our goal is to deprecate the `AuthorizationService` methods for `canProduce` and `canConsume`, I think we should not change their implementations.

### Modifications

* Revert #20068

### Verifying this change

This change is trivial. It removes certain test changes that were only made to make the previous PR work.

### Documentation

- [x] `doc-not-needed`

### Matching PR in forked repository

PR in forked repository: Skipping PR as I ran tests locally.

(cherry picked from commit 00dc7a0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] [broker] Producer and consumer will all be disconnected while using tenant admin to produce/consume

7 participants