Skip to content

Conversation

@mgodave
Copy link
Contributor

@mgodave mgodave commented Mar 7, 2018

Thread.sleep is rarely the correct answer as it can lead to race conditions and incorrect results. In this case sleeping the tread will tie up an entire thread in the REST endpoint. The operations that are being slept are themselves async and can easily be represented as Futures (CompletableFutures). Furthermore the REST framework we are using supports async results. This has the benefit of still allowing timeouts on the endpoint, allowing an operation to fail or complete, and not tie up an entire serving thread while an already async operation completes. There is still one remaining Thread.sleep that I would like to squash but I wanted to put this work up for consideration.

@merlimat
Copy link
Contributor

merlimat commented Mar 7, 2018

@mgodave This change doesn't solve the problem that the ugly Thread.sleep() was introduced as a workaround for.

The problem is not the sync vs async ZK call (though sure, the async with response continuation is more efficient).

The real problem is that in ZK there is no read-your-write consistency by default. That is also aggravated by the fact the we have a cache for configuration/metadata that is updated with ZK watches.

The particular instance of the problem typically manifest itself only in a test environment where there are multiple brokers and multiple ZK servers in the ensemble.

It goes like this:

  • Create a partitioned topics --> create a z-node with the partitions info
  • Try to use the partitioned topic immediately after
    1. The request to check the partitioned topic metadata can go to a different broker
    2. That broker is connected to a different ZK server
    3. That ZK server is few steps behind the ZK quorum
    4. The partitioned topic metadata is not there in ZK

The "good" solution would be to always do a ZK sync() before reading the partitioned topic metadata. That would ensure that we can actually read our previous write.

The problem with that is that we have to do a ZK write (implied by sync()) and that we can never cache the info in broker.

@mgodave
Copy link
Contributor Author

mgodave commented Mar 7, 2018

The point is, sleeping a thread in the REST endpoint is a resource hog and generally not a great idea.

@mgodave
Copy link
Contributor Author

mgodave commented Mar 7, 2018

I think I see what you are saying. Either way, scheduling a return after the sync timeout would be a better option.

@merlimat
Copy link
Contributor

merlimat commented Mar 7, 2018

The point is, sleeping a thread in the REST endpoint is a resource hog and generally not a great idea.

I'm not debating that :)

I think I see what you are saying. Either way, scheduling a return after the sync timeout would be a better option.

Definitely. It's still hacky but at least we don't keep the thread hanging.

@mgodave
Copy link
Contributor Author

mgodave commented Mar 7, 2018 via email

@mgodave mgodave closed this Mar 7, 2018
@ivankelly
Copy link
Contributor

@merlimat @mgodave

The real problem is that in ZK there is no read-your-write consistency by default.

There is read-your-writes within a session. However, which zookeeper client cache thing seems to be able to handle some errors by creating a new client. And it's caching data, only updating with watchers. There's no guarantee that when your create succeeds the watchers will have run. Quite the opposite in fact, as the server will respond with success for the write first and then trigger the watchers.

I think the sleep is trying to solve a different problem though. It's not to ensure that the local client has seen update, but to try to ensure that all brokers have seen it, which is perfectly natural in distsys.

@mgodave mgodave deleted the v2-thread-sleep branch March 14, 2018 20:16
sijie added a commit to sijie/pulsar that referenced this pull request Aug 8, 2020
*Motivation*

We should use the original role to verify if it is allowed for a given topic operation

(cherry picked from commit 6734ab6)
sijie added a commit to sijie/pulsar that referenced this pull request Aug 10, 2020
*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test
wolfstudy pushed a commit that referenced this pull request Aug 12, 2020
…#1355) (#7788)

* Fix allowTopicOperationAsync logic (#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior
wolfstudy pushed a commit that referenced this pull request Aug 13, 2020
…#1355) (#7788)

* Fix allowTopicOperationAsync logic (#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior

(cherry picked from commit 48f5a2f)
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
…apache#1355) (apache#7788)

* Fix allowTopicOperationAsync logic (apache#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior
jerrypeng pushed a commit to jerrypeng/incubator-pulsar that referenced this pull request Aug 24, 2020
…apache#1355) (apache#7788)

* Fix allowTopicOperationAsync logic (apache#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior
lbenc135 pushed a commit to lbenc135/pulsar that referenced this pull request Sep 5, 2020
…apache#1355) (apache#7788)

* Fix allowTopicOperationAsync logic (apache#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior
lbenc135 pushed a commit to lbenc135/pulsar that referenced this pull request Sep 5, 2020
…apache#1355) (apache#7788)

* Fix allowTopicOperationAsync logic (apache#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior
lbenc135 pushed a commit to lbenc135/pulsar that referenced this pull request Sep 5, 2020
…apache#1355) (apache#7788)

* Fix allowTopicOperationAsync logic (apache#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior
abhilashmandaliya pushed a commit to ashishshinde/pulsar that referenced this pull request Nov 19, 2020
…apache#1355) (apache#7788)

* Fix allowTopicOperationAsync logic (apache#1355)

*Modifications*

- We should use the original role to verify if it is allowed for a given topic operation
- use the original authentication data
- Authz provider doesn't have to be aware of proxyRole
- Fix authorization test

* Refactor authorize logic to provide a uniform authorization behavior

(cherry picked from commit 48f5a2f)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants