Skip to content

KAFKA-8304: Fix registration of Connect REST extensions#6651

Merged
rhauch merged 6 commits intoapache:trunkfrom
C0urante:kafka-8304
May 7, 2019
Merged

KAFKA-8304: Fix registration of Connect REST extensions#6651
rhauch merged 6 commits intoapache:trunkfrom
C0urante:kafka-8304

Conversation

@C0urante
Copy link
Copy Markdown
Contributor

@C0urante C0urante commented Apr 30, 2019

Jira

Currently, a REST extension can cause deadlock if it requests the list of connectors from its ConnectRestExtensionContext.clusterState() in its ConnectRestExtension.register(...) method. This is because the ConnectClusterState implementation is backed by a HerderProvider that, at that time, has no associated Herder instance, and since that Herder is given to the HerderProvider later by the same thread, deadlock occurs until the call to HerderProvider.get() made by the ConnectClusterStateImpl times out. At this point, startup fails and the Connect worker dies.

The changes here separate RestServer.start(...) into two separate methods. The first, RestServer.initializeServer(), starts the Jetty server and binds to a port, which ensure the accuracy of the RestServer.advertisedUrl() method that is used later on by both the ConnectStandalone and ConnectDistributed classes to determine the worker ID. The second, RestServer.initializeResources(Herder herder) actually creates the Connect REST resources (RootResource, ConnectorsResource, etc.) and registers all REST extensions.

Since these changes make HerderProvider obsolete and it is not part of any public API, that interface is also removed.

This approach ensures that the Connect REST interface is started only when all of its REST extensions have been successfully registered, which is important for security use cases where request filters are installed and parts of the Connect REST API are subject to authentication/authorization.

Between the call to RestServer.intializeServer() and RestServer.initializeResources(...), any requests made to the worker will result in a 404 response. This is slightly less desirable than the current behavior, which is to block on requests until the herder is up and running, but shouldn't be too much of an issue.

A new integration test is added that verifies that a call to ConnectRestExtensionContext.clusterState().connectors() succeeds. This test fails on the current trunk branch, and succeeds with the changes involved in this PR.

These changes should be backported through to 2.0.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@C0urante
Copy link
Copy Markdown
Contributor Author

@wicknicks @mageshn this touches on code that you've both done non-trivial work on (integration test harness changes and REST extensions), I'd be grateful if you could take a look and let me know what you think :)

Copy link
Copy Markdown
Contributor

@wicknicks wicknicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @C0urante for the fixes! some comments.

@wicknicks
Copy link
Copy Markdown
Contributor

@C0urante: I'm guessing you want to backport this fix, what version do you think is most appropriate? I think we can go back all the way to 2.0. If so, let's add that to the description for the committer. Thanks!

@C0urante
Copy link
Copy Markdown
Contributor Author

C0urante commented May 2, 2019

@wicknicks thanks for the review. I've implemented many of your suggestions and left a comment on any that I'd like to discuss further before including here. Ready for the next round when you have time!

Copy link
Copy Markdown
Contributor

@wicknicks wicknicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost there. last couple of comments.

@C0urante
Copy link
Copy Markdown
Contributor Author

C0urante commented May 3, 2019

Thanks @wicknicks, I agree with your suggestions and have implemented them accordingly. Ready for the next round

Copy link
Copy Markdown
Contributor

@wicknicks wicknicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment.

@C0urante
Copy link
Copy Markdown
Contributor Author

C0urante commented May 3, 2019

Thanks @wicknicks, I've made the alteration you suggested and this is ready for the next round

Copy link
Copy Markdown
Contributor

@wicknicks wicknicks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Great work, @C0urante!

@C0urante
Copy link
Copy Markdown
Contributor Author

C0urante commented May 6, 2019

Thanks @wicknicks!

@rhauch could you please take a look when you have a chance?

Copy link
Copy Markdown
Contributor

@rhauch rhauch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice work, @C0urante!

@rhauch rhauch merged commit cc097e9 into apache:trunk May 7, 2019
rhauch pushed a commit that referenced this pull request May 7, 2019
Fix registration of Connect REST extensions to prevent deadlocks when extensions get the list of connectors before the herder is available. Added integration test to check the behavior.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
rhauch pushed a commit that referenced this pull request May 7, 2019
Fix registration of Connect REST extensions to prevent deadlocks when extensions get the list of connectors before the herder is available. Added integration test to check the behavior.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
rhauch pushed a commit that referenced this pull request May 7, 2019
Fix registration of Connect REST extensions to prevent deadlocks when extensions get the list of connectors before the herder is available. Added integration test to check the behavior.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
@C0urante C0urante deleted the kafka-8304 branch May 8, 2019 00:24
ijuma added a commit to ijuma/kafka that referenced this pull request May 8, 2019
…s-hashcode

* apache-github/trunk:
  KAFKA-8158: Add EntityType for Kafka RPC fields (apache#6503)
  MINOR: correctly parse version OffsetCommitResponse version < 3
  KAFKA-8284: enable static membership on KStream (apache#6673)
  KAFKA-8304: Fix registration of Connect REST extensions (apache#6651)
  KAFKA-8275; Take throttling into account when choosing least loaded node (apache#6619)
  KAFKA-3522: Interactive Queries must return timestamped stores (apache#6661)
  MINOR: MetricsIntegrationTest should set StreamsConfig.STATE_DIR_CONFIG (apache#6687)
  MINOR: Remove unused field in `ListenerConnectionQuota`
  KAFKA-8131; Move --version implementation into CommandLineUtils (apache#6481)
  KAFKA-8056; Use automatic RPC generation for FindCoordinator (apache#6408)
  MINOR: Remove workarounds for lz4-java bug affecting byte buffers (apache#6679)
  KAFKA-7455: Support JmxTool to connect to a secured RMI port. (apache#5968)
  MINOR: Document improvement (apache#6682)
  MINOR: Fix ThrottledReplicaListValidator doc error. (apache#6537)
  KAFKA-8306; Initialize log end offset accurately when start offset is non-zero (apache#6652)
mapr-devops pushed a commit to mapr/kafka that referenced this pull request May 10, 2019
Corrects the system tests to check for either a 404 or a 409 error and sleeping until the Connect REST API becomes available. This corrects a previous change to how REST extensions are initialized (apache#6651), which added the ability of Connect throwing a 404 if the resources are not yet started. The integration tests were already looking for 409.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
rhauch pushed a commit that referenced this pull request May 10, 2019
Corrects the system tests to check for either a 404 or a 409 error and sleeping until the Connect REST API becomes available. This corrects a previous change to how REST extensions are initialized (#6651), which added the ability of Connect throwing a 404 if the resources are not yet started. The integration tests were already looking for 409.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
rhauch pushed a commit that referenced this pull request May 10, 2019
Corrects the system tests to check for either a 404 or a 409 error and sleeping until the Connect REST API becomes available. This corrects a previous change to how REST extensions are initialized (#6651), which added the ability of Connect throwing a 404 if the resources are not yet started. The integration tests were already looking for 409.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
rhauch pushed a commit that referenced this pull request May 10, 2019
Corrects the system tests to check for either a 404 or a 409 error and sleeping until the Connect REST API becomes available. This corrects a previous change to how REST extensions are initialized (#6651), which added the ability of Connect throwing a 404 if the resources are not yet started. The integration tests were already looking for 409.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
Fix registration of Connect REST extensions to prevent deadlocks when extensions get the list of connectors before the herder is available. Added integration test to check the behavior.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
Corrects the system tests to check for either a 404 or a 409 error and sleeping until the Connect REST API becomes available. This corrects a previous change to how REST extensions are initialized (apache#6651), which added the ability of Connect throwing a 404 if the resources are not yet started. The integration tests were already looking for 409.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants