Skip to content

KAFKA-8557: system tests - add support for (optional) interbroker listener with the same security protocol as client listeners#6938

Merged
rajinisivaram merged 13 commits intoapache:trunkfrom
stan-is-hate:interbroker-listener
Jun 21, 2019
Merged

KAFKA-8557: system tests - add support for (optional) interbroker listener with the same security protocol as client listeners#6938
rajinisivaram merged 13 commits intoapache:trunkfrom
stan-is-hate:interbroker-listener

Conversation

@stan-is-hate
Copy link
Copy Markdown
Contributor

@stan-is-hate stan-is-hate commented Jun 13, 2019

This PR adds an option to kafka.py to start a separate interbroker listener using the same security protocol as one of the predefined listeners in 'port_mappings' dict.
This is a first baby-step to properly support named listeners in system tests.

Kafka currently supports named listeners, where you can have two or more listeners with the same security protocol but different names. Current KafkaService implementation, however, wouldn't allow that, since listeners in port_mappings are keyed by security_protocol, so there's 1-1 relationship. Clients use bootstrap_servers method which also accepts security_protocol.

This PR adds a parameter use_separate_interbroker_listener to KafkaService constructor as well as adds a setup_interbroker_listener() method, which, if provided/invoked, will enable a separate listener in port_mappings map, keyed by INTERBROKER_LISTENER_NAME. This way client implementations don't need to change and can still pick a port based on security protocol, but underneath there will be an additional port open used exclusively for broker-to-broker communication. This opens up additional scenarios for testing.

This change should be backwards compatible - default value for use_separate_interbroker_listener is False, and I tried to make sure that behavior of KafkaService doesn't change when it's False. Even when it is True, it should not affect behavior of clients, which, as described above, would continue to pick ports based on security_protocol field.

I intentionally didn't change behavior more than this. We'll work on adding wider support for named listeners, but its better to make smaller incremental changes rather than trying to boil the ocean.

Testing:
Updated security_rolling_upgrade_test to include scenarios when separate interbroker listener is present. Ran these subset of these scenarios locally (didn't run all security_protocol permutations, just some of them) using ducker-ak.
Ran updated security_rolling_upgrade_test in branch builder, all passed:
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2713/

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@stan-is-hate
Copy link
Copy Markdown
Contributor Author

@ijuma ijuma requested review from omkreddy and rajinisivaram June 15, 2019 15:21
Copy link
Copy Markdown
Contributor

@rajinisivaram rajinisivaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stan-confluent Thanks for the PR. Left a few comments. I dont think this change requires more than one or atmost two additional tests. There seems to be a large number of new test combinations which would add a lot of time to a full test run.

Comment thread tests/kafkatest/services/kafka/kafka.py Outdated
authorizer_class_name=None, topics=None, version=DEV_BRANCH, jmx_object_names=None,
jmx_attributes=None, zk_connect_timeout=5000, zk_session_timeout=6000, server_prop_overides=None, zk_chroot=None):
jmx_attributes=None, zk_connect_timeout=5000, zk_session_timeout=6000, server_prop_overides=None, zk_chroot=None,
use_separate_interbroker_listener=False):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we considered making this a bit more flexible? For example, would it be better to allow tests to specify client_listener_name and interbroker_listener_name rather than a flag. There is a lot of tech debt here since this code was written before listener names were introduced. I was wondering if this could be an opportunity to clear some of that. If you have already considered that and found it a lot of work, then we merge this and consider that later.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, all good points. I originally intended to do a bigger change, encapsulating all listener-related stuff (security protocol, port, sasl config) in a separate Listener class, and instead of pre-defined dict with ports we could actually add and remove listener objects to/from the dict. But that's a massive and not particularly backwards compatible change.
As for 'client_listener_name' and 'interbroker_listener_name' - I originally wrote a PR like that (only for interbroker one though, client change, again, is much bigger and harder to keep backwards compatible) - but went for boolean for practical reasons, since in the tests that we envision so far we don't care about listener name, just the fact that there are two listeners on the same security protocol - so effectively I was just calling constructor with interbroker_listener_name=KafkaService.INTERBROKER_LISTENER_NAME.
I can send a follow-up PR to enable custom name interbroker listener or update this one, whichever you prefer. I would suggest making changes to client listeners in separate PRs.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played a bit with 'interbroker_listener_name' and its tricky to marry two concepts - existing one and custom named one, without having quite a messy code. I'd suggest keeping this PR as is, with boolean value, since the tests are passing and send a follow up with custom names - this way I can unblock several people who depend on separate interbroker listener, but don't care about custom names. Wdyt?

self.set_authorizer_and_bounce(security_protocol, security_protocol)

@cluster(num_nodes=9)
@matrix(broker_protocol=[SecurityConfig.SSL, SecurityConfig.SASL_PLAINTEXT, SecurityConfig.SASL_SSL],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need all these combinations of tests? Also, it is unrelated to upgrade, so could be in a different test class?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I don't think we need all combinations here - we're interested in the following cases:

  • client and broker on the different protocols
  • client and broker on the same protocol, different sasl mechanisms
  • client and broker on the same protocol, same sasl mechanism.
    I'll update the PR with this change, please let me know if I'm missing something here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the tests as you suggested, moving out interbroker listener tests into a separate test class and reducing their number.


@cluster(num_nodes=9)
@matrix(broker_protocol=[SecurityConfig.SSL, SecurityConfig.SASL_PLAINTEXT, SecurityConfig.SASL_SSL],
client_protocol=[SecurityConfig.SSL, SecurityConfig.SASL_PLAINTEXT, SecurityConfig.SASL_SSL])
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, not sure why we have all thse combinations, and why the test is with upgrade tests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will move it out.

@rajinisivaram
Copy link
Copy Markdown
Contributor

@stan-confluent Also, since this change can affect several tests, can you start a system test run on your branch in the system test branch builder? If you have already done that, can you include the results here? Thanks.

@stan-is-hate
Copy link
Copy Markdown
Contributor Author

stan-is-hate commented Jun 18, 2019

This is a full system test run on the original PR, as you can see there's one test failure directly related to my changes. I'm investigating it and will update the PR with the fix if any.
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2719
I've also triggered a run with updated PR version and will share the results here.

@stan-is-hate stan-is-hate changed the title Kafkatest - add support for (optional) interbroker listener with the same security protocol as client listeners KAFKA-8557 - system tests - add support for (optional) interbroker listener with the same security protocol as client listeners Jun 18, 2019
@stan-is-hate stan-is-hate changed the title KAFKA-8557 - system tests - add support for (optional) interbroker listener with the same security protocol as client listeners KAFKA-8557: system tests - add support for (optional) interbroker listener with the same security protocol as client listeners Jun 18, 2019
@stan-is-hate
Copy link
Copy Markdown
Contributor Author

@rajinisivaram I cannot reproduce the issue on this PR anymore. Might have been fixed when I moved the tests out. I pushed a change to specifically test what was broken (client on SSL, broker on SASL_SSL) and it passes. Rerunning that specific test on jenkins now.

Copy link
Copy Markdown
Contributor

@rajinisivaram rajinisivaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stan-confluent Thanks for the update. Left another comment about reducing the number of tests for this change. Apart from that looks good.


# run produce/consume/validate loop while enabling a separate interbroker listener via rolling restart
self.run_produce_consume_validate(
self.roll_in_interbroker_listener, broker_protocol, broker_sasl_mechanism, True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there value in an upgrade test for this? Listener names are internal to the broker and as long as security protocols match, everything should just work. That is the reason why we didn't need to add upgrade tests when listener names were added all those releases ago. I think it is useful to have a test with listener names, but a single test, rather than all these combinations - there are six tests, each with rolling restarts in this file, adding a lot of time to the system tests. I would suggest limiting to one combination for each test and without rolling restart. We could include this in tests/kafkatest/tests/core/security_test.py if it was just one test, but keeping it separate as it is now is fine too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rajinisivaram I was going to argue that we do need to cover a use case where two listeners have the same security protocol since this is new functionality for this pr, but then this bit is covered in newly added test_rolling_upgrade_phase_two_separate_interbroker_listener() and two test_rolling_upgrade_sasl_mechanism... methods in security_rolling_upgrade_test.py so I might get rid of this file altogether - I can just add a single case for removing a separate listener with rolling restart (I think there is some value in this, more for testing KafkaService itself, to ensure it closes the port correctly) and that's it. Wdyt?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rajinisivaram per your advice, reduced a number of tests to two and both live in security_rolling_upgrade_test now.

Comment thread tests/kafkatest/services/kafka/kafka.py Outdated
Comment thread tests/kafkatest/services/kafka/templates/kafka.properties Outdated
…I missed initially and Brian pointed out. Simplified tests a bit more. Got rid of automatic \

opening of interbroker port in the setup method since it was inconsistent with how client port behaves
@stan-is-hate
Copy link
Copy Markdown
Contributor Author

stan-is-hate commented Jun 20, 2019

@brianbushree @rajinisivaram - I pushed another small but important update, here's what's changed:

  • Brian pointed out something I missed - we should really be using listener.name.{{name}}.sasl.enabled.mechanisms instead of single sasl.enabled.mechanisms which apply to all listeners - changed in kafka.properties
  • removed the line which was automatically opening interbroker port in setup_interbroker_listener() - it was causing issues (like leaving dangling PLAINTEXT listener which was marked open in constructor) and was inconsistent with how client listener works. Instead it is being open in start() automatically or explicitly before bouncing nodes
  • simplified tests some more, per Rajini's advice
    Ran some tests locally, all passed; triggered system test run on jenkins with security rolling upgrade test.

@stan-is-hate
Copy link
Copy Markdown
Contributor Author

The latest update leads to two failing tests, investigating

@stan-is-hate
Copy link
Copy Markdown
Contributor Author

aaaand rolled back the change where we have per listener sasl.enabled.mechanisms. Its still the right thing to do long term, but it breaks couple of tests that rely on a shared list of sasl mechanisms specifically. Until we support multiple sasl mechanisms per listener in KafkaService and SecurityConfig (which is doable, but quite a big change), we should keep a shared list.

@stan-is-hate
Copy link
Copy Markdown
Contributor Author

Test result for security_rolling_upgrade_test.py on the latest version (spoiler - all passed):
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2732/

Copy link
Copy Markdown
Contributor

@rajinisivaram rajinisivaram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stan-confluent Thanks for the updates, LGTM

@rajinisivaram
Copy link
Copy Markdown
Contributor

@stan-confluent Thanks for the PR, merging to trunk

@rajinisivaram rajinisivaram merged commit f51d7d3 into apache:trunk Jun 21, 2019
rajinisivaram pushed a commit to confluentinc/kafka that referenced this pull request Jun 21, 2019
…tener with the same security protocol as client listeners (apache#6938)

Reviewers: Brian Bushree <bbushree@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
@stan-is-hate stan-is-hate mentioned this pull request Jun 26, 2019
3 tasks
hachikuji pushed a commit that referenced this pull request May 27, 2020
…tener with the same security protocol as client listeners (#6938)

Reviewers: Brian Bushree <bbushree@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants