Merge remote-tracking branch 'ak-upstream/trunk' into AK_CCS_Mar_07_22#673
Merged
Vikas Singh (soondenana) merged 29 commits intoconfluentinc:masterfrom Mar 8, 2022
Merged
Conversation
…#11806) KRaft authorizer should support AclOperation.ALL correctly. Reviewers: Colin P. McCabe <cmccabe@apache.org>
List all the named topologies that have been added to this client Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
…pache#11810) We should be able to change the topologies while still in the CREATED state. We already allow adding them, but this should include removing them as well Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
…apache#11807) Use `InetAddress.getHostAddress` in `StandardAuthorizer` instead of `InetAddress.getHostName`. Reviewers: Colin Patrick McCabe <cmccabe@confluent.io>
Add rebalance reason in Kafka Streams. Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>
…pache#7082) Make SetSchemaMetadata SMT ignore records with null value and valueSchema or key and keySchema. The transform has been unit tested for handling null values gracefully while still providing the necessary validation for non-null values. Reviewers: Konstantine Karantasis<konstantine@confluent.io>, Bill Bejeck <bbejeck@apache.org>
…pache#11814) Issue: Imagine a scenario where two threads T1 and T2 are inside UnifiedLog.flush() concurrently: KafkaScheduler thread T1 -> The periodic work calls LogManager.flushDirtyLogs() which in turn calls UnifiedLog.flush(). For example, this can happen due to log.flush.scheduler.interval.ms here. KafkaScheduler thread T2 -> A UnifiedLog.flush() call is triggered asynchronously during segment roll here. Supposing if thread T1 advances the recovery point beyond the flush offset of thread T2, then this could trip the check within LogSegments.values() here for thread T2, when it is called from LocalLog.flush() here. The exception causes the KafkaScheduler thread to die, which is not desirable. Fix: We fix this by ensuring that LocalLog.flush() is immune to the case where the recoveryPoint advances beyond the flush offset. Reviewers: Jun Rao <junrao@gmail.com>
… of PR apache#11787 (apache#11812) This PR addresses the remaining nits from the final review of apache#11787 It also deletes two integration test classes which had only one test in them, and moves the tests to another test class file to save on the time to bring up an entire embedded kafka cluster just for a single run Reviewers: Guozhang Wang <guozhang@confluent.io>, Walker Carlson <wcarlson@confluent.io>
…ogyIntegrationTest tests (apache#11824) Seen a few of the new tests added fail on PR builds lately with "java.lang.AssertionError: Expected all streams instances in [org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper@7fb3e6b0] to be RUNNING within 30000 ms" We already had some tests using the 30s timeout while others were bumped all the way up to 60s, I figured we should try out a default timeout of 45s and if we still see failures in specific tests we can go from there
Reviewer: Bill Bejeck <bbejeck@apache.org>
…or Streams tests (apache#11823) Pretty much any time we have an integration test failure that's flaky or only exposed when running on Jenkins through the PR builds, it's impossible to debug if it cannot be reproduced locally as the logs attached to the test results have truncated the entire useful part of the logs. This is due to the logs being flooded at the beginning of the test when the Kafka cluster is coming up, eating up all of the allotted characters before we even get to the actual Streams test. Setting log4j.logger.kafka to ERROR greatly improves the situation and cuts down on most of the excessive logging in my local runs. To improve things even more and have some hope of getting the part of the logs we actually need, I also set the loggers for all of the Config objects to ERROR, as these print out the value of every single config (of which there are a lot) and are not useful as we can easily figure out what the configs were if necessary by just inspecting the test locally. Reviewers: Luke Chen <showuon@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Create KafkaConfigSchema to encapsulate the concept of determining the types of configuration keys. This is useful in the controller because we can't import KafkaConfig, which is part of core. Also introduce the TimelineObject class, which is a more generic version of TimelineInteger / TimelineLong. Reviewers: David Arthur <mumrah@gmail.com>
This PR is part of KIP-708 and adds rack aware standby task assignment logic. Reviewer: Bruno Cadonna <cadonna@apache.org>, Luke Chen <showuon@gmail.com>, Vladimir Sitnikov <vladimirsitnikov.apache.org>
…he#11777) * KAFKA-10000: Add producer fencing API to admin client Reviewers: Luke Chen <showuon@gmail.com>, Tom Bentley <tbentley@redhat.com>
…their configs (apache#11572) Implements KIP-769: https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions Reviewers: Tom Bentley <tbentley@redhat.com>, Chris Egerton <fearthecellos@gmail.com>
Disable idempotence when conflicting config values for acks, retries and max.in.flight.requests.per.connection are set by the user. For the former two configs, we log at info level when we disable idempotence due to conflicting configs. For the latter, we log at warn level since it's due to an implementation detail that is likely to be surprising. This mitigates compatibility impact of enabling idempotence by default. Added unit tests to verify the change in behavior. Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Mickael Maison <mickael.maison@gmail.com>
…secured (apache#11811) Reviewers: Luke Chen <showuon@confluent.io>, Jun Rao <junrao@gmail.com>
…r.sh (apache#11517) delete unused config batch.size in kafka-console-producer.sh Reviewer: Andrew Eugene Choi <andrew.choi@uwaterloo.ca>, Luke Chen <showuon@gmail.com>,
…e#11839) Reviewers: David Jacot <djacot@confluent.io>
…is in CREATED (apache#11813) Currently the #add/removeNamedTopology APIs behave a little wonky when the application is still in CREATED. Since adding and removing topologies runs some validation steps there is valid reason to want to add or remove a topology on a dummy app that you don't plan to start, or a real app that you haven't started yet. But to actually check the results of the validation you need to call get() on the future, so we need to make sure that get() won't block forever in the case of no failure -- as is currently the case Reviewers: Guozhang Wang <guozhang@confluent.io>, Walker Carlson <wcarlson@confluent.io>
…cessing (apache#11827) This test has been failing somewhat regularly due to going into the ERROR state before reaching RUNNING during the startup phase. The problem is that we are reusing the DELAYED_INPUT_STREAM topics, which had previously been assumed to be uniquely owned by a particular test. We should make sure to delete and re-create these topics for any test that uses them.
…rd fails on brokers. (apache#11830) Reviewers: Guozhang Wang <wangguoz@gmail.com>
Differentiate between unused and unknown configs during log output. Reviewer: Luke Chen <showuon@gmail.com>
…tter balance load between threads (apache#11493) Balance standby and active stateful tasks evenly across threads Reviewer: Luke Chen <showuon@gmail.com>
Reviewers: David Arthur <mumrah@gmail.com>
…nd removing a topology (apache#11847) While debugging the flaky NamedTopologyIntegrationTest. shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing test, I did discover one real bug. The problem was that we update the TopologyMetadata's builders map (with the known topologies) inside the #removeNamedTopology call directly, whereas the StreamThread may not yet have reached the poll() in the loop and in case of an offset reset, we get an NP.e I changed the NPE to just log a warning for now, going forward I think we should try to tackle some tech debt by keeping the processing tasks and the TopologyMetadata in sync Also includes a quick fix on the side where we were re-adding the topology waiter/KafkaFuture for a thread being shut down Reviewers: Guozhang Wang <guozhang@confluent.io>, Walker Carlson <wcarlson@confluent.io>
Conflict in Jenkinsfile from AK commit: bbb2dc5 , PR: 11833 I have dropped the change as it doesn't pertain to CCS
Ismael Juma (ijuma)
approved these changes
Mar 7, 2022
Member
Ismael Juma (ijuma)
left a comment
There was a problem hiding this comment.
LGTM if Jenkins is green.
Member
Author
|
After the merge, following 3 tests are failing: I will take a look at this tomorrow. |
Member
Author
|
We will need to merge again as AK trunk is broken. We need apache#11853, I will wait for that to get merged. |
apache#11853) Reviewers: Guozhang Wang <guozhang@confluent.io>, Ricardo Brasil <anribrasil@gmail.com>
Member
Author
|
Only one test failed after recently pullin AK trunk
Ran it locally and it passes, so going to treat it flaky and merge the PR. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Conflict in Jenkinsfile from AK commit: bbb2dc54a0f45bc5455f22a0671adde206dcfa29 from PR: 11833
I have dropped the change as it doesn't pertain to CCS.
All other files merged cleanly.