Merge remote-tracking branch 'ak-upstream/trunk' into AK_CCS_Mar_07_22 by soondenana · Pull Request #673 · confluentinc/kafka

Vikas Singh (soondenana) · 2022-03-07T21:18:57Z

Conflict in Jenkinsfile from AK commit: bbb2dc54a0f45bc5455f22a0671adde206dcfa29 from PR: 11833

I have dropped the change as it doesn't pertain to CCS.

All other files merged cleanly.

…#11806) KRaft authorizer should support AclOperation.ALL correctly. Reviewers: Colin P. McCabe <cmccabe@apache.org>

List all the named topologies that have been added to this client Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>

…pache#11810) We should be able to change the topologies while still in the CREATED state. We already allow adding them, but this should include removing them as well Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>

…apache#11807) Use `InetAddress.getHostAddress` in `StandardAuthorizer` instead of `InetAddress.getHostName`. Reviewers: Colin Patrick McCabe <cmccabe@confluent.io>

Add rebalance reason in Kafka Streams. Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>

…pache#7082) Make SetSchemaMetadata SMT ignore records with null value and valueSchema or key and keySchema. The transform has been unit tested for handling null values gracefully while still providing the necessary validation for non-null values. Reviewers: Konstantine Karantasis<konstantine@confluent.io>, Bill Bejeck <bbejeck@apache.org>

…pache#11814) Issue: Imagine a scenario where two threads T1 and T2 are inside UnifiedLog.flush() concurrently: KafkaScheduler thread T1 -> The periodic work calls LogManager.flushDirtyLogs() which in turn calls UnifiedLog.flush(). For example, this can happen due to log.flush.scheduler.interval.ms here. KafkaScheduler thread T2 -> A UnifiedLog.flush() call is triggered asynchronously during segment roll here. Supposing if thread T1 advances the recovery point beyond the flush offset of thread T2, then this could trip the check within LogSegments.values() here for thread T2, when it is called from LocalLog.flush() here. The exception causes the KafkaScheduler thread to die, which is not desirable. Fix: We fix this by ensuring that LocalLog.flush() is immune to the case where the recoveryPoint advances beyond the flush offset. Reviewers: Jun Rao <junrao@gmail.com>

… of PR apache#11787 (apache#11812) This PR addresses the remaining nits from the final review of apache#11787 It also deletes two integration test classes which had only one test in them, and moves the tests to another test class file to save on the time to bring up an entire embedded kafka cluster just for a single run Reviewers: Guozhang Wang <guozhang@confluent.io>, Walker Carlson <wcarlson@confluent.io>

…ogyIntegrationTest tests (apache#11824) Seen a few of the new tests added fail on PR builds lately with "java.lang.AssertionError: Expected all streams instances in [org.apache.kafka.streams.processor.internals.namedtopology.KafkaStreamsNamedTopologyWrapper@7fb3e6b0] to be RUNNING within 30000 ms" We already had some tests using the 30s timeout while others were bumped all the way up to 60s, I figured we should try out a default timeout of 45s and if we still see failures in specific tests we can go from there

Reviewer: Bill Bejeck <bbejeck@apache.org>

…or Streams tests (apache#11823) Pretty much any time we have an integration test failure that's flaky or only exposed when running on Jenkins through the PR builds, it's impossible to debug if it cannot be reproduced locally as the logs attached to the test results have truncated the entire useful part of the logs. This is due to the logs being flooded at the beginning of the test when the Kafka cluster is coming up, eating up all of the allotted characters before we even get to the actual Streams test. Setting log4j.logger.kafka to ERROR greatly improves the situation and cuts down on most of the excessive logging in my local runs. To improve things even more and have some hope of getting the part of the logs we actually need, I also set the loggers for all of the Config objects to ERROR, as these print out the value of every single config (of which there are a lot) and are not useful as we can easily figure out what the configs were if necessary by just inspecting the test locally. Reviewers: Luke Chen <showuon@confluent.io>, Guozhang Wang <guozhang@confluent.io>

Create KafkaConfigSchema to encapsulate the concept of determining the types of configuration keys. This is useful in the controller because we can't import KafkaConfig, which is part of core. Also introduce the TimelineObject class, which is a more generic version of TimelineInteger / TimelineLong. Reviewers: David Arthur <mumrah@gmail.com>

This PR is part of KIP-708 and adds rack aware standby task assignment logic. Reviewer: Bruno Cadonna <cadonna@apache.org>, Luke Chen <showuon@gmail.com>, Vladimir Sitnikov <vladimirsitnikov.apache.org>

…he#11777) * KAFKA-10000: Add producer fencing API to admin client Reviewers: Luke Chen <showuon@gmail.com>, Tom Bentley <tbentley@redhat.com>

…their configs (apache#11572) Implements KIP-769: https://cwiki.apache.org/confluence/display/KAFKA/KIP-769%3A+Connect+APIs+to+list+all+connector+plugins+and+retrieve+their+configuration+definitions Reviewers: Tom Bentley <tbentley@redhat.com>, Chris Egerton <fearthecellos@gmail.com>

Disable idempotence when conflicting config values for acks, retries and max.in.flight.requests.per.connection are set by the user. For the former two configs, we log at info level when we disable idempotence due to conflicting configs. For the latter, we log at warn level since it's due to an implementation detail that is likely to be surprising. This mitigates compatibility impact of enabling idempotence by default. Added unit tests to verify the change in behavior. Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>, Mickael Maison <mickael.maison@gmail.com>

…secured (apache#11811) Reviewers: Luke Chen <showuon@confluent.io>, Jun Rao <junrao@gmail.com>

…r.sh (apache#11517) delete unused config batch.size in kafka-console-producer.sh Reviewer: Andrew Eugene Choi <andrew.choi@uwaterloo.ca>, Luke Chen <showuon@gmail.com>,

…e#11839) Reviewers: David Jacot <djacot@confluent.io>

…is in CREATED (apache#11813) Currently the #add/removeNamedTopology APIs behave a little wonky when the application is still in CREATED. Since adding and removing topologies runs some validation steps there is valid reason to want to add or remove a topology on a dummy app that you don't plan to start, or a real app that you haven't started yet. But to actually check the results of the validation you need to call get() on the future, so we need to make sure that get() won't block forever in the case of no failure -- as is currently the case Reviewers: Guozhang Wang <guozhang@confluent.io>, Walker Carlson <wcarlson@confluent.io>

…cessing (apache#11827) This test has been failing somewhat regularly due to going into the ERROR state before reaching RUNNING during the startup phase. The problem is that we are reusing the DELAYED_INPUT_STREAM topics, which had previously been assumed to be uniquely owned by a particular test. We should make sure to delete and re-create these topics for any test that uses them.

…rd fails on brokers. (apache#11830) Reviewers: Guozhang Wang <wangguoz@gmail.com>

Differentiate between unused and unknown configs during log output. Reviewer: Luke Chen <showuon@gmail.com>

…tter balance load between threads (apache#11493) Balance standby and active stateful tasks evenly across threads Reviewer: Luke Chen <showuon@gmail.com>

Reviewers: David Arthur <mumrah@gmail.com>

…nd removing a topology (apache#11847) While debugging the flaky NamedTopologyIntegrationTest. shouldRemoveOneNamedTopologyWhileAnotherContinuesProcessing test, I did discover one real bug. The problem was that we update the TopologyMetadata's builders map (with the known topologies) inside the #removeNamedTopology call directly, whereas the StreamThread may not yet have reached the poll() in the loop and in case of an offset reset, we get an NP.e I changed the NPE to just log a warning for now, going forward I think we should try to tackle some tech debt by keeping the processing tasks and the TopologyMetadata in sync Also includes a quick fix on the side where we were re-adding the topology waiter/KafkaFuture for a thread being shut down Reviewers: Guozhang Wang <guozhang@confluent.io>, Walker Carlson <wcarlson@confluent.io>

Conflict in Jenkinsfile from AK commit: bbb2dc5 , PR: 11833 I have dropped the change as it doesn't pertain to CCS

Ismael Juma (ijuma)

LGTM if Jenkins is green.

Vikas Singh (soondenana) · 2022-03-08T00:54:04Z

After the merge, following 3 tests are failing:

kafka.admin.LeaderElectionCommandTest.[1] Type=Raft, Name=testElectionResultOutput, Security=PLAINTEXT
kafka.api.PlaintextProducerSendTest.testSendWithInvalidCreateTime()
kafka.server.ProduceRequestTest.testProduceWithInvalidTimestamp()

I will take a look at this tomorrow.

Vikas Singh (soondenana) · 2022-03-08T06:20:07Z

We will need to merge again as AK trunk is broken. We need apache#11853, I will wait for that to get merged.

apache#11853) Reviewers: Guozhang Wang <guozhang@confluent.io>, Ricardo Brasil <anribrasil@gmail.com>

Vikas Singh (soondenana) · 2022-03-08T07:27:18Z

Only one test failed after recently pullin AK trunk

kafka.admin.DescribeConsumerGroupTest.testDescribeOffsetsOfExistingGroupWithNoMembers()

Ran it locally and it passes, so going to treat it flaky and merge the PR.

Jason Gustafson (hachikuji) and others added 27 commits February 25, 2022 15:43

KAFKA-13697; KRaft authorizer should support AclOperation.ALL (apache…

2c90447

…#11806) KRaft authorizer should support AclOperation.ALL correctly. Reviewers: Colin P. McCabe <cmccabe@apache.org>

KAFKA-13281: add API to expose current NamedTopology set (apache#11808)

29317e6

List all the named topologies that have been added to this client Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>

KAFKA-13698; KRaft authorizer should use host address instead of name (…

5f91aa7

…apache#11807) Use `InetAddress.getHostAddress` in `StandardAuthorizer` instead of `InetAddress.getHostName`. Reviewers: Colin Patrick McCabe <cmccabe@confluent.io>

KAFKA-13542: add rebalance reason in Kafka Streams (apache#11804)

2ccc834

Add rebalance reason in Kafka Streams. Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>

MINOR: Improve test assertions for IQv2 (apache#11828)

7172f35

Reviewer: Bill Bejeck <bbejeck@apache.org>

KAFKA-6718 / Rack aware standby task assignor (apache#10851)

62e6466

This PR is part of KIP-708 and adds rack aware standby task assignment logic. Reviewer: Bruno Cadonna <cadonna@apache.org>, Luke Chen <showuon@gmail.com>, Vladimir Sitnikov <vladimirsitnikov.apache.org>

KAFKA-10000: Add producer fencing API to admin client (KIP-618) (apac…

066cdc8

…he#11777) * KAFKA-10000: Add producer fencing API to admin client Reviewers: Luke Chen <showuon@gmail.com>, Tom Bentley <tbentley@redhat.com>

(docs) Add JavaDocs for org.apache.kafka.common.security.oauthbearer.…

f5d8fb2

…secured (apache#11811) Reviewers: Luke Chen <showuon@confluent.io>, Jun Rao <junrao@gmail.com>

KAFKA-13466: delete unused config batch.size in kafka-console-produce…

ae76b9d

…r.sh (apache#11517) delete unused config batch.size in kafka-console-producer.sh Reviewer: Andrew Eugene Choi <andrew.choi@uwaterloo.ca>, Luke Chen <showuon@gmail.com>,

KAFKA-13706: Remove closed connections from MockSelector.ready (apach…

95dbba9

…e#11839) Reviewers: David Jacot <djacot@confluent.io>

KAFKA-13694: Log more specific information when the verification reco…

3be9784

…rd fails on brokers. (apache#11830) Reviewers: Guozhang Wang <wangguoz@gmail.com>

KAFKA-13689: printing unused and unknown logs separately (apache#11800)

0dac4b4

Differentiate between unused and unknown configs during log output. Reviewer: Luke Chen <showuon@gmail.com>

KAFKA-12959: Distribute standby and active tasks across threads to be…

e3ef29e

…tter balance load between threads (apache#11493) Balance standby and active stateful tasks evenly across threads Reviewer: Luke Chen <showuon@gmail.com>

KAFKA-13671: Add ppc64le build stage (apache#11833)

bbb2dc5

Reviewers: David Arthur <mumrah@gmail.com>

Merge remote-tracking branch 'ak-upstream/trunk' into AK_CCS_Mar_07_22

5df91f3

Conflict in Jenkinsfile from AK commit: bbb2dc5 , PR: 11833 I have dropped the change as it doesn't pertain to CCS

Vikas Singh (soondenana) requested a review from Ismael Juma (ijuma) March 7, 2022 21:19

Ismael Juma (ijuma) approved these changes Mar 7, 2022

View reviewed changes

Luke Chen (showuon) and others added 2 commits March 8, 2022 14:28

KAFKA-13710: bring the InvalidTimestampException back for record error (

1848f04

apache#11853) Reviewers: Guozhang Wang <guozhang@confluent.io>, Ricardo Brasil <anribrasil@gmail.com>

Merge remote-tracking branch 'ak-upstream/trunk' into AK_CCS_Mar_07_22

2ed563c

Vikas Singh (soondenana) merged commit 2ed563c into confluentinc:master Mar 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge remote-tracking branch 'ak-upstream/trunk' into AK_CCS_Mar_07_22#673

Merge remote-tracking branch 'ak-upstream/trunk' into AK_CCS_Mar_07_22#673
Vikas Singh (soondenana) merged 29 commits intoconfluentinc:masterfrom
soondenana:AK_CCS_Mar_07_22

Vikas Singh (soondenana) commented Mar 7, 2022

Uh oh!

Ismael Juma (ijuma) left a comment

Uh oh!

Vikas Singh (soondenana) commented Mar 8, 2022

Uh oh!

Vikas Singh (soondenana) commented Mar 8, 2022

Uh oh!

Vikas Singh (soondenana) commented Mar 8, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

19 participants

Conversation

Vikas Singh (soondenana) commented Mar 7, 2022

Uh oh!

Ismael Juma (ijuma) left a comment

Choose a reason for hiding this comment

Uh oh!

Vikas Singh (soondenana) commented Mar 8, 2022

Uh oh!

Vikas Singh (soondenana) commented Mar 8, 2022

Uh oh!

Vikas Singh (soondenana) commented Mar 8, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

19 participants