KAFKA-3382: Add system test for ReplicationVerificationTool#1160

Closed
SinghAsDev wants to merge 8 commits into
apache:trunk from
SinghAsDev:KAFKA-3382

Conversation

@SinghAsDev
Contributor

No description provided.

@SinghAsDev
Contributor Author

@granders one more for you to take a look at.

Contributor

Looks like this comment was copied - I think you mean verify that there is lag in replicas?

Contributor Author

Fixed.

@SinghAsDev
Contributor Author

@granders I adopted your suggestion. Mind giving it another pass?

@SinghAsDev
Contributor Author

The test failure is unrelated. Pinging @granders for review :)

Contributor

I think getting this to pass depends on timing, given the way max lag is currently tracked by replica_verification_tool.

For this to measure as 0, kafka.tools.ReplicaVerificationTool cannot report a lag greater than zero at any point before this wait_until call.

For example, I tried inserting a time.sleep(10) just before wait_until, and the test failed. Because there was a momentary lag before the follower caught up, get_max_lag_for_partition stayed at 10 forever, even though the lag measured by kafka.tools.ReplicaVerificationTool had dropped back to 0.
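
The difference between a sticky maximum and the most recent reading can be sketched as follows. This is a toy illustration only: the `LagTracker` class, its attribute names, and the lag sequence are hypothetical and are not taken from the PR's code.

```python
class LagTracker:
    """Toy bookkeeping for per-partition lag (hypothetical, for illustration)."""

    def __init__(self):
        self.latest = {}   # most recent lag reported for each partition
        self.maximum = {}  # largest lag ever reported for each partition

    def record(self, topic_partition, lag):
        self.latest[topic_partition] = lag
        previous_max = self.maximum.get(topic_partition, 0)
        self.maximum[topic_partition] = max(previous_max, lag)


tracker = LagTracker()
# The follower lags momentarily and then catches up again.
for reported_lag in [0, 10, 0]:
    tracker.record("topic,0", reported_lag)

print(tracker.latest["topic,0"])   # 0: reflects the catch-up
print(tracker.maximum["topic,0"])  # 10: stays at the momentary spike
```

A check that waits for lag to reach 0 can only succeed against the latest value. Against the running maximum, a single transient spike makes it fail permanently, which matches the behaviour described above.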

@SinghAsDev
Contributor Author

@granders up for your review again. Some of the issues you have seen in the past might be due to ReplicaVerificationTool not getting killed properly in previous runs. I fixed that and ran the test a few times in a loop. Hopefully it checks out at your end as well.

Contributor

Just noticed this - I think we want clean_shutdown=False here, to ensure we really kill the process if the normal attempt to shut it down gracefully failed.
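
A minimal local sketch of the distinction, assuming clean_shutdown selects between a catchable termination signal and an unconditional kill, as the comment implies. The `kill_process` helper below is a hypothetical stand-in built on `subprocess`; it is not the test framework's own method.

```python
import subprocess
import sys


def kill_process(proc, clean_shutdown=True):
    """Signal a child process (hypothetical helper, for illustration only).

    clean_shutdown=True sends SIGTERM, which the process may handle or ignore.
    clean_shutdown=False sends SIGKILL, which it cannot ignore (on POSIX).
    """
    if proc.poll() is not None:
        return  # already exited, nothing to do
    if clean_shutdown:
        proc.terminate()
    else:
        proc.kill()


# Stand-in for a long-running tool process.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Cleanup path: do not rely on an earlier graceful stop having worked.
kill_process(child, clean_shutdown=False)
child.wait(timeout=10)
print(child.returncode is not None)
```

Using the forceful variant in cleanup means a process that survived the graceful attempt cannot leak into the next test run.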

Contributor Author

Right, clean shutdown is not required here; I misunderstood the param. Changing this.

@granders
Contributor

granders commented Apr 5, 2016

Thanks for the updates @SinghAsDev - only a couple minor comments

@granders
Contributor

@SinghAsDev Thanks for the update - just need to rebase and this should be good

@SinghAsDev
Contributor Author

Done, thanks for the reviews!

On Fri, Apr 22, 2016 at 9:49 AM, Geoff notifications@github.com wrote:

@SinghAsDev Thanks for the update - just need to rebase and this should be good



Regards,
Ashish

@granders
Contributor

@SinghAsDev You're welcome!
@ewencp Mind taking a final look?

self.partition_lag[topic_partition] = lag

"""
Get latest lag for given topic-partition

Contributor

doc strings go inside the method being documented
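
A short sketch of why the placement matters. The class and method names here are hypothetical; only the docstring text comes from the snippet under review.

```python
class Misplaced:
    def first(self):
        pass

    """Get latest lag for given topic-partition"""
    def get_lag(self, topic_partition):
        # The string above is a bare expression in the class body,
        # so it is not attached to this method.
        return 0


class Correct:
    def get_lag(self, topic_partition):
        """Get latest lag for given topic-partition"""
        return 0


print(Misplaced.get_lag.__doc__)  # None
print(Correct.get_lag.__doc__)    # Get latest lag for given topic-partition
```

Only a string literal that is the first statement inside the function body becomes its `__doc__`, which is what `help()` and documentation tools read.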

Contributor Author

Ah.. fixed. Thanks!

On Fri, Apr 22, 2016 at 3:25 PM, Ewen Cheslack-Postava <notifications@github.com> wrote:

In tests/kafkatest/services/replica_verification_tool.py
#1160 (comment):

+    def _worker(self, idx, node):
+        cmd = self.start_cmd(node)
+        self.logger.debug("ReplicaVerificationTool %d command: %s" % (idx, cmd))
+        self.security_config.setup_node(node)
+        for line in node.account.ssh_capture(cmd):
+            self.logger.debug("Parsing line:{}".format(line))
+            parsed = re.search('.*max lag is (.+?) for partition \[(.+?)\] at', line)
+            if parsed:
+                lag = int(parsed.group(1))
+                topic_partition = parsed.group(2)
+                self.logger.debug("Setting max lag for {} as {}".format(topic_partition, lag))
+                self.partition_lag[topic_partition] = lag
+
+    """
+    Get latest lag for given topic-partition

doc strings go inside the method being documented
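
For reference, a minimal sketch of how a pattern like the one in the quoted diff pulls the lag and the partition out of a line of tool output. It assumes the square brackets around the partition are meant literally and are therefore escaped, and the sample line is an assumed format, not captured output.

```python
import re

# Brackets escaped so they match literally (assumption about the intended pattern).
LAG_PATTERN = re.compile(r'.*max lag is (.+?) for partition \[(.+?)\] at')

# Assumed example of a line printed by the verification tool.
line = "max lag is 10 for partition [test_topic,0] at offset 42 among 1 partitions"

partition_lag = {}
parsed = LAG_PATTERN.search(line)
if parsed:
    lag = int(parsed.group(1))         # "10" -> 10
    topic_partition = parsed.group(2)  # "test_topic,0"
    partition_lag[topic_partition] = lag

print(partition_lag)  # {'test_topic,0': 10}
```

Each matching line overwrites the stored value for its partition, so the dictionary holds the most recent reading rather than a running maximum.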



Regards,
Ashish

3 participants