
2.3 release.py sftp retries#7450

Closed
mumrah wants to merge 2 commits into apache:2.3 from mumrah:2.3-sftp-retries

Conversation

@mumrah
Member

@mumrah mumrah commented Oct 4, 2019

Add retry capability to the cmd function in release.py. This allows for selectively retrying certain commands which might be flaky, like the SFTP puts.

These uploads fail often for me because of SSH connection loss.

@mumrah
Member Author

mumrah commented Oct 5, 2019

Tried this for RC1 of 2.3.1 and saw the following during the most recent run:

Uploading artifacts in /Users/david.arthur/Code/Confluent/prod/kafka/.release_work_dir/kafka-2.3.1-rc1/javadoc/org/apache/kafka/streams/kstream to your Apache home directory [u'sftp', u'-b', u'-', u'davidarthur@home.apache.org'] --> 
put /Users/david.arthur/Code/Confluent/prod/kafka/.release_work_dir/kafka-2.3.1-rc1/javadoc/org/apache/kafka/streams/kstream/package-frame.html public_html/kafka-2.3.1-rc1/javadoc/org/apache/kafka/streams/kstream/package-frame.html

> ssh: connect to host home.apache.org port 22: No route to host
> Connection closed
> 
Retrying... 2 remaining retries
Uploading artifacts in /Users/david.arthur/Code/Confluent/prod/kafka/.release_work_dir/kafka-2.3.1-rc1/javadoc/org/apache/kafka/streams/kstream to your Apache home directory [u'sftp', u'-b', u'-', u'davidarthur@home.apache.org'] 
> sftp> 
> sftp> put /Users/david.arthur/Code/Confluent/prod/kafka/.release_work_dir/kafka-2.3.1-rc1/javadoc/org/apache/kafka/streams/kstream/package-frame.html public_html/kafka-2.3.1-rc1/javadoc/org/apache/kafka/streams/kstream/package-frame.html
> 

and the script continues on.

@mumrah mumrah requested review from ewencp and mjsax October 5, 2019 01:52
Contributor

@ewencp ewencp left a comment

LGTM, just a few minor questions.

Comment thread release.py
cmd_arg = cmd_arg.split()

allow_failure = kwargs.pop("allow_failure", False)
num_retries = kwargs.pop("num_retries", 0)
Contributor

What kinds of failures are we trying to mask here? Retries frequently also need a delay between attempts.

Member Author

Added a sleep in the new PR
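
For illustration only, here is a minimal sketch of what a delay between retries could look like. It is not the code from either PR: `run_with_retries`, `retry_delay_secs`, and the injectable `sleep` parameter are hypothetical names, the fixed 5-second default is an assumption, and it is written for Python 3 rather than the Python 2 the script appears to use. It keeps the recursive shape of the change under review and only adds the pause.

```python
import subprocess
import time


def run_with_retries(cmd_arg, num_retries=0, retry_delay_secs=5, sleep=time.sleep):
    """Run cmd_arg; on a non-zero exit, wait and retry up to num_retries times.

    Hypothetical helper for illustration, not the release.py `cmd` function.
    """
    try:
        return subprocess.check_output(cmd_arg, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as e:
        # Show the output of the failed attempt before deciding what to do.
        print(e.output.decode(errors="replace"))
        if num_retries <= 0:
            # Out of retries: let the caller see the failure.
            raise
        print("Retrying in %d seconds... %d remaining retries"
              % (retry_delay_secs, num_retries - 1))
        # Give a dropped connection some time to recover (assumed fixed delay).
        sleep(retry_delay_secs)
        return run_with_retries(cmd_arg, num_retries - 1, retry_delay_secs, sleep)
```

Passing `sleep` as a parameter is only there so the delay can be replaced in tests; a real script would likely call `time.sleep` directly.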

Comment thread release.py
kwargs['num_retries'] = num_retries - 1
kwargs['allow_failure'] = allow_failure
print("Retrying... %d remaining retries" % (num_retries - 1))
return cmd(action, cmd_arg, *args, **kwargs)
Contributor

Probably fine for now given the maximum number of retries we specify, but it would be preferable to do this iteratively, which also avoids copying and managing kwargs like this.
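
A rough sketch of the iterative shape suggested here, under stated assumptions: `run_iteratively` is a hypothetical stand-in rather than the actual `cmd` function, and it targets Python 3. The loop passes `kwargs` through unchanged on every attempt, so nothing has to be written back into it before retrying.

```python
import subprocess


def run_iteratively(cmd_arg, num_retries=0, **kwargs):
    """Try cmd_arg up to num_retries + 1 times in a loop.

    Hypothetical helper for illustration. Returns the command output on the
    first success and re-raises the last failure once retries are exhausted.
    """
    attempts = num_retries + 1
    for attempt in range(1, attempts + 1):
        try:
            # kwargs is reused as-is; no per-attempt bookkeeping is needed.
            return subprocess.check_output(cmd_arg, stderr=subprocess.STDOUT, **kwargs)
        except subprocess.CalledProcessError as e:
            print(e.output)
            remaining = attempts - attempt
            if remaining == 0:
                raise
            print("Retrying... %d remaining retries" % remaining)
```

Compared with the recursive version, the retry budget lives in a local loop variable, so the call depth stays constant regardless of how many retries are requested.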

@ijuma
Member

ijuma commented Jan 30, 2020

@mumrah you may want to get this into trunk before the upcoming release.

@mumrah
Member Author

mumrah commented Jan 30, 2020

Closing in favor of #8021 which is based on trunk

@mumrah mumrah closed this Jan 30, 2020
3 participants