
[ServiceBus] Keep connection alive #12937

Merged: yunhaoling merged 5 commits into Azure:master from yunhaoling:keep-conn-alive on Aug 12, 2020

Conversation

@yunhaoling yunhaoling commented Aug 7, 2020

The service side enforces a 240-second connection idle timeout. In this PR, we turn on the keep_alive feature supported by uamqp to keep the connection active. See issue #11935.

This feature calls connection.do_work() every keep_alive_interval seconds (the default is 30, the same value used in EH track1) and sends an empty frame to the service if uamqp finds that more than 0.5 * remote-idle-timeout has passed since the last connection activity.
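The behavior described above can be sketched as a small simulation. This is an illustrative model only, not uamqp's implementation: the function name `simulate_keep_alive` and the assumption that the connection is otherwise idle (last activity at t=0) are hypothetical.

```python
from typing import List


def simulate_keep_alive(
    keep_alive_interval: float,
    remote_idle_timeout: float,
    duration: float,
) -> List[float]:
    """Return the times (seconds) at which an empty frame would be sent.

    Hypothetical model: the connection carries no other traffic, and the
    last real activity happened at t=0.
    """
    threshold = 0.5 * remote_idle_timeout
    last_activity = 0.0
    sent_at = []
    t = keep_alive_interval
    while t <= duration:
        # Each tick stands in for one connection.do_work() call.
        if t - last_activity >= threshold:
            sent_at.append(t)
            last_activity = t  # the empty frame itself counts as activity
        t += keep_alive_interval
    return sent_at


# 30s polling against a 240s idle timeout, observed for 10 minutes.
print(simulate_keep_alive(30, 240, 600))  # [120, 240, 360, 480, 600]
```

Under this model, a 30-second polling interval results in one empty frame every 120 seconds on an idle connection, well inside the 240-second limit.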

I don't plan to expose it as a publicly configurable option in this preview because:

  1. .NET and JS keep the connection alive by default, and users can't turn it off or adjust it.
  2. This leaves us room to decide later how we want to align the feature.

@yunhaoling
Contributor Author

/azp run python - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yunhaoling added the Client label Aug 10, 2020
@yunhaoling
Contributor Author

/azp run python - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

self.auth_timeout = kwargs.get("auth_timeout", 60) # type: int
self.encoding = kwargs.get("encoding", "UTF-8")
self.auto_reconnect = kwargs.get("auto_reconnect", True)
self.keep_alive = kwargs.get("keep_alive", 30)

I assume 30 aligns with cross-language consistency? (just double-checking)

Contributor Author

@yunhaoling yunhaoling Aug 11, 2020

This 30-second default is the value used in eventhub track1.

The mechanism in uamqp is: every 30 seconds, call connection.do_work(); connection.do_work() checks whether 0.5 * the remote idle timeout (240s) has passed and, if so, sends an empty frame.

.NET sends an empty frame roughly every 50 seconds.
JS sends an empty frame every 0.5 * remote idle timeout.

I think 30 seconds is a reasonable interval in our case.

Contributor

I'm wondering how we would tell users to configure this value. 240s is the hard expiry time, so is 220s always better than 200s for keeping the connection alive, since it does the job with less traffic? If so, a bool would be better than a number.
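To make the trade-off in this question concrete, here is a rough arithmetic sketch. It assumes the polling model described earlier in this thread (on each tick, send an empty frame once half the remote idle timeout has elapsed); the helper name `effective_frame_gap` is hypothetical and nothing here is taken from uamqp itself.

```python
import math


def effective_frame_gap(keep_alive_interval: float,
                        remote_idle_timeout: float = 240) -> float:
    """Seconds between empty frames on an otherwise idle connection.

    Assumed model: a frame goes out on the first polling tick at or after
    0.5 * remote_idle_timeout since the last activity.
    """
    threshold = 0.5 * remote_idle_timeout
    ticks = math.ceil(threshold / keep_alive_interval)
    return ticks * keep_alive_interval


for interval in (30, 50, 200, 220, 250):
    gap = effective_frame_gap(interval)
    status = "within idle timeout" if gap < 240 else "exceeds idle timeout"
    print(f"interval={interval}s -> frame every {gap}s ({status})")
```

Under these assumptions, 220s does produce fewer frames than 200s, but it leaves only a 20-second margin before the 240-second expiry, and anything at or above 240s would not keep the connection alive at all.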

Contributor Author

@YijunXieMS, right now we don't expose this parameter to users, to stay consistent with other languages: JS and .NET don't allow users to set or tweak this interval, and it is turned on by default in their SDKs.

If customers need to configure the value or toggle the switch, this could be a post-GA feature.

Member

@KieranBrantnerMagee KieranBrantnerMagee Aug 12, 2020

This echoes a question I posed for autolockrenew, and I don't recall if we settled on an answer: Should we disallow setting a keepalive > 240s? Or caveat emptor? I might classify it a "semantic error"; if someone wants to disable keepalive they should show intent and pass None. (Context: Have had at least one user who adjusted a value such as this and unintentionally ran into lock expiry as a result)

To be precise, this is likely a consideration for whenever we expose this setting and thus lock it in for backcompat, but it is worth mentioning and keeping in mind in case there are strong feelings. (I've added it to our discussion list)
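If the setting were ever exposed, the "semantic error" idea above might look roughly like the following. This is a hypothetical sketch: `validate_keep_alive` and `SERVICE_IDLE_TIMEOUT` are made-up names, not part of azure-servicebus or uamqp, and the exact cutoff is an assumption based on the 240-second figure in the PR description.

```python
from typing import Optional

# Assumed from the PR description: the service drops idle connections
# after 240 seconds.
SERVICE_IDLE_TIMEOUT = 240


def validate_keep_alive(keep_alive: Optional[float]) -> Optional[float]:
    """Hypothetical validation for a user-supplied keep-alive interval.

    None means "keep-alive intentionally disabled"; any number must be
    positive and no larger than the service idle timeout.
    """
    if keep_alive is None:
        return None  # explicit opt-out, as suggested in the comment above
    if isinstance(keep_alive, bool) or not isinstance(keep_alive, (int, float)):
        raise TypeError("keep_alive must be a number of seconds or None")
    if keep_alive <= 0:
        raise ValueError("keep_alive must be positive; pass None to disable")
    if keep_alive > SERVICE_IDLE_TIMEOUT:
        raise ValueError(
            f"keep_alive={keep_alive}s exceeds the {SERVICE_IDLE_TIMEOUT}s "
            "service idle timeout; pass None to disable keep-alive"
        )
    return keep_alive
```

Rejecting out-of-range values up front forces a user who wants no keep-alive to say so with `None`, rather than silently ending up with a connection that expires.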

Contributor Author

As per the in-person discussion OOB:

The flag proposal is preferred because it's simple: users don't need to care about the value.
But we will leave it untouched (turned on by default) until there are customer requests to turn it off.

@yunhaoling
Contributor Author

/azp run python - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Labels

Client, Service Bus

3 participants