quiche: implement quic_port_utils #6488
Signed-off-by: Dan Zhang <danzh@google.com>
/assign @wu-bin
)

envoy_cc_test_library(
    name = "quic_platform_port_utils_impl_lib",
Is it possible to move this library to //test? That way the source code in this library will be excluded from the coverage CI.
Its header file has to be in the same directory as the other impl files, unless we want to configure the genrule to change the #include path in its API file, which doesn't seem nice to me.
const int rc = ::bind(fd, reinterpret_cast<const sockaddr*>(&ip_.ipv4_.address_),
                      sizeof(ip_.ipv4_.address_));
return {rc, errno};
auto& os_syscalls = Api::OsSysCallsSingleton::get();
Is this change related to this PR? Or is it a drive-by fix?
It adds a mockable layer of system calls for testing.
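To illustrate what a mockable layer of system calls buys the tests, here is a minimal sketch. The names OsSysCallsSketch, RealOsSysCalls, PortExhaustedSysCalls and tryBind are hypothetical and are not Envoy's actual classes; the only thing taken from the diff is the idea of returning the bind() result together with errno.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>

// Return code of a system call plus the errno observed right after it.
struct SysCallIntResult {
  int rc;
  int errno_value;
};

// Hypothetical interface (an assumption for illustration, not Envoy's API).
class OsSysCallsSketch {
public:
  virtual ~OsSysCallsSketch() = default;
  virtual SysCallIntResult bind(int fd, const sockaddr* addr, socklen_t len) = 0;
};

// Production implementation: forwards to the real ::bind().
class RealOsSysCalls : public OsSysCallsSketch {
public:
  SysCallIntResult bind(int fd, const sockaddr* addr, socklen_t len) override {
    const int rc = ::bind(fd, addr, len);
    // Only report errno when the call failed, so stale values do not leak out.
    return {rc, rc == -1 ? errno : 0};
  }
};

// Test double: every bind() fails with EADDRINUSE, mimicking port exhaustion.
class PortExhaustedSysCalls : public OsSysCallsSketch {
public:
  SysCallIntResult bind(int, const sockaddr*, socklen_t) override {
    ++calls;
    return {-1, EADDRINUSE};
  }
  int calls = 0;
};

// Code under test depends on the interface, so a test can inject the fake.
bool tryBind(OsSysCallsSketch& os, int fd, const sockaddr* addr, socklen_t len) {
  return os.bind(fd, addr, len).rc == 0;
}
```

A test can pass PortExhaustedSysCalls to tryBind() and exercise the failure path without actually exhausting ports on the machine.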
      return port;
    }
  }
  RELEASE_ASSERT(false, "Failed to pick a port for test.");
How about just ASSERT? I think in dbg mode we also want it to die if it can't pick an unused port.
RELEASE_ASSERT dies in both cases, but ASSERT doesn't die in opt mode.
Oh, I got it reversed, sorry.
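The distinction discussed in this thread can be sketched with two hypothetical macros. SKETCH_ASSERT and SKETCH_RELEASE_ASSERT are illustration-only names, not Envoy's definitions, and keying the debug-only variant off NDEBUG is an assumption about how an opt build is configured.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Always active: aborts in both debug and optimized builds.
#define SKETCH_RELEASE_ASSERT(cond, msg)                                       \
  do {                                                                         \
    if (!(cond)) {                                                             \
      std::fprintf(stderr, "assert failure: %s. Details: %s\n", #cond, msg);   \
      std::abort();                                                            \
    }                                                                          \
  } while (0)

#ifdef NDEBUG
// Optimized build (assumed to define NDEBUG): the check is compiled out.
// sizeof keeps the expression syntactically checked without evaluating it.
#define SKETCH_ASSERT(cond)                                                    \
  do {                                                                         \
    (void)sizeof(cond);                                                        \
  } while (0)
#else
// Debug build: behaves like the release variant.
#define SKETCH_ASSERT(cond) SKETCH_RELEASE_ASSERT(cond, "")
#endif

// Hypothetical helper showing both macros in use.
int checkedPort(int port) {
  SKETCH_ASSERT(port <= 65535);                                        // debug-only check
  SKETCH_RELEASE_ASSERT(port > 0, "Failed to pick a port for test.");  // dies in every build
  return port;
}
```

Under these assumptions, a "pick a port or die" helper needs the always-active variant if it must also terminate in optimized test runs.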
// Only try to get a port for limited times.
std::vector<Envoy::Network::Address::IpVersion> supported_versions =
    Envoy::TestEnvironment::getIpVersionsForTest();
for (size_t i = 0; i < 30000; ++i) {
30000 iterations looks very high :) I think O(10) is enough.
On other platforms, we use similarly large numbers.
I'd go smaller here as well, if for no other reason than that I trust CircleCI less than Blaze and Chrome's test frameworks :-/
// Fail bind call's to mimic port exhaustion.
EXPECT_CALL(os_sys_calls, bind(_, _, _))
    .WillRepeatedly(Return(Envoy::Api::SysCallIntResult{-1, EADDRINUSE}));
EXPECT_DEATH(QuicPickUnusedPortOrDie(), "Failed to pick a port for test.");
In #6393 I've changed
- QuicPlatformTest from TEST to TEST_F.
- EXPECT_DEATH to EXPECT_DEATH_LOG_TO_STDERR.
Please sync and do the same for the new tests.
SslIpVersionsClientType/GrpcSslClientIntegrationTest.BasicSslRequestWithClientCert/IPv4_EnvoyGrpc failed in the coverage test. It shouldn't be related to this PR.
alyssawilk left a comment
@htuch can you glance at the bzl changes?
    break;
  }
  if (port == 0) {
    // Just get a port from findOrCheckFreePort(), check its usability in the rest address
rest -> other
(or rest of)
for (size_t i = 0; i < 30000; ++i) {
  uint32_t port = 0;
  bool available = true;
  for (auto ip_version : supported_versions) {
I should probably know this offhand, but do we need to loop here? If we're listening on IPv6 inaddr any, that covers IPv4 over mapped addresses, but I'm not sure if listening on 127.0.0.1:444 and then ::444 would fail on the platforms we support.
> but I'm not sure if listening on 127.0.0.1:444 and then ::444 would fail on the platforms we support.
What do you mean?
The port returned by this function should be bindable to both a v4 and a v6 address (in separate tests) if the test supports both address families. So here we need to check that we can create a socket and bind it to an address in both v4 and v6.
I replaced the version loop with a check on a single version, per our discussion. Thanks for the suggestion!
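The single-version check relies on a dual-stack socket: an IPv6 socket with IPV6_V6ONLY disabled also accepts IPv4 traffic via v4-mapped addresses. Below is a minimal POSIX sketch of that idea. The function name pickFreeUdpPortDualStack is hypothetical, using UDP is an assumption (QUIC runs over UDP), and this is not the code in Envoy::Network::Test::findOrCheckFreePort().

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns a port that the kernel assigned to a dual-stack UDP socket, or -1 on
// failure (for example when the host has no IPv6 support).
int pickFreeUdpPortDualStack() {
  const int fd = ::socket(AF_INET6, SOCK_DGRAM, 0);
  if (fd < 0) {
    return -1;
  }

  // Disable IPV6_V6ONLY so the socket also covers v4-mapped v6 addresses.
  const int v6only = 0;
  if (::setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &v6only, sizeof(v6only)) != 0) {
    ::close(fd);
    return -1;
  }

  sockaddr_in6 addr{};
  addr.sin6_family = AF_INET6;
  addr.sin6_addr = in6addr_any;  // wildcard address
  addr.sin6_port = htons(0);     // port 0: let the kernel choose

  int port = -1;
  socklen_t len = sizeof(addr);
  if (::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0 &&
      ::getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) == 0) {
    port = ntohs(addr.sin6_port);
  }

  // The socket is released here, so another process may grab the port before
  // the caller re-binds it; the approach is inherently racy.
  ::close(fd);
  return port;
}
```

If the bind succeeds on such a socket, the port was free for both families at that moment, so no per-version loop is needed.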
FWIW if your coverage fail was gRPC I think that's fixed at HEAD, so a master merge may help
Signed-off-by: Dan Zhang <danzh@google.com>
alyssawilk left a comment
One last comment request, otherwise LGTM
std::vector<Envoy::Network::Address::IpVersion> supported_versions =
    Envoy::TestEnvironment::getIpVersionsForTest();
ASSERT(!supported_versions.empty());
Envoy::Network::Address::IpVersion ip_version = supported_versions.size() == 1
Maybe a comment as to why this works?
I added a comment a few lines below, inside the for loop.
Fair point. I think it would be more useful here, though, if folks are reading top-down.
Signed-off-by: Dan Zhang <danzh@google.com>
htuch left a comment
Build changes look good, but just a comment on the port picking.
if (addr_port != nullptr && addr_port->ip() != nullptr) {
  // Find a port.
  return addr_port->ip()->port();
}
@curiouserrandy has also written code like this, so might have comments.
This looks like a lot of hassle to do something the OS will do for you (i.e. pick a port if you bind port 0), and it is racy as well. Can you give me a sense as to why you're doing it this way as opposed to using bind(..., 0) at point of use?
I also share Harvey's concern about mixing apparently production and apparently test code.
See my reply to Harvey's comment.
Ah, gotcha. Pity :-}. Next question: As I read the code (including findOrCheckFreePort) what is being done is that the address passed is being bound with port == 0, the port returned by the kernel is being returned, and the bound socket is being released. Why is that wrapped in a "try 300 times" loop? If a bind with port 0 fails, I'd think there's something badly wrong somewhere, and retrying wouldn't help.
(Meta-comment: I'm still in the "asking questions to understand the code" phase of the review, so if you and Harvey get to closure, don't block on me.)
> Ah, gotcha. Pity :-}. Next question: As I read the code (including findOrCheckFreePort) what is being done is that the address passed is being bound with port == 0, the port returned by the kernel is being returned, and the bound socket is being released. Why is that wrapped in a "try 300 times" loop? If a bind with port 0 fails, I'd think there's something badly wrong somewhere, and retrying wouldn't help.
Would temporarily running out of ports lead to a bind failure on port 0? Or some other transient system failure?
Yeah, I think the loop made sense when we were doing v4/v6 checks, but now that we've done away with that I suspect we can kill the loop off entirely.
It could be transient, but it's functionally equivalent to sleep(1)-and-try-again, so I think we can skip it.
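The mechanism described in this thread (bind with port 0, read back the port the kernel chose, release the socket) can be sketched in a few lines of POSIX code. The name pickFreeUdpPortV4 is hypothetical, and using a UDP socket on the IPv4 loopback address is an assumption for illustration; this is not the body of findOrCheckFreePort().

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns a port the kernel considered free at the time of the call, or -1 on
// failure. There is no retry loop: if binding port 0 fails, retrying right
// away is unlikely to help.
int pickFreeUdpPortV4() {
  const int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return -1;
  }

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // 127.0.0.1
  addr.sin_port = htons(0);                       // port 0: kernel picks one

  int port = -1;
  socklen_t len = sizeof(addr);
  if (::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 0 &&
      ::getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) == 0) {
    port = ntohs(addr.sin_port);  // the port the kernel actually assigned
  }

  // Releasing the socket frees the port again, which is the source of the
  // race mentioned above: nothing reserves it until the caller re-binds.
  ::close(fd);
  return port;
}
```

A caller that can keep the socket open and hand it to the server avoids the race entirely; the sketch only mirrors the pick-then-release flow being reviewed.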
namespace quic {

int QuicPickUnusedPortOrDieImpl() {
This is test-only, right? I would ideally like to see a better naming or structuring to indicate code is for test vs. prod., since there are different requirements here as far as error handling and robustness. Are we restricted by the QUICHE layout?
Also, why does QUICHE do this? In Envoy, we just bind zero on the server side and then communicate the effective port OOB to the client in integration tests. Could QUICHE do this?
> This is test-only, right? I would ideally like to see a better naming or structuring to indicate code is for test vs. prod., since there are different requirements here as far as error handling and robustness. Are we restricted by the QUICHE layout?
Talked with Harvey about a reasonable approach: such test-only code can be moved to /test and be included by a dummy impl.h file in /source.
> Also, why does QUICHE do this? In Envoy, we just bind zero on the server side and then communicate the effective port OOB to the client in integration tests. Could QUICHE do this?
QUIC e2e tests don't have this server->client channel to communicate the port the server picks after it starts listening.
Signed-off-by: Dan Zhang <danzh@google.com>
Moved the definitions to //test/. PTAL.
Signed-off-by: Dan Zhang <danzh@google.com>
@@ -0,0 +1,10 @@

// If it supports both v4 and v6, checking availability under v6 with IPV6_V6ONLY
// set to false is sufficient because such socket can be used on v4-mapped
// v6 address.
Envoy::Network::Address::IpVersion ip_version = supported_versions.size() == 1
// consumed or referenced directly by other Envoy code. It serves purely as a
// porting layer for QUICHE.

#include "test/extensions/quic_listeners/quiche/platform/quic_port_utils_test_impl.h"
Can you add a comment explaining this redirect for posterity? Thanks
Signed-off-by: Dan Zhang <danzh@google.com>
Thanks @danzh2010, this is good to merge once CI passes.
Implement QuicPickUnusedPortOrDie() and QuicRecyclePort() backed by Envoy::Network::Test::findOrCheckFreePort().
Added include_prefix to the envoy_cc_test_library argument list and pass it down, so that QUICHE test-only impls can be included by API header files with a relative #include path.
Risk Level: low, not in use
Testing: added new tests in test/extensions/quic_listeners/quiche/platform/quic_platform_test.cc
Part of #2557