Discussion:
Connection takes too long in dual-stack if IPv6 starting family is not available or blocked and has multiple addresses.
Dmitry Karpov via curl-library
2021-06-03 21:01:46 UTC
Permalink
Hello,

Trying dual-stack, I noticed that if there are multiple IPv6 addresses for the host, but IPv6 is blocked somewhere (i.e. firewall in the router, VPN etc) then it takes too long to establish connection via available IPv4 addresses.

I debugged this issue and noticed that it happens because libCurl doesn't take into account socket select() errors and tries all addresses for the failing family before starting the next IPv4 family after the Happy Eyeballs timeout. This makes dual-stack connection phase a several times longer than if IPv4 resolution is forced.

The problem can be mitigated if libCurl detects socket select() errors hinting that IPv6 is not available when trying the first IPv6 address, and switches immediately to IPv4 family thus skipping over the failing IPv6.

I tried this approach with the following change in the Curl_is_connected() function (lib/connect.c, line 851):

diff --git a/lib/connect.c b/lib/connect.c
index d9317f378..9ab4d4d50 100644
--- a/lib/connect.c
+++ b/lib/connect.c
@@ -993,11 +993,25 @@ CURLcode Curl_is_connected(struct Curl_easy *data,
ipaddress, conn->port,
Curl_strerror(error, buffer, sizeof(buffer)));
#endif
-
- conn->timeoutms_per_addr[i] = conn->tempaddr[i]->ai_next == NULL ?
- allow : allow / 2;
- ainext(conn, i, TRUE);
- status = trynextip(data, conn, sockindex, i);
+ /* If we have a socket error on select for starting family then it
+ * most likely means that the starting family is not available, and we
+ * can skip the rest of the failing family addresses and try the next
+ * family to avoid waiting for too long.*/
+ if(i == 0 && rc & CURL_CSELECT_ERR) {
+ /* close the current family socket, if open */
+ if (conn->tempsock[i] != CURL_SOCKET_BAD) {
+ Curl_closesocket(conn, conn->tempsock[i]);
+ conn->tempsock[i] = CURL_SOCKET_BAD;
+ }
+ error = 0;
+ status = trynextip(conn, sockindex, 1);
+ } else {
+ conn->timeoutms_per_addr[i] = conn->tempaddr[i]->ai_next == NULL ?
+ allow : allow / 2;
+ ainext(conn, i, TRUE);
+ status = trynextip(conn, sockindex, i);
+ }
+
if((status != CURLE_COULDNT_CONNECT) ||
conn->tempsock[other] == CURL_SOCKET_BAD)
/* the last attempt failed and no other sockets remain open */


I am not sure if this is the right fix, but it helped to improve the connection phase time for such cases very dramatically and make it close to pure IPv4 connection times.
My tests also didn't reveal any bad side effects of this change.

Thanks,
Dmitry Karpov
Daniel Stenberg via curl-library
2021-06-04 13:35:18 UTC
Permalink
Post by Dmitry Karpov via curl-library
I am not sure if this is the right fix, but it helped to improve the
connection phase time for such cases very dramatically and make it close to
pure IPv4 connection times. My tests also didn't reveal any bad side effects
of this change.
Thanks for this!

From my understanding of this issue, this problem happens because
Curl_is_connected() gets called after a while but not because of a timeout on
the socket but because of an error.

This makes 'rc' not contain zero so the conditional check if it should start
the other family for connecting doesn't run but instead it kicks off another
connect attempt to the next IPv6 address.

If this is correct, shouldn't we then be able to fix the issue by making sure
that we check the happy-eyeballs-timeout both for timeouts and for errors? To
me, that seems like a simpler take than your proposal.

I mean like this:

diff --git a/lib/connect.c b/lib/connect.c
index d9317f378..2c209fa56 100644
--- a/lib/connect.c
+++ b/lib/connect.c
@@ -931,18 +931,10 @@ CURLcode Curl_is_connected(struct Curl_easy *data,
conn->timeoutms_per_addr[i]) {
infof(data, "After %" CURL_FORMAT_TIMEDIFF_T
"ms connect time, move on!\n", conn->timeoutms_per_addr[i]);
error = ETIMEDOUT;
}
-
- /* should we try another protocol family? */
- if(i == 0 && !conn->bits.parallel_connect &&
- (Curl_timediff(now, conn->connecttime) >=
- data->set.happy_eyeballs_timeout)) {
- conn->bits.parallel_connect = TRUE; /* starting now */
- trynextip(data, conn, sockindex, 1);
- }
}
else if(rc == CURL_CSELECT_OUT || conn->bits.tcp_fastopen) {
if(verifyconnect(conn->tempsock[i], &error)) {
/* we are connected with TCP, awesome! */

@@ -1002,10 +994,20 @@ CURLcode Curl_is_connected(struct Curl_easy *data,
conn->tempsock[other] == CURL_SOCKET_BAD)
/* the last attempt failed and no other sockets remain open */
result = status;
}
}
+
+ if(!rc || error) {
+ /* should we try another protocol family? */
+ if(i == 0 && !conn->bits.parallel_connect &&
+ (Curl_timediff(now, conn->connecttime) >=
+ data->set.happy_eyeballs_timeout)) {
+ conn->bits.parallel_connect = TRUE; /* starting now */
+ trynextip(data, conn, sockindex, 1);
+ }
+ }
}

if(result &&
(conn->tempsock[0] == CURL_SOCKET_BAD) &&
(conn->tempsock[1] == CURL_SOCKET_BAD)) {
--
/ daniel.haxx.se
| Commercial curl support up to 24x7 is available!
| Private help, bug fixes, support, ports, new features
| https://www.wolfssl.com/contact/
-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://c
Daniel Stenberg via curl-library
2021-06-07 11:36:00 UTC
Permalink
On Fri, 4 Jun 2021, Dmitry Karpov wrote:

(I've put the CC back to the libcurl list to allow everyone to participate)
Initially, I also had the same idea to use happy eyeballs timeout to cover
cases of connection socket errors, but with host resolutions having multiple
IPv6 addresses, it was really a waste of time to try next addresses of the
failed family knowing that they would definitely fail. I think that in the
current state, when Curl tries family addresses sequentially, happy eyeball
timeout should be applied only for "good" connections that don't issue
socket errors, and socket errors should be probably treated as a "family
failure".
Hm, what if we also consider the first *error* on IPv6 as a signal to kick off
the IPv4 connect in addition to the timeout? See updated patch below.
If Curl tried parallel connections for all addresses inside the family
instead of doing it sequentially, then the idea to apply happy eyeballs
timeout to the socket errors would work very well. Parallel connections
would allow to fail the family much faster if something was wrong with it,
and this would also be beneficial because the client would get the fastest
connection inside the family - which is nice.
Yes, and I'm not against trying this out - this idea has been mentioned before
in HTTP implementor "circles". I would however imagine that it would need to
be capped at some value because some names will return MANY addresses so this
could be rather excessive, which could become a problem for apps who fire up
many transfers at once.

--- patch proposal v2

diff --git a/lib/connect.c b/lib/connect.c
index d9317f378..76e9d50dd 100644
--- a/lib/connect.c
+++ b/lib/connect.c
@@ -931,18 +931,10 @@ CURLcode Curl_is_connected(struct Curl_easy *data,
conn->timeoutms_per_addr[i]) {
infof(data, "After %" CURL_FORMAT_TIMEDIFF_T
"ms connect time, move on!\n", conn->timeoutms_per_addr[i]);
error = ETIMEDOUT;
}
-
- /* should we try another protocol family? */
- if(i == 0 && !conn->bits.parallel_connect &&
- (Curl_timediff(now, conn->connecttime) >=
- data->set.happy_eyeballs_timeout)) {
- conn->bits.parallel_connect = TRUE; /* starting now */
- trynextip(data, conn, sockindex, 1);
- }
}
else if(rc == CURL_CSELECT_OUT || conn->bits.tcp_fastopen) {
if(verifyconnect(conn->tempsock[i], &error)) {
/* we are connected with TCP, awesome! */

@@ -1002,10 +994,20 @@ CURLcode Curl_is_connected(struct Curl_easy *data,
conn->tempsock[other] == CURL_SOCKET_BAD)
/* the last attempt failed and no other sockets remain open */
result = status;
}
}
+
+ if(!rc || error) {
+ /* should we try another protocol family? */
+ if(i == 0 && !conn->bits.parallel_connect &&
+ (error || (Curl_timediff(now, conn->connecttime) >=
+ data->set.happy_eyeballs_timeout))) {
+ conn->bits.parallel_connect = TRUE; /* starting now */
+ trynextip(data, conn, sockindex, 1);
+ }
+ }
}

if(result &&
(conn->tempsock[0] == CURL_SOCKET_BAD) &&
(conn->tempsock[1] == CURL_SOCKET_BAD)) {
--
/ daniel.haxx.se
-------------------------------------------------------------------
Unsubscribe: https://cool.haxx.se/list/listinfo/curl-library
Etiquette: https://curl.se/mail/etiquette
Loading...