Dmitry Karpov via curl-library
2021-06-03 21:01:46 UTC
Hello,
While testing dual-stack connections, I noticed that when a host has multiple IPv6 addresses but IPv6 is blocked somewhere along the path (e.g. by a firewall in the router, a VPN, etc.), it takes too long to establish a connection via the available IPv4 addresses.
I debugged this issue and found that it happens because libcurl does not take socket select() errors into account and tries all addresses of the failing family before starting the IPv4 family after the Happy Eyeballs timeout. This makes the dual-stack connection phase several times longer than when IPv4 resolution is forced.
The problem can be mitigated if libcurl detects socket select() errors that hint that IPv6 is unavailable when trying the first IPv6 address, and then switches immediately to the IPv4 family, skipping the remaining addresses of the failing IPv6 family.
I tried this approach with the following change in the Curl_is_connected() function (lib/connect.c, line 851):
diff --git a/lib/connect.c b/lib/connect.c
index d9317f378..9ab4d4d50 100644
--- a/lib/connect.c
+++ b/lib/connect.c
@@ -993,11 +993,25 @@ CURLcode Curl_is_connected(struct Curl_easy *data,
ipaddress, conn->port,
Curl_strerror(error, buffer, sizeof(buffer)));
#endif
-
- conn->timeoutms_per_addr[i] = conn->tempaddr[i]->ai_next == NULL ?
- allow : allow / 2;
- ainext(conn, i, TRUE);
- status = trynextip(data, conn, sockindex, i);
+ /* If we have a socket error on select for starting family then it
+ * most likely means that the starting family is not available, and we
+ * can skip the rest of the failing family addresses and try the next
+ * family to avoid waiting for too long.*/
+ if(i == 0 && rc & CURL_CSELECT_ERR) {
+ /* close the current family socket, if open */
+ if (conn->tempsock[i] != CURL_SOCKET_BAD) {
+ Curl_closesocket(conn, conn->tempsock[i]);
+ conn->tempsock[i] = CURL_SOCKET_BAD;
+ }
+ error = 0;
+ status = trynextip(data, conn, sockindex, 1);
+ } else {
+ conn->timeoutms_per_addr[i] = conn->tempaddr[i]->ai_next == NULL ?
+ allow : allow / 2;
+ ainext(conn, i, TRUE);
+ status = trynextip(data, conn, sockindex, i);
+ }
+
if((status != CURLE_COULDNT_CONNECT) ||
conn->tempsock[other] == CURL_SOCKET_BAD)
/* the last attempt failed and no other sockets remain open */
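The branch added by the patch can also be read in isolation as a small decision function. The following is a minimal sketch under stated assumptions: SELECT_ERR, enum next_step and after_failed_attempt() are hypothetical names that do not exist in libcurl, and index 0 is assumed to be the starting family, as in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for libcurl's CURL_CSELECT_ERR bit. */
#define SELECT_ERR 0x04

/* What to do after a connect attempt on one address has failed. */
enum next_step {
  TRY_NEXT_ADDRESS_SAME_FAMILY, /* existing behavior */
  SWITCH_TO_OTHER_FAMILY        /* shortcut proposed by the patch */
};

/* family_index: 0 for the starting family, 1 for the other one.
 * select_flags: bitmask reported by the socket readiness check. */
static enum next_step after_failed_attempt(int family_index, int select_flags)
{
  assert(family_index == 0 || family_index == 1);

  /* Only an error on the starting family triggers the shortcut; every other
   * failure keeps walking the address list of the same family. */
  if(family_index == 0 && (select_flags & SELECT_ERR))
    return SWITCH_TO_OTHER_FAMILY;
  return TRY_NEXT_ADDRESS_SAME_FAMILY;
}
```

With this reading, after_failed_attempt(0, SELECT_ERR) selects the family switch, while the same error on index 1 falls back to trying the next address of that family.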
I am not sure whether this is the right fix, but it dramatically improved the connection phase time in such cases, bringing it close to pure IPv4 connection times.
My tests also did not reveal any bad side effects from this change.
Thanks,
Dmitry Karpov