Discussion:
Randomly no response from IIS, but *only* with cURL
Felix E. Klee
2012-02-06 17:21:58 UTC
When requesting data using POST from a certain server (IIS), then
randomly there is no response, not even after 40 seconds
(CURLOPT_TIMEOUT = 40, CURLOPT_CONNECTTIMEOUT = 40).

This *never* happened to me when doing the same request by using a web
browser (filling out a form and submitting it) or with Firefox's Poster
plugin. Except when using cURL, the server responds within a couple of
seconds at most.

Any idea what could be the source of the problem? Any debugging
suggestions?

The problem occurs with cURL versions:

* 7.21.6 (Ubuntu 11.10)

* 7.19.7 (Amazon LINUX AMI on EC2)

Example server response header:

HTTP/1.1 200 OK
Date: Mon, 06 Feb 2012 15:27:57 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 4.0.30319
Cache-Control: private, max-age=0
Content-Type: application/json; charset=utf-8
Content-Length: 110471

Log of a request without response (CURLOPT_VERBOSE = 1, anonymized):

Feb 06 17:45:36 Data type: 0
Feb 06 17:45:36 Data (64): Re-using existing connection! (#0) with
host example.com

Feb 06 17:45:36 Data type: 0
Feb 06 17:45:36 Data (61): Connected to example.com (1.2.3.4) port 80 (#0)

Feb 06 17:45:36 Data type: 2
Feb 06 17:45:36 Data (299): POST /XYZ.NET/Search.aspx/GetInfo HTTP/1.1
Host: example.com
Accept: */*
Cookie: ASP.NET_SessionId=fj42wfbhs5si4hlfcs5pc3ve;
SOMETHING=2712629952.50480.0000
Accept-Encoding: deflate,gzip
Content-Type: application/json; charset=utf-8
Content-Length: 123


Feb 06 17:45:36 Data type: 4
Feb 06 17:45:36 Data (269):
{"request":{"x":3,"y":4},"sessionKey":"o4or2764-0s6p-8923-nn98-p4p94p60s9p0"}
Feb 06 17:46:16 Data type: 0
Feb 06 17:46:16 Data (77): Operation timed out after 40042
milliseconds with 0 out of -1 bytes received

Feb 06 17:46:16 Data type: 0
Feb 06 17:46:16 Data (22): Closing connection #0
-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette: http://curl.haxx.se/mail/etiquette.html
Felix E. Klee
2012-02-07 11:33:41 UTC
By the way, if I repeat a failed request, then often or always the
server responds quickly. Other requests then also seem to work quickly.

Repeat means: *same* request headers and *same* request body (post
data), even same cURL multi handle.
-------------------------------------------------------------------
Felix E. Klee
2012-02-07 11:44:27 UTC
I read a little bit about other people having issues with IIS sometimes
taking minutes to respond:

http://forums.iis.net/p/1155755/1942669.aspx

Someone speculates about a possible cause of the problem:

FireFox apparently holds connections for 5 minutes and attempts to
reuse them during this time period. This may be longer than other
browsers and may explain why FireFox is the only one that displays
this issue.

Maybe cURL is also holding connections for a long time?

I now set "CURLOPT_FRESH_CONNECT" to 1, just to see if that makes the
problem go away. Could that be a solution? What would be the
disadvantage of setting "CURLOPT_FRESH_CONNECT" to 1?
-------------------------------------------------------------------
Felix E. Klee
2012-02-07 11:56:30 UTC
Post by Felix E. Klee
I now set "CURLOPT_FRESH_CONNECT" to 1
Setting "CURLOPT_FORBID_REUSE" to 1 should have the same effect, right?

(only that it causes connections to be closed right after the request is
finished)
-------------------------------------------------------------------
Daniel Stenberg
2012-02-07 12:11:09 UTC
Post by Felix E. Klee
Post by Felix E. Klee
I now set "CURLOPT_FRESH_CONNECT" to 1
Setting "CURLOPT_FORBID_REUSE" to 1 should have the same effect, right?
CURLOPT_FRESH_CONNECT makes you not re-use any existing connections from the
pool when you start a new request. It enforces a fresh connect to be done.

CURLOPT_FORBID_REUSE will prevent libcurl from keeping the connection alive
after this request for further re-use in subsequent requests.

Both options can seriously hamper your performance if you intend to do many
requests against the same server.
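
The difference between the two options described above can be sketched with a toy connection cache. This is only an illustrative model in Python, not libcurl's implementation: the class ToyPool, its method and its keyword arguments are hypothetical names that mirror the two option names.

```python
from collections import defaultdict
from itertools import count


class ToyPool:
    """Toy model of a per-host connection cache (not libcurl's real code)."""

    def __init__(self):
        self._idle = defaultdict(list)  # host -> ids of idle connections
        self._ids = count(1)            # source of new connection ids

    def request(self, host, fresh_connect=False, forbid_reuse=False):
        """Run one request and return (connection_id, was_reused)."""
        if self._idle[host] and not fresh_connect:
            # Default behaviour: pick up an idle connection if one exists.
            conn, reused = self._idle[host].pop(), True
        else:
            # fresh_connect ignores idle connections and opens a new one.
            conn, reused = next(self._ids), False
        if not forbid_reuse:
            # Without forbid_reuse the connection is kept for later requests.
            self._idle[host].append(conn)
        return conn, reused


pool = ToyPool()
first = pool.request("example.com")                      # opens a connection
second = pool.request("example.com")                     # reuses it
third = pool.request("example.com", fresh_connect=True)  # opens another one
fourth = pool.request("example.com", forbid_reuse=True)  # reuses, then closes
print(first, second, third, fourth)
```

In this model, fresh_connect decides what happens before the request (whether a cached connection may be picked up), while forbid_reuse decides what happens after it (whether the connection goes back into the cache).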
--
/ daniel.haxx.se
-------------------------------------------------------------------
Felix E. Klee
2012-02-07 12:37:15 UTC
Post by Daniel Stenberg
Both options can seriously hamper your performance if you intend to do
many requests against the same server.
But what can I do? Seems like cURL tries to reuse connections that IIS
has already closed.

Result of first tests: The options do have the desired effect - no more
timeouts so far.
-------------------------------------------------------------------
Daniel Stenberg
2012-02-07 12:42:48 UTC
But what can I do? Seems like cURL tries to reuse connections that IIS has
already closed.
What makes you believe that? You have not provided any logs or hints that this
would be the case.

libcurl has been re-using connections since forever and it generally does not
create problems. I strongly suspect that your problem here is due to the
server end or because of some stupid equipment between you and the server.
Result of first tests: The options do have the desired effect - no more
timeouts so far.
That just makes me even more convinced of where the problem lies...
--
/ daniel.haxx.se
-------------------------------------------------------------------
Felix E. Klee
2012-02-07 13:41:39 UTC
Post by Daniel Stenberg
Seems like cURL tries to reuse connections that IIS has already
closed.
What makes you believe that?
So far it's just an assumption based on what I've read in the forum
thread I posted before. There people are in charge of the servers, and
at least one solved the problem by increasing the connection timeout
server-side. I am in the opposite position, i.e. in charge of the client
which uses cURL. Setting CURLOPT_FRESH_CONNECT / CURLOPT_FORBID_REUSE to
1 appears to solve the problem, at least according to first tests.
Post by Daniel Stenberg
You have not provided any logs or hints that this would be the case.
What tool do you suggest for creating a log, and what should I look out
for?

I provided a cURL log in my initial post, though that shows only that
there is no server response coming in.
Post by Daniel Stenberg
libcurl has been re-using connections since forever and it generally
does not create problems.
Well, searching for IIS and CURLOPT_FRESH_CONNECT shows that other
people use that option as well when connecting with cURL to IIS. For
example someone writes:

don't know if CURLOPT_FRESH_CONNECT will help - helps IIS with cached
connections.
Post by Daniel Stenberg
I strongly suspect that your problem here is due to the server end or
because of some stupid equipment between you and the server.
It happens in two entirely different setups: on my local development
system (Ubuntu 11.10) and on an Amazon EC2 server. See my initial post.
-------------------------------------------------------------------
Daniel Stenberg
2012-02-08 08:17:02 UTC
There people are in charge of the servers, and at least one solved the
problem by increasing the connection timeout server-side.
That's not necessarily "solving" the problem, that is working around the
symptom unless you back it up with detailed analysis of what actually
happened in the first place and why it doesn't happen after the change.
I am in the opposite position, i.e. in charge of the client which uses cURL.
Setting CURLOPT_FRESH_CONNECT / CURLOPT_FORBID_REUSE to 1 appears to solve
the problem, at least according to first tests.
It works around the issue, yes, and that might be fine and all but it doesn't
help us to pinpoint exactly where the problem actually lies.
Post by Daniel Stenberg
You have not provided any logs or hints that this would be the case.
What tool do you suggest for creating a log, and what should I look out for?
I would propose wireshark and that you monitor the TCP connection curl sends
its request on. See what happens to it once curl has sent the request and
while curl is waiting for the response.
I provided a cURL log in my initial post, though that shows only that there
is no server response coming in.
Exactly. But that's just the symptom of the problem from curl's view. It's
not possible to deduce where this problem actually is based on the lack of the
HTTP response received by curl.

curl uses a TCP connection over which it never sees a response. If there is no
response ever coming back then curl is right and something else is to blame.
If the TCP connection has signalled something that curl should've detected
then there's a curl issue. If there's no signal over TCP and just silence,
then the blame is away from the client and moved over the network and quite
probably to the server if you see this problem from different networks against
the same server.
Well, searching for IIS and CURLOPT_FRESH_CONNECT shows that other people
use that option as well when connecting with cURL to IIS. For example
People posting stuff on internet forums don't quite count as evidence, it
would have to be backed up with details.
--
/ daniel.haxx.se
-------------------------------------------------------------------
Felix E. Klee
2012-02-08 12:57:00 UTC
Post by Daniel Stenberg
I would propose wireshark and that you monitor the TCP connection
curl sends its request on. See what happens to it once curl has sent
the request and while curl is waiting for the response.
Am running wireshark now and have disabled CURLOPT_FORBID_REUSE /
CURLOPT_FRESH_CONNECT.

Just now a hang occurred. The last TCP packet received had flags set to
0x14 (RST, ACK). Is the presence of the RST-flag already an indication
of what may be the problem?
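
For reference, the flags value shown by Wireshark is a bit mask taken from the TCP header. A minimal Python sketch that decodes such a value is given below; the helper name decode_tcp_flags is hypothetical, and the bit assignments are the standard TCP header flag bits.

```python
# Standard TCP header flag bits, lowest bit first.
TCP_FLAG_BITS = [
    (0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"), (0x08, "PSH"),
    (0x10, "ACK"), (0x20, "URG"), (0x40, "ECE"), (0x80, "CWR"),
]


def decode_tcp_flags(value):
    """Return the names of all flags that are set in a TCP flags byte."""
    return [name for bit, name in TCP_FLAG_BITS if value & bit]


print(decode_tcp_flags(0x14))  # ['RST', 'ACK']
print(decode_tcp_flags(0x11))  # ['FIN', 'ACK']
```

The decoding only names the flags. On its own it does not say which side of the connection caused the hang.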
-------------------------------------------------------------------
Felix E. Klee
2012-02-08 14:00:39 UTC
Attached you find a text file which is showing the entire TCP
communication with the server in question.

Wireshark display filter:

ip.addr == 201.77.194.91 && tcp.port == 80

What can be seen here:

1. A POST request is issued.

2. No communication for 40s. (= CURLOPT_TIMEOUT, CURLOPT_CONNECTTIMEOUT)

3. There is brief communication. I assume that cURL tells the server to
give up (FIN, ACK), and the server responds quickly (ACK).

4. No communication for 159s.

5. The server responds with RST, ACK.

So far I saw the problem only twice today, and both times it was the
same pattern:

1. query: no problem

2. long pause of 30 to 60 min

3. another query: timeout hit

Background information: This is web-scraping that I am implementing for
a customer. While my customer is in contact with the airline company
that runs the servers, I doubt that they will change anything especially
for us. According to my customer, they do not even provide an API (which
would make things much more straightforward). So, there is no choice
other than to work around the problem client-side.
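
One possible shape for such a client-side workaround, based on the observation earlier in the thread that repeating a failed request usually succeeds, is to let only the first attempt reuse a cached connection and to retry on a fresh one after a timeout. The Python sketch below is an assumption-laden illustration: request_with_fresh_retry and the send callable are hypothetical, and send stands in for whatever code performs the actual request (for example with CURLOPT_FRESH_CONNECT set when fresh is true).

```python
def request_with_fresh_retry(send, max_attempts=2):
    """Call send(fresh=False) first; after a timeout, retry with fresh=True.

    `send` is a hypothetical callable supplied by the caller. It performs the
    request and raises TimeoutError when no response arrives in time.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            # Only the first attempt may reuse a cached connection.
            return send(fresh=(attempt > 0))
        except TimeoutError as exc:
            last_error = exc
    raise last_error


# Stand-in for the real request: times out on a reused connection only.
calls = []


def fake_send(fresh):
    calls.append(fresh)
    if not fresh:
        raise TimeoutError("no response on reused connection")
    return "200 OK"


result = request_with_fresh_retry(fake_send)
print(result, calls)
```

Compared with forcing a fresh connection on every request, this keeps connection reuse for the common case. It only makes sense if repeating the POST is harmless for the server, and a shorter timeout for the first attempt would keep the added delay small.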
Daniel Stenberg
2012-02-10 12:18:35 UTC
Post by Felix E. Klee
1. A POST request is issued.
2. No communication for 40s. (= CURLOPT_TIMEOUT, CURLOPT_CONNECTTIMEOUT)
3. There is brief communication. I assume that cURL tells the server to
give up (FIN, ACK), and the server responds quickly (ACK).
This is when libcurl closes the connection, as you also saw that libcurl
legitimately timed out without the server sending anything.

Is the POST data acked by the server?
Post by Felix E. Klee
4. No communication for 159s.
5. The server responds with RST, ACK.
I haven't read the spec lately but it seems odd that it would keep the
connection alive for 160 seconds after it acked the FIN and then send a RST on
it. But maybe I'm just not remembering my TCP details well enough.
Post by Felix E. Klee
So far I saw the problem only twice today, and both times it was the same pattern:
1. query: no problem
2. long pause of 30 to 60 min
3. another query: timeout hit
So how long a pause do you need to trigger the problem? I would guess that
you see this problem because something silently kills/discards the connection
after N minutes, so that if you're just faster than N it'll run fine. You can
possibly overcome this problem by using TCP keep-alive on the connection to
stop the whatever-it-is-that-kills-the-connection from doing it.

That "whatever" can of course be a firewall or load balancer or similar on the
server side.
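
As an illustration of the TCP keep-alive idea, the Python sketch below enables keep-alive probes on a plain socket. It is not libcurl code: the helper name enable_tcp_keepalive and the timing values are assumptions, and the three tuning options are only set where the platform provides them. libcurl releases from 7.25.0 onward, newer than the versions named at the start of this thread, expose the same mechanism through CURLOPT_TCP_KEEPALIVE, CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=30, probes=4):
    """Turn on TCP keep-alive probes for an idle connection.

    The timing values are illustrative assumptions, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-specific tuning options; skipped where they do not exist.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # failed probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_tcp_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print(keepalive_on)
sock.close()
```

Whether this helps depends on the unknown N above: the probes must arrive more often than the device in between discards idle connections, and that device must treat them as activity.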
Post by Felix E. Klee
Background information: This is web-scraping that I am implementing for a
customer. While my customer is in contact with the airline company that runs
the servers, I doubt that they will change anything especially for us.
According to my customer, they do not even provide an API (which would make
things much more straightforward). So, there is no choice other than to
work around the problem client-side.
I'm not saying otherwise - the real world is a rough place and not at all kind
to us who just want to adhere to protocol specs and be fine with it.

I'm just saying that this isn't a problem in libcurl itself, or at least we
haven't seen any evidence of a bug or flaw in libcurl that causes this.
--
/ daniel.haxx.se
-------------------------------------------------------------------
Felix E. Klee
2012-03-21 18:01:21 UTC
Hi Daniel,

to sum up: Thank you very much for your advice! Unfortunately, I didn't
find the resources to do any further investigation concerning the issue.
If I find out something worthwhile, then I am more than happy to report
it to this list.
Post by Daniel Stenberg
Is the POST data acked by the server?
If I correctly interpret the log that I posted earlier, then: yes
Post by Daniel Stenberg
So how long pause do you need to make the problem trigger?
Don't know yet. That requires further investigation, which I plan to do
once my client gives me a go.

- Felix
-------------------------------------------------------------------