[e2e] Can we revive T/TCP ? => persistent connections
David Andersen
dga+ at cs.cmu.edu
Mon Dec 26 13:10:41 PST 2005
On Dec 26, 2005, at 3:00 PM, Michael Welzl wrote:
>>
>> It doesn't. Most links are clicked within the same site, and most
>> servers and browsers support persistent connections. The connection
>> is only torn down after an idle period or some maximum number of
>> requests.
>
> In practice, this doesn't seem to be the case. In all the tests my
> students did (not a thorough measurement study, just some
> experiments), the server closed the connection after sending a page.
>
> I think this is due to the (quasi-)stateless operation that an HTTP
> server can achieve this way - I mean, it's much more difficult to keep
> connections open for a longer period, and close them only after a
> timer expired, count the number of connections that should be cached,
> etc. etc. ... if poorly implemented, this might also not scale so
> well.
Could you elaborate on how you did those tests? A quick, highly
scientific check showed:
www.yahoo.com: no connection caching
Google: caching
microsoft.com: caching
www.cmu.edu: caching
The connection timeouts on some of those are fairly short; some are
long enough for subsequent clicks, many are only long enough to fetch
embedded objects (a few seconds).
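
For anyone who wants to repeat that kind of check, here is a minimal
sketch in Python. It is not the method behind the results above (that
isn't described here); it just sends several HEAD requests over a single
TCP connection and counts how many are answered before the server hangs
up. The function names, the choice of HEAD, and the defaults are all
assumptions made for illustration.

```python
import socket


def read_headers(sock):
    """Read one HTTP header block; return b"" if the peer closed first."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            return b""
        data += chunk
    return data


def count_answered_requests(host, port=80, attempts=3, timeout=5.0):
    """Send HEAD requests on ONE connection; return how many got a reply."""
    # HEAD responses carry no body, so the header block is the whole reply.
    request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
    answered = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for _ in range(attempts):
            try:
                sock.sendall(request)
                if not read_headers(sock):
                    break  # server closed the connection
            except OSError:
                break  # reset, broken pipe, or timeout
            answered += 1
    return answered
```

A result of 1 suggests the server closes after each response; a result
equal to attempts suggests it keeps the connection open at least
briefly. It says nothing about how long the idle timeout is.
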
>
>
>> (I'm sure there are scenarios where it will, of course.) In the Grid
>> context, if you're talking about a not-huge set of trusted nodes,
>> they can cache those TCP connections for quite a long time.
>
> But they don't - neither in the smaller nor in the larger Grids that I
> know of; I think it's because the notion of a "connection" is lost
> in the (vertical) communication across layers.
I'd suggest that that's not the fault of T/TCP, but the fault of the
upper layers in the architecture...
>
> Grid Services are usually implemented on top of SOAP, which is
> stateless. How should SOAP tell HTTP to maintain a connection
> when it can't know whether a Grid Service will be called again? The
> decision to do so is up to the programmer, who however can't provide
> the remote SOAP instance with the necessary information because
> the notion of a "session" isn't part of SOAP.
>
> Could connections be cached in a transparent manner in such a
> scenario (e.g. by tweaking something at the HTTP level, but not
> above)? I think so, but I'm not 100% sure. Also, if it's possible,
> why isn't it done? In a Grid, this would surely make sense.
Yes. Some SOAP and XMLRPC libraries do this. See, e.g.,
http://www.gnuenterprise.org/tools/common/docs/api/public/
"gnue.common.rpc.drivers.xmlrpc.ClientAdapter.ClientAdapter:
Implements an XML-RPC client adapter using persistent HTTP
connections as transport."
>
>
>> An interesting example of this is the 'rex' system by Kaminsky and
>> Mazieres. It's a remote execution tool much like ssh, but more
>> flexible. It supports connection caching under the hood, so you
>> don't have to pay the setup time if you're using remote command
>> execution. It's worth noting that the major delay they're avoiding
>> in the local area is the public key crypto processing time, but in
>> the wide-area, both can add significantly to the total delay.
>
> Thanks a lot for the pointer!
> By "under the hood", you don't mean it's transparent to upper
> layers, do you? How could it... I mean, if a web server decides
> to close a connection, there's nothing any system underneath it
> could do about it, I guess.
"under the hood" -- underneath what the upper layers see. The web
server can close the connection, and the client
{library,binary,whatever} can open it up again without having to let
the user's program running on top of it know what's going on.
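
To make that concrete, here is a rough sketch in Python of a client
wrapper that caches one connection and quietly reopens it if the server
has hung up. This is not how rex or any particular library does it; the
class name, the retry-once policy, and the reconnects counter are
assumptions for illustration. It only retries GET, which is safe to
repeat.

```python
import http.client


class ReconnectingClient:
    """Caches one HTTP connection and reopens it when the server drops it."""

    def __init__(self, host, port=80, timeout=5.0):
        self._conn = http.client.HTTPConnection(host, port, timeout=timeout)
        self.reconnects = 0  # how often we reopened behind the caller's back

    def get(self, path):
        """Return (status, body); retry once on a dropped connection."""
        for attempt in range(2):
            try:
                self._conn.request("GET", path)
                resp = self._conn.getresponse()
                return resp.status, resp.read()
            except ConnectionError:
                # The cached connection was closed by the peer. Drop it;
                # HTTPConnection reconnects on the next request().
                self._conn.close()
                if attempt == 1:
                    raise
                self.reconnects += 1

    def close(self):
        self._conn.close()
```

The caller only ever sees get() returning a status and a body; whether
the bytes went over a cached connection or a fresh one is hidden.
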
>
> I heard the term "connection caching" before, and followed it, which
> led to a few papers on the subject and problems with this type of
> caching, but no standards. It doesn't seem to be an easy issue, but
> it looks like it's solvable. If I'm right and common web servers don't
> implement this (one could of course carry out a larger measurement
> study for this... perhaps it has already been done), wouldn't an
> Informational RFC which provides an overview of connection caching
> methods and suggests an implementation do the trick?
I believe you're mistaken. Most web servers support it. It's part
of the HTTP 1.1 spec, and has been around literally for years.
>
> I'd be thankful for some pointers to the key papers about connection
> caching - e.g., where was it introduced?
Proposed: SIGCOMM 1995, Mogul, "The Case for Persistent-Connection
HTTP". Dig around in some of his other papers, you'll get a good
feel for what's going on.
HTTP/1.1 spec: persistent connections are the default.
HTTP/1.0 hack: the
Connection: keep-alive
header.
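
As a small illustration of those two rules, here is a sketch in Python
that decides from one message's version and Connection header whether
the sender wants the connection kept open. The function name and the
(major, minor) tuple for the version are assumptions; in practice both
ends have to agree, and this looks at one side only.

```python
def is_persistent(version, connection_header=None):
    """Return True if this message asks to keep the connection open.

    version is a (major, minor) tuple, e.g. (1, 1).
    connection_header is the raw Connection header value, or None.
    """
    # The Connection header is a comma-separated, case-insensitive list.
    tokens = {
        token.strip().lower()
        for token in (connection_header or "").split(",")
        if token.strip()
    }
    if version >= (1, 1):
        # HTTP/1.1: persistent unless "close" is present.
        return "close" not in tokens
    # HTTP/1.0: closed after the response unless keep-alive is requested.
    return "keep-alive" in tokens
```

For example, is_persistent((1, 0)) is False, while
is_persistent((1, 0), "keep-alive") and is_persistent((1, 1)) are both
True.
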
-d