[e2e] Are we doing sliding window in the Internet?
Venkata Pingali
pingali at ISI.EDU
Tue Jan 2 10:31:29 PST 2007
A few months back we collected some
per-connection data in both client
and server modes. We thought you might
be interested in the preliminary results.
We collected data in two modes/configurations.
In the client mode we configured Apache to
be a web proxy and in the server mode we
configured Apache to serve an actual website.
The basic results, which should be treated
as indicative rather than conclusive, are
as follows:
Server end (i.e., the end that has a large
amount of data to transfer):
- Most connections are short (90% < 1sec)
- MaxCwnd is < 5KB in > 80% of cases
- MaxRTT is distributed almost uniformly
in the 0-400ms range.
Client end (i.e., the end receiving data):
- ~ 90% of connections see MaxCwnd < 5KB
- < 1% connections see MaxCwnd > 10KB
- 90% of connections have MaxRTT < 100ms
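To give a feel for why the window stays that small, here is a rough
slow-start sketch. It is not derived from the traces; it simply assumes
a 1460-byte MSS, an initial window of two segments, and no loss, and
counts how far the window gets before a small transfer is done:

MSS = 1460          # assumed segment size in bytes
INIT_CWND = 2       # assumed initial window, in segments

def slow_start(nbytes, mss=MSS, init_cwnd=INIT_CWND):
    """RTTs needed and largest cwnd used while moving nbytes
    under idealized slow start (no loss, window doubles per RTT)."""
    segments_left = -(-nbytes // mss)   # ceiling division
    cwnd, rtts = init_cwnd, 0
    while segments_left > 0:
        rtts += 1
        segments_left -= cwnd
        cwnd *= 2
    return rtts, (cwnd // 2) * mss      # cwnd used in the last RTT

for size in (4000, 16000, 64000):       # typical small web objects
    rtts, max_cwnd = slow_start(size)
    print("%6d bytes: %d RTTs, cwnd grows to ~%d bytes" % (size, rtts, max_cwnd))

A 4 KB object is gone in a couple of round trips and the window never
needs to open much beyond the initial few segments, which is at least
consistent with the duration and MaxCwnd numbers above.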
There are some problems with the data:
- limited scenarios (web-based only)
- small sample sizes (21K connections for
the server, 150K for the client)
- the website has a non-standard distribution
of file types and sizes
You can find the various graphs here:
http://www.isi.edu/aln/e2e.ppt
Venkata Pingali
http://www.isi.edu/aln
Fred Baker wrote:
> yes and no.
>
> A large percentage of sessions are very short - count the bytes in this
> email and consider how many TCP segments are required to carry it, for
> example, or look through your web cache to see the sizes of objects it
> stores. We are doing the sliding window algorithm, but it cuts very
> short when the TCP session abruptly closes.
>
> For longer exchanges - p2p and many others - yes, we indeed do sliding
> window.
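To put a rough number on Fred's example: a typical email body is on the
order of a few kilobytes, i.e. a handful of 1460-byte segments; with a
common initial window of two segments it is all in flight within the
first round trip or two, so the window never really gets to slide.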
>
> I don't see any reason to believe that TCPs tune themselves to have
> exactly bandwidth*RTT/MSS segments outstanding. That would be the
> optimal number to have outstanding, but generally they will have the
> smallest of { the offered window, the sender's maximum window, and the
> window at which the path starts dropping traffic }. If they never see
> loss, they can keep an incredibly large amount of data outstanding
> regardless of the values of RTT and MSS.
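A tiny sketch of that min{} rule (the names and numbers here are
illustrative, not measured):

def effective_window(rwnd, sndbuf, loss_window):
    """The window TCP actually runs at: the smallest of the receiver's
    offered window, the sender's maximum/buffer-limited window, and the
    amount of outstanding data at which the path starts dropping."""
    return min(rwnd, sndbuf, loss_window)

# Example: 64 KB offered window, generous sender buffer, but a path
# that starts dropping around 16 KB of outstanding data.
w = effective_window(rwnd=65535, sndbuf=262144, loss_window=16384)
rtt = 0.095                  # ~95 ms, roughly the MIT path below
print("window %d bytes -> about %.1f Mbit/s" % (w, w * 8 / rtt / 1e6))

If loss never enters the picture, the third term drops out and the
window is simply whatever the two endpoints are configured to allow,
which is Fred's point about arbitrarily large amounts of outstanding
data.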
>
> I wonder where you got the notion that a typical session had a 10 ms
> RTT. In a LAN environment where the servers are in the same building,
> that is probably the case. But consider these rather more typical
> examples: across my VPN to a machine at work, across the US to MIT, and
> across the Atlantic to you:
>
> [stealth-10-32-244-218:~] fred% traceroute irp-view7
> traceroute to irp-view7.cisco.com (171.70.65.144), 64 hops max, 40 byte
> packets
> 1 fred-vpn (10.32.244.217) 1.486 ms 1.047 ms 1.034 ms
> 2 n003-000-000-000.static.ge.com (3.7.12.1) 22.360 ms 20.962 ms
> 22.194 ms
> 3 10.34.251.137 (10.34.251.137) 23.559 ms 22.586 ms 22.236 ms
> 4 sjc20-a5-gw2 (10.34.250.78) 21.465 ms 22.544 ms 20.748 ms
> 5 sjc20-sbb5-gw1 (128.107.180.105) 22.294 ms 22.351 ms 22.803 ms
> 6 sjc20-rbb-gw5 (128.107.180.22) 21.583 ms 22.517 ms 24.190 ms
> 7 sjc12-rbb-gw4 (128.107.180.2) 22.115 ms 23.143 ms 21.478 ms
> 8 sjc5-sbb4-gw1 (171.71.241.253) 26.550 ms 23.122 ms 21.569 ms
> 9 sjc12-dc5-gw2 (171.71.241.66) 22.115 ms 22.435 ms 22.185 ms
> 10 sjc5-dc3-gw2 (171.71.243.46) 22.031 ms 21.846 ms 22.185 ms
> 11 irp-view7 (171.70.65.144) 22.760 ms 22.912 ms 21.941 ms
>
> [stealth-10-32-244-218:~] fred% traceroute www.mit.edu
> traceroute to www.mit.edu (18.7.22.83), 64 hops max, 40 byte packets
> 1 fred-vpn (10.32.244.217) 1.468 ms 1.108 ms 1.083 ms
> 2 172.16.16.1 (172.16.16.1) 11.994 ms 10.351 ms 10.858 ms
> 3 cbshost-68-111-47-251.sbcox.net (68.111.47.251) 9.238 ms 19.517 ms
> 9.857 ms
> 4 12.125.98.101 (12.125.98.101) 11.849 ms 11.913 ms 12.086 ms
> 5 gbr1-p100.la2ca.ip.att.net (12.123.28.130) 12.348 ms 11.736 ms
> 12.891 ms
> 6 tbr2-p013502.la2ca.ip.att.net (12.122.11.145) 15.071 ms 13.462 ms
> 13.453 ms
> 7 12.127.3.221 (12.127.3.221) 12.643 ms 13.761 ms 14.345 ms
> 8 br1-a3110s9.attga.ip.att.net (192.205.33.230) 13.842 ms 12.414 ms
> 12.647 ms
> 9 ae-32-54.ebr2.losangeles1.level3.net (4.68.102.126) 16.651 ms
> ae-32-56.ebr2.losangeles1.level3.net (4.68.102.190) 20.154 ms *
> 10 * * *
> 11 ae-2.ebr1.sanjose1.level3.net (4.69.132.9) 28.222 ms 24.319 ms
> ae-1-100.ebr2.sanjose1.level3.net (4.69.132.2) 35.417 ms
> 12 ae-1-100.ebr2.sanjose1.level3.net (4.69.132.2) 25.640 ms 22.567 ms *
> 13 ae-3.ebr1.denver1.level3.net (4.69.132.58) 52.275 ms 60.821 ms
> 54.384 ms
> 14 ae-3.ebr1.chicago1.level3.net (4.69.132.62) 68.285 ms
> ae-1-100.ebr2.denver1.level3.net (4.69.132.38) 59.113 ms 68.779 ms
> 15 * * *
> 16 * ae-7-7.car1.boston1.level3.net (4.69.132.241) 94.977 ms *
> 17 ae-7-7.car1.boston1.level3.net (4.69.132.241) 95.821 ms
> ae-11-11.car2.boston1.level3.net (4.69.132.246) 93.856 ms
> ae-7-7.car1.boston1.level3.net (4.69.132.241) 96.735 ms
> 18 ae-11-11.car2.boston1.level3.net (4.69.132.246) 91.093 ms 92.125
> ms 4.79.2.2 (4.79.2.2) 95.802 ms
> 19 4.79.2.2 (4.79.2.2) 93.945 ms 95.336 ms 97.301 ms
> 20 w92-rtr-1-backbone.mit.edu (18.168.0.25) 98.246 ms www.mit.edu
> (18.7.22.83) 93.657 ms w92-rtr-1-backbone.mit.edu (18.168.0.25) 92.610 ms
>
> [stealth-10-32-244-218:~] fred% traceroute web.de
> traceroute to web.de (217.72.195.42), 64 hops max, 40 byte packets
> 1 fred-vpn (10.32.244.217) 1.482 ms 1.078 ms 1.093 ms
> 2 172.16.16.1 (172.16.16.1) 12.131 ms 9.318 ms 8.140 ms
> 3 cbshost-68-111-47-251.sbcox.net (68.111.47.251) 10.790 ms 9.051 ms
> 10.564 ms
> 4 12.125.98.101 (12.125.98.101) 13.580 ms 21.643 ms 12.206 ms
> 5 gbr2-p100.la2ca.ip.att.net (12.123.28.134) 12.446 ms 12.914 ms
> 12.006 ms
> 6 tbr2-p013602.la2ca.ip.att.net (12.122.11.149) 13.463 ms 12.711 ms
> 12.187 ms
> 7 12.127.3.213 (12.127.3.213) 185.324 ms 11.845 ms 12.189 ms
> 8 192.205.33.226 (192.205.33.226) 12.008 ms 11.665 ms 25.390 ms
> 9 ae-1-53.bbr1.losangeles1.level3.net (4.68.102.65) 13.695 ms
> ae-1-51.bbr1.losangeles1.level3.net (4.68.102.1) 11.645 ms
> ae-1-53.bbr1.losangeles1.level3.net (4.68.102.65) 12.517 ms
> 10 ae-1-0.bbr1.frankfurt1.level3.net (212.187.128.30) 171.886 ms
> as-2-0.bbr2.frankfurt1.level3.net (4.68.128.169) 167.640 ms 168.895 ms
> 11 ge-10-0.ipcolo1.frankfurt1.level3.net (4.68.118.9) 170.336 ms
> ge-11-1.ipcolo1.frankfurt1.level3.net (4.68.118.105) 174.211 ms
> ge-10-1.ipcolo1.frankfurt1.level3.net (4.68.118.73) 169.730 ms
> 12 gw-megaspace.frankfurt.eu.level3.net (212.162.44.158) 169.276 ms
> 170.110 ms 168.099 ms
> 13 te-2-3.gw-backbone-d.bs.ka.schlund.net (212.227.120.17) 171.412 ms
> 171.820 ms 170.265 ms
> 14 a0kac2.gw-distwe-a.bs.ka.schlund.net (212.227.121.218) 175.416 ms
> 173.653 ms 174.007 ms
> 15 ha-42.web.de (217.72.195.42) 174.908 ms 174.921 ms 175.821 ms
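Plugging those round-trip times into the one-segment-per-RTT arithmetic
from Detlef's note below gives a feel for how little a stop-and-wait-like
connection can carry on these paths. The RTTs are read off the
traceroutes above, and a 1460-byte MSS is assumed:

MSS_BITS = 1460 * 8               # bits per segment, assuming a 1460-byte MSS
for name, rtt in (("irp-view7", 0.022), ("www.mit.edu", 0.095), ("web.de", 0.175)):
    # throughput of a window of exactly one segment is MSS / RTT
    print("%-12s RTT %3.0f ms -> %.2f Mbit/s with a 1-MSS window"
          % (name, rtt * 1000, MSS_BITS / rtt / 1e6))

At these more typical RTTs a single-segment window delivers well under a
megabit per second, a long way below the 1.2 Mbit/s of the 10 ms example
quoted below.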
>
>
> On Dec 31, 2006, at 11:15 AM, Detlef Bosau wrote:
>
>> Happy New Year, Miss Sophy My Dear!
>>
>> (Although this sketch is in English, it is hardly known outside
>> Germany, to my knowledge.)
>>
>> I wonder whether we're really doing sliding window in TCP connections
>> all the time, or whether a number of connections have congestion
>> windows of only one segment, i.e. behave like stop-and-wait in reality.
>>
>> If I assume an Ethernet-like MTU, i.e. 1500 bytes = 12000 bits, and a
>> 10 ms RTT, then a window of one segment already yields roughly
>> 12000 bits / 10 ms = 1.2 Mbps.
>>
>> From this I would expect that in quite a few cases a TCP connection
>> will have a congestion window of 1 MSS or even less.
>>
>> In addition, some weeks ago I read a paper, I don't remember where,
>> arguing that we should reconsider and perhaps resize our MTUs to
>> larger values for networks with large bandwidth. The rationale was
>> simply as follows: the MTU size is always a tradeoff between overhead
>> and jitter. From Ethernet we know that we can accept a maximum packet
>> duration of 12000 bits / (10 Mbps) = 1.2 ms and the resulting jitter.
>> For Gigabit Ethernet, the same maximum packet duration of 1.2 ms
>> would allow an MTU of roughly 150 kbytes.
>>
>> If so, we would see "stop-and-wait-like" connections much more
>> frequently than today.
>>
>> Is this view correct?
>>
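For what it's worth, the MTU scaling above is just arithmetic: keep
Ethernet's 1.2 ms serialization-time budget and ask how many bits fit
into it at gigabit speed:

classic_mtu_bits = 1500 * 8            # 10 Mbit/s Ethernet MTU in bits
classic_bw = 10e6                      # 10 Mbit/s
budget = classic_mtu_bits / classic_bw # 1.2e-3 s, the 1.2 ms above

gige_bw = 1e9                          # 1 Gbit/s
scaled_mtu_bytes = gige_bw * budget / 8
print("1.2 ms at 1 Gbit/s = %.0f bytes, i.e. about 150 kbytes" % scaled_mtu_bytes)

So holding the per-packet time budget constant at gigabit speed implies
an MTU roughly a hundred times today's 1500 bytes.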