[e2e] Question about propagation and queuing delays
Fred Baker
fred at cisco.com
Wed Aug 24 02:36:47 PDT 2005
So I am sitting in a meeting room at APAN, which is meeting in
Taipei. I happen to be VPN'd into Cisco in San Jose, but I shut that
down to run a traceroute for your benefit.
The traceroute from here to Cisco is:
traceroute to irp-view7.cisco.com (171.70.65.144), 64 hops max, 40 byte packets
 1  ip-242-001 (140.109.242.1)  8.177 ms  10.311 ms  16.018 ms
 2  ae-0-10.br0.tpe.tw.rt.ascc.net (140.109.251.50)  2.096 ms  66.035 ms  49.755 ms
 3  s4-1-1-0.br0.pax.us.rt.ascc.net (140.109.251.105)  206.316 ms  162.307 ms  259.891 ms
 4  so-5-1.hsa4.sanjose1.level3.net (64.152.81.9)  130.915 ms  274.471 ms  304.699 ms
 5  so-2-1-0.bbr2.sanjose1.level3.net (4.68.114.157)  132.229 ms  176.587 ms  135.330 ms
 6  ge-11-0.ipcolo1.sanjose1.level3.net (4.68.123.41)  134.507 ms
    ge-11-2.ipcolo1.sanjose1.level3.net (4.68.123.169)  131.669 ms
    ge-11-0.ipcolo1.sanjose1.level3.net (4.68.123.41)  134.544 ms
 7  p1-0.cisco.bbnplanet.net (4.0.26.14)  130.734 ms  131.757 ms  140.291 ms
 8  sjck-dmzbb-gw1.cisco.com (128.107.239.9)  146.848 ms  132.394 ms  168.201 ms
...
I ran a ping (through the VPN) to a server inside Cisco. While I did
that, I downloaded a number of files. The resulting ping statistics
show the variation in delay:
225 packets transmitted, 222 packets received, 1% packet loss
round-trip min/avg/max/stddev = 132.565/571.710/2167.062/441.876 ms
The peak rate sftp reported was about 141.3 KB/s, and the lowest rate
was 34.2 KB/s. The difference most likely reflects some combination of
packet loss (1.3% loss is non-negligible), delay variation (a standard
deviation in ping RTT of 442 ms and a min-to-max spread of 2034 ms are
also non-negligible), the effects of slow-start and fast-retransmit
procedures, and the bandwidth left over while other users also made
use of the link.
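
For a rough sense of why that loss rate and RTT matter, the classic
Mathis et al. approximation, rate ~ (MSS/RTT) * (C/sqrt(p)), gives a
ballpark ceiling on TCP throughput. A minimal sketch in Python, with an
assumed 1460-byte MSS, using the ping numbers above (which are only a
rough proxy for what the sftp connection itself experienced):

    # Ballpark TCP throughput ceiling from the Mathis et al. approximation:
    #   rate ~= (MSS / RTT) * (C / sqrt(p)), C ~= 1.22
    # The 1460-byte MSS is an assumed value; the loss rate and RTTs come
    # from the ping output above, which only roughly reflects what the
    # sftp connection itself saw.
    from math import sqrt

    MSS = 1460          # bytes, assumed
    C = 1.22            # constant from the Mathis et al. model
    p = 0.013           # ~1.3% loss, from the ping statistics

    for label, rtt in [("min RTT", 0.1326), ("avg RTT", 0.5717)]:
        rate = (MSS / rtt) * (C / sqrt(p))
        print(f"{label}: ~{rate / 1024:.0f} KB/s")

    # Prints roughly 115 KB/s for the 132 ms RTT and 27 KB/s for the
    # 572 ms average RTT - the same ballpark as the 34-141 KB/s that
    # sftp reported.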
What this demonstrates is the variation in delay that arises around
bottlenecks in the Internet, and why the folks who worry about TCP/SCTP
congestion management procedures are not playing with recreational
pharmaceuticals. I won't speculate on where this bottleneck is beyond
saying I'll bet it's in one of the first few hops of that traceroute
- the access path.
On Aug 23, 2005, at 5:50 AM, Fred Baker wrote:
> no, but there are different realities, and how one measures them is
> also relevant.
>
> In large fiber backbones, we generally run 10:1 overprovisioned or
> more. Within those backbones, as you note, the discussion is moot.
> But not all traffic stays within the cores of large fiber
> backbones - much of it originates and terminates in end systems
> located in homes and offices.
>
> The networks that connect homes and offices to the backbones are
> often constrained differently. For example, my home (in an affluent
> community in California) is connected by cable modem, and the
> service that I buy (a business service whose AUP, unlike the same
> company's residential service, permits a VPN) guarantees a certain
> amount of bandwidth and constrains me to that bandwidth - measured
> in kbit/s. I can pretty easily fill that, and when I do, certain
> services like VoIP don't work anywhere near as well. So I wind up
> playing with the queuing of traffic in the router in my home to
> work around the service rate limit at my ISP. As I type this
> morning (in a hotel in Taipei), the hotel provides an access
> network that I share with the other occupants of the hotel. It's
> not uncommon for the entire hotel to share a single path for all of
> its occupants, and that single path is not necessarily measured in
> Mbit/s. And, they tell me, the entire world is not connected by
> large fiber cores - as soon as you step out of the affluent
> industrialized countries, VSAT, 64 kbit/s links, and even 9.6 kbit/s
> access over GSM become the access paths available.
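
The general idea behind that home-router queuing is to shape outbound
traffic just under the ISP's rate cap and give VoIP strict priority,
so any queue builds in a box you control rather than at the ISP. A
minimal sketch in Python (the 900 kbit/s shaping rate and the two
traffic classes are illustrative assumptions, not the actual
configuration):

    # Sketch of shaping just under an ISP rate cap with a token bucket,
    # with strict priority for VoIP over bulk traffic. Rate, bucket depth,
    # and the two classes are assumed values for illustration.
    import collections, time

    RATE_BPS = 900_000 / 8      # shaping rate in bytes/second (assumed)
    BUCKET = 3000               # bucket depth in bytes (assumed)

    queues = {"voip": collections.deque(), "bulk": collections.deque()}
    tokens, last = BUCKET, time.monotonic()

    def enqueue(pkt_bytes, cls):
        queues[cls].append(pkt_bytes)

    def dequeue():
        """Send the next packet if the token bucket allows it."""
        global tokens, last
        now = time.monotonic()
        tokens = min(BUCKET, tokens + (now - last) * RATE_BPS)
        last = now
        for cls in ("voip", "bulk"):          # strict priority: VoIP first
            if queues[cls]:
                if queues[cls][0] <= tokens:
                    tokens -= queues[cls][0]
                    return cls, queues[cls].popleft()
                return None                   # wait for tokens to refill
        return None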
>
> As to measurement, note that we generally measure that
> overprovisioning by running MRTG and sampling throughput rates
> every 300 seconds. When you're discussing general service levels
> for an ISP, that is probably reasonable. When you're measuring time
> variations on the order of milliseconds, that's a little like
> running a bump counter cable across a busy intersection in your
> favorite downtown, reading the counter once a day, and drawing
> inferences about the behavior of traffic at the light changes
> during rush hour...
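
To make the sampling point concrete, a toy calculation (made-up
numbers, not a measurement): a link that is idle except for brief
bursts can show a few percent utilization on a 300-second average
while packets arriving during a burst see on the order of 100 ms of
queuing.

    # Toy illustration with assumed numbers: a 100 Mbit/s link that is
    # idle except for a 100 ms burst arriving at 200 Mbit/s every 10 s.
    # The long-term (MRTG-style) average utilization looks tiny, while a
    # packet at the tail of a burst waits behind a large backlog.
    LINK_BPS = 100e6               # link rate (assumed)
    BURST_RATE_BPS = 200e6         # arrival rate during a burst (assumed)
    BURST_SECONDS = 0.1            # burst length (assumed)
    PERIOD_SECONDS = 10.0          # one burst every 10 s (assumed)

    bits_per_period = BURST_RATE_BPS * BURST_SECONDS
    avg_util = (bits_per_period / PERIOD_SECONDS) / LINK_BPS
    backlog_bits = (BURST_RATE_BPS - LINK_BPS) * BURST_SECONDS
    worst_queue_delay = backlog_bits / LINK_BPS

    print(f"long-term average utilization: {avg_util:.1%}")            # 2.0%
    print(f"worst-case queuing delay: {worst_queue_delay * 1e3:.0f} ms")  # 100 ms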
>
> http://www.ieee-infocom.org/2004/Papers/37_4.PDF has an interesting
> data point. They used a much better measurement methodology, and
> one of the large networks gave them some pretty cool access in
> order to make those tests. Basically, queuing delays within that
> particular very-well-engineered large fiber core were on the order
> of 1 ms or less during the study, with very high confidence. But
> the same data flows frequently jumped into the 10 ms range even
> within the 90% confidence interval, and a few times jumped to 100
> ms or so. The jumps to high delays, I suspect, most likely reflect
> correlated high-volume data flows, either due to route changes or
> simply high traffic volume.
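
One common way to separate queuing delay from the fixed propagation
and transmission components in that kind of study (not necessarily the
exact methodology of that paper) is to take the minimum observed delay
on a path as the fixed part and look at the distribution of the
excess. A minimal sketch in Python:

    # Common technique (not necessarily that paper's exact method): take
    # the minimum observed delay on a path as the fixed propagation and
    # transmission component, and treat the excess over that minimum as
    # queuing delay. delays_ms is a list of delay samples for one path.
    def queuing_delay_percentiles(delays_ms, percentiles=(50, 90, 99)):
        base = min(delays_ms)                  # estimate of the fixed part
        excess = sorted(d - base for d in delays_ms)
        result = {}
        for p in percentiles:
            idx = min(len(excess) - 1, round(p / 100 * (len(excess) - 1)))
            result[p] = excess[idx]
        return result

    # Applied to the ping samples at the top of this message, the excess
    # over the ~132 ms floor reaches about 2034 ms at the maximum.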
>
> The people on NANOG and the people in the NRENs live in a certain
> ivory tower, and have little patience with those who don't. They
> also measure the world in a certain way that is easy for them.
>
>
> On Aug 23, 2005, at 12:13 AM, David Hagel wrote:
>
>
>> Thanks, this is interesting. I asked the same question on NANOG
>> and got similar responses: that queuing delay is negligible on
>> today's backbone networks compared to other fixed delay components
>> (propagation, store-and-forward, transmission, etc.). The responses
>> on NANOG seem to indicate that queuing delay is almost irrelevant
>> today.
>>
>> This may sound like a naive question. But if queuing delays are so
>> insignificant in comparison to other fixed delay components, then
>> what does it say about the usefulness of all the extensive
>> techniques for queue management and congestion control (including
>> TCP congestion control, RED and so forth) in the context of
>> today's backbone networks? Any thoughts? Are the congestion
>> control researchers out of touch with reality?
>>
>> - Dave
>>
>> On 8/21/05, David P. Reed <dpreed at reed.com> wrote:
>>
>>> I can easily and repeatably measure 40 msec. coast-to-coast
>>> (Boston-LA), of which around 25 msec. is accounted for by the
>>> speed of light in fiber (which is 2/3 of the speed of light in
>>> vacuum, 299,792,458 m s^-1, because the refractive index of fiber
>>> is approximately 1.5, i.e. 3/2). So assume 2e8 m/s as the speed
>>> of light in fiber and 1.6e3 m/mile, and you get 1.25e5 mi/sec.
>>>
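
Working that arithmetic through (the speed-of-light and metres-per-mile
figures are from the paragraph above; the ~2,600-mile Boston-LA
great-circle distance is my own assumption, and the real fiber route is
longer):

    # Propagation delay in fiber, using the figures quoted above.
    C_FIBER_M_S = 2e8                        # ~2/3 of c, per the text
    M_PER_MILE = 1.6e3
    c_fiber_mi_s = C_FIBER_M_S / M_PER_MILE  # 1.25e5 mi/s, as stated

    per_1000_miles_ms = 1000 / c_fiber_mi_s * 1e3
    print(f"{per_1000_miles_ms:.0f} ms per 1000 route-miles, each way")  # 8 ms

    bos_la_miles = 2600                      # assumed great-circle distance
    one_way_ms = bos_la_miles / c_fiber_mi_s * 1e3
    print(f"Boston-LA one way: ~{one_way_ms:.0f} ms")                    # ~21 ms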
>>> The remaining 15 msec. can be accounted for by the fiber path not
>>> being a straight line, or by various "buffering delays" (which
>>> include queueing delays, and scheduling delays in the case where
>>> frames are scheduled periodically and you have to wait for the
>>> next frame time to launch your frame).
>>>
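
For a rough sense of the frame-scheduling component: if transmission
slots recur every T seconds, a packet arriving at a random time waits
T/2 on average, and at most T, at each such hop. A small sketch with an
assumed 125-microsecond frame period and an assumed hop count:

    # Frame-scheduling wait: arrive at a random time, wait for the next
    # periodic transmission slot. Both numbers below are illustrative
    # assumptions, not measurements.
    FRAME_PERIOD_S = 125e-6
    HOPS = 10

    avg_wait_ms = HOPS * FRAME_PERIOD_S / 2 * 1e3
    max_wait_ms = HOPS * FRAME_PERIOD_S * 1e3
    print(f"average: {avg_wait_ms:.2f} ms, worst case: {max_wait_ms:.2f} ms")
    # ~0.62 ms average, 1.25 ms worst case over 10 such hops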
>>> Craig Partridge and I have debated (offline) what the breakdown
>>> might actually turn out to be (he thinks the total buffering
>>> delay is only 2-3 msec., I think it's more like 10-12), and it
>>> would be quite interesting to get more details, but that would
>>> involve delving into the actual equipment deployed and its
>>> operating modes.
>>>
>
>