[e2e] Agility of RTO Estimates, stability, vulneratibilites

Wed Aug 3 12:54:50 PDT 2005

Christian Huitema wrote:
> I think we should just look at a simple question. Does the current
> algorithm actually works? 
> 
> I personally did measurements 6 years ago. The measurement of
> tcp-connect times to various web servers clearly showed a power law
> distribution. There is in fact a history of finding power laws in
> measurement of communication systems. In fact, Mandelbrot work on
> fractals started with an analysis of the distribution of errors on a
> modem link! Based on all that, it is quite reasonable to assume that the
> distribution of RTT measurement follows a power law. 
> 

Hm. I believe I remember some newspaper article, where the origin for 
the work on "the fractal geometry of nature" was the question: How long 
is the coast of England?

Shortly afterwards, we typically learn that a butterly in the Himalaya 
may cause a tornado in Europe.

When I attendet lessons in stochastics, I was told: When we think there 
may be a stochastic behaviour, we must consider where the stochastic 
behaviour is supposed to come from. Do we really _expect_ this 
behaviour? And why do we?

It´s the same with the whole thing of chaos theory, self similarity and 
its variations.

1.: What does it describe exactly? (I frequently miss precise definitions.)
2.: Where does chaotic/self-similar/.... behaviour come from? (It´s not 
enough to list up occasional observations. Why it´s reasonable to assume 
a hehaviour like that? Is there hard evidence, that e.g. latencies are 
self similar?
3.: What do we learn from that behaviour? Does it end in itself? Or can 
we really tell about "lessons learned" from the self similarity debate?

> People will immediately mention that it should be a truncated power law,
> but even that is far from clear. There is at least anecdotal evidence of
> packets being held up in queues and then transmitted after a very long
> time, e.g. half an hour...
> 

Does not sound like a solid basis.

> The current RTT estimators are based on exponential averages of
> consecutive samples of delays and variations. This is an issue, as the
> exponential average of a heavy tailed distribution also is a heavy
> tailed distribution. If you plug that in a simulation, you will observe
> that the estimates behave erratically. 

O.k. After having played around for a few minutes with EWMA filters in 
Octave, I´ve seen that even the settling behaviour is simply disastrous.

When we keep in mind that Internet latencies vary from some microseconds 
  (10e-6) in an Ethernet segment to some hundred _seconds_ (sic!) in 
some mobile wireless networks (10e2) then we see that Internet latenies 
vary on a scale covering at least eight orders of magnitude.

When we keep in mind further, that the Internet is dominated by short 
term flows (20 packets or so), then we must conclude, that an ordinary 
TCP flow is quite unlikely to see even _one_ correct RTT estimate in its 
whole lifetime.

Is this correct?

Now, to my knowledge, we use an initial value about 2 seconds, which is 
a reasonable upper limit for quite a few internet connections and 
therefore, during a flow´s lifetime some few distracting RTT 
measurements do not really matter.

So, from a "practicioner´s view", TCP "works". "Somehow".

However, as soon as we are confronted with latencies larger than this 
initial value or subject to variation on a large scale, the situation 
deteriorates.

> 
> My personal feeling is that the current RTT estimators do not actually
> work.
> 

What should be considered bad news ...

However, I would like to focus the problem a little bit more on a hone 
hop scenario. The reason for doing so is, that after having read the 
works by Zhang, Jain and Edge two problems become evident.

1.: RTT estimators suffer from poor convergence and a problem with their 
initial value.

2.: RTT estimators suffer from a poor forecast capability.

Their are numerous other difficulties, e.g. Edge´s assumptions, however 
I think these can be handled as well as 1.

The hard problem may be 2.

Let´s consider one hop. Two routers, r1 and r2, one link in between.
(Routers: These systems may well be IS in an arbitrary network path.)

r1--------------------------r2

Consider one packet.

t1: packet´s arrival time on r1.
t2: packet´s arrival time on r2.

If a packet is yet to arrive on r1 latest at a time now + delta and 
delta is known, can we forecast the estimation of (t2-t1)?

-- 
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937