[e2e] Agility of RTO Estimates, stability, vulnerabilities
Detlef Bosau
detlef.bosau at web.de
Sun Jul 24 02:08:55 PDT 2005
I'm somewhat confused that apparently hardly anyone is interested in this
topic. Perhaps it's a stupid one; if so, please explain to me why.
Perhaps I did not pose my questions clearly enough. I will give it
another try.
Q1: What are the semantics of the RTO? Is it correct to see the RTO as a
confidence interval for the RTT?
If so, I'm particularly confused by quite a few papers concerning
"spurious timeouts". Sometimes I get the impression that spurious
timeouts are some "strange phenomenon" which was "detected" by chance or
by accident. I don't know.
In addition, years ago a professor told me that the formula
RTO = RTT + 2*VAR was found by "probing" and "experiments".
O.k., he is the professor, not me, so he must be right there ;-)
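For comparison, the estimator that Van Jacobson's work put into BSD TCP,
and that RFC 2988 later standardized, uses gains of 1/8 and 1/4 and
RTO = SRTT + 4*RTTVAR. A minimal sketch of that update rule; the RTT
trace below is invented purely for illustration:

# Sketch of the Jacobson/Karels RTO estimator as standardized in RFC 2988.
def update_rto(srtt, rttvar, r, alpha=1/8.0, beta=1/4.0, g=0.1):
    """Feed one RTT measurement r; returns updated SRTT, RTTVAR and RTO."""
    if srtt is None:                      # first measurement
        srtt, rttvar = r, r / 2.0
    else:                                 # subsequent measurements
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
        srtt = (1 - alpha) * srtt + alpha * r
    rto = max(srtt + max(g, 4 * rttvar), 1.0)   # clock granularity g, 1 s floor
    return srtt, rttvar, rto

srtt = rttvar = None
for r in [0.20, 0.22, 0.35, 0.21, 0.80, 0.23]:  # invented RTT samples (seconds)
    srtt, rttvar, rto = update_rto(srtt, rttvar, r)
    print("RTT %.2f -> SRTT %.3f, RTTVAR %.3f, RTO %.3f" % (r, srtt, rttvar, rto))
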
So once again: is the RTO commonly seen as a confidence interval for the
RTT or not?
Craig wrote:
>> I believe the immediate issue is not the "RTO model" but rather the
>> question of what RTO estimator you use. In the late 1980s there was
>> a crisis of confidence in RTO estimators -- a problem we dealt with by
What's the meaning of confidence here?
I use "confidence interval" in its mathematical sense. An interval I is
a p confidence interfal vor a stochastic variable X, if an instance x of
X is in the interval I with propability p.
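Spelled out in my own notation (not taken from any standard), and read
as the one-sided bound that matters for a retransmission timer:

    P(X in I) >= p,   and in particular   P(RTT <= RTO) >= p,

i.e. with such an RTO a timeout would be spurious with probability at
most 1 - p.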
To my understanding, it's important for competing TCP flows to use
similar or equal confidence intervals here; otherwise we would hardly
achieve fairness.
So, we have basically two issues here.
The first one is the robustness issue: how robust are the RTO/RTT/...
estimates?
I don't want to discuss this here, because it is basically not a
TCP-related question. It is the question whether it is at least
_possible_ to estimate an RTT, and that is a requirement on the network
itself and its structure. To my knowledge, there are quite a few papers
around dealing with "self similarity". I'm not quite sure, but if e2e
latencies were in fact "self similar" (I use quotes here because the
term "self similar" is sometimes used without a satisfactory
mathematical definition), we could stop the discussion here. In that
case, there would be hardly any chance to obtain acceptable RTT
estimates. (I'm no expert here, but estimators often converge due to the
SLLN, the strong law of large numbers, or similar theorems, and these
carry an "i.i.d." assumption: identically and _independently_
distributed. In a self-similar series of stochastic variables I
_strongly_ doubt their independence.)
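One crude way to probe that independence assumption would be to look at
the lag-k autocorrelation of a measured RTT trace; correlations that
decay only very slowly over many lags are exactly the long-range
dependence the self-similarity literature describes. A minimal sketch,
with the trace values invented:

# Crude check of the i.i.d. assumption: lag-k autocorrelation of an RTT trace.
def autocorrelation(xs, k):
    n = len(xs)
    mean = sum(xs) / float(n)
    var = sum((x - mean) ** 2 for x in xs) / float(n)
    if var == 0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k)) / float(n)
    return cov / var

rtt_trace = [0.20, 0.22, 0.35, 0.33, 0.31, 0.24, 0.26, 0.40, 0.38, 0.37]  # invented
for k in (1, 2, 3):
    print("lag-%d autocorrelation: %.2f" % (k, autocorrelation(rtt_trace, k)))
# Large, slowly decaying correlations would undermine SLLN-style convergence
# arguments for the estimator.
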
So, at least _one_ assumption about the network is inevitable in order
to use sender-initiated, timeout-based retransmission: convergent
estimators for the timeout must exist.
Unfortunately, we do not know a priori about possible limits on the RTT;
in particular, there is no general upper bound. So it is somewhat
cumbersome to derive a (1-alpha) confidence interval directly from the
sample here. In fact, it is a common approach in statistics to derive
confidence intervals from estimates of the expectation and variance of a
stochastic variable. Often there is some implicit assumption about the
distribution function of this variable, e.g. Gaussian.
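To make that concrete: from a mean and a standard deviation alone, a
distribution-free bound (Cantelli/Chebyshev) and a Gaussian assumption
give rather different one-sided intervals for the same tail probability.
A toy sketch, with the 5% tail target and the sample numbers chosen by
me:

# From mean m and standard deviation s only, pick an upper bound u = m + c*s
# such that P(RTT > u) <= 5%, with and without a distributional assumption.
import math

m, s = 0.25, 0.05          # invented mean/std of an RTT sample, in seconds
p_tail = 0.05

# Cantelli (one-sided Chebyshev): P(X > m + c*s) <= 1 / (1 + c^2)
c_chebyshev = math.sqrt(1.0 / p_tail - 1.0)

# Gaussian assumption: c is the 95% quantile of the standard normal, ~1.645
c_gauss = 1.645

print("distribution-free bound: %.3f s" % (m + c_chebyshev * s))
print("Gaussian bound:          %.3f s" % (m + c_gauss * s))
# Without a distributional assumption the interval is much wider, which is one
# way to read the factor-of-a-few multiplier on the deviation term in the RTO.
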
So, if we use RTT and VAR (as we do in TCP), we implicitly assume that
estimators for RTT and VAR _exist_.
But in principle, these estimators are not defined by TCP, they are
_assumed_ by TCP. Basically, we _assume_ the existence of RTT/VAR/RTO
estimators here and then we use them. And hopefully, we use ones
appropriate to the packet-switched network in use.
So once again, in short:
The RTO used in TCP is a confidence interval for the RTT.
TCP _assumes_ (if only implicitly) the existence of a reasonable RTO
estimator.
Is this correct?
O.k.
Then the next steps are:
- Identification of a generic estimator, if possible.
- Identification and elimination of vulnerabilities.
>> developing Karn's algorithm (to deal with retransmission ambiguity) and
>> improving the RSRE estimation algorithm with Van Jacobson's replacement.
O.k. Let's ignore the retransmission ambiguity for the moment.
(An easy way to overcome it would be to mark each TCP segment sent with
a unique identifier, which is reflected in the corresponding ACK. In
particular, if a TCP segment is sent more than once, it would be given a
different identifier each time it is sent. AFAIK this is the rationale
behind the "sequence number" in ICMP.)
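A sketch of the bookkeeping such an identifier would buy, compared with
Karn's rule of simply discarding RTT samples from retransmitted
segments (the TCP timestamps option provides roughly this kind of
disambiguation); all names and numbers below are mine, purely
illustrative:

# Illustration of retransmission-ambiguity handling; all names are invented.
import time

outstanding = {}   # transmission_id -> send timestamp

def send_segment(seqno, transmission_id):
    # Each (re)transmission of the same seqno gets a fresh transmission_id,
    # which the receiver would echo back in its ACK (much as ICMP echoes its
    # sequence number).
    outstanding[transmission_id] = time.time()

def rtt_sample_from_ack(transmission_id):
    sent_at = outstanding.pop(transmission_id, None)
    if sent_at is None:
        return None                    # unknown or duplicate ACK: no sample
    return time.time() - sent_at       # unambiguous, even for a retransmission

send_segment(seqno=1000, transmission_id=1)    # original transmission
send_segment(seqno=1000, transmission_id=2)    # retransmission of the same data
print(rtt_sample_from_ack(2))                  # sample clearly belongs to copy #2

# Karn's algorithm, by contrast, simply takes no RTT sample for a segment that
# was retransmitted, because a plain cumulative ACK cannot tell the copies apart.
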
Q2: What are other vulnerabilities and implicit assumptions?
- Are there assumptions concerning the latency distribution?
- Are there assumptions concerning the latency _stability_? What about
latency oscillations?
In other words: What is the system model behind the RTT estimators used
in TCP?
What are the _requirements_ for TCP to work properly? Can we make
implicit assumptions explicit?
Which requirements must be met by a network so that TCP can work without
problems?
Is this question stupid? If not: Is there existing work on this issue?
If so, I would appreciate any hint.
Detlef Bosau
--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937