[e2e] Agility of RTO Estimates
Detlef Bosau
detlef.bosau at web.de
Fri Jul 15 12:18:38 PDT 2005
Craig Partridge wrote:
> In message <42D7EB1D.8050003 at web.de>, Detlef Bosau writes:
>
>
>>My question is, with respect to mobile wireless networks such as UMTS or GPRS:
>>How "quickly" does RTO adapt? I expect this is restricted by the ES-ES
>>latency, the packet rate (i.e. "sampling rate"), the burstiness of
>>traffic etc.
>>Can this "RTO model" follow e.g. the latency variations met on the
>>mobile network in "real time"?
>>Or are there basic limitations? (At least, I expect so.)
>
>
> I'll take a stab at this and be delighted to be corrected by others who
> know better.
>
> I believe the immediate issue is not the "RTO model" but rather the
> question of what RTO estimator you use. In the late 1980s there was
Basically, it's the same question. Maybe I was being confusing there.
The "RTO model" consists of
1. the RTT estimate,
2. the variation estimate,
3. the "recipe", how you "cook" a confidence interval from those parameters.
> a crisis of confidence in RTO estimators -- a problem we dealt with by
> developing Karn's algorithm (to deal with retransmission ambiguity) and
> improving the RSRE estimation algorithm with Van Jacobson's replacement.
>
> Van did a bunch of testing of his estimator on real Internet traffic and
> looked to see how often the estimator failed. (Note that spurious
> timeouts are only one failure -- delaying a retransmission overly long
> after the loss is also a failure.) He picked an estimator that was
> easy to compute and gave good results in the real world.
>
> If there's reason to believe the estimator today is working less well, we
> could obviously replace it. That doesn't mean the RTO model needs fixing.
I don't want to fix the RTO model itself - just so I'm not misunderstood.
I only want to understand the basic limitations. E.g.: the RTT estimate
("SRTT") _has_ to rely on a certain time series of RTT observations
taken from the flow.
Similar to Konrad Adenauer's sentence: "Nehmen Se de Menschen, wie se
sind. Andere jibt et nich." :-) Or in English (hopefully the
translation is not too bad): Adenauer advised us to take the people as
they are; there are no others.
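For concreteness, here is a minimal sketch (in Python, with my own
variable names) of the standard estimator, i.e. Van Jacobson's
SRTT/RTTVAR computation as later written down in RFC 6298; the
constants are the usual ALPHA = 1/8, BETA = 1/4 and K = 4, and the
clock granularity G is an assumed value:

ALPHA = 1/8      # gain for the RTT estimate (SRTT)
BETA  = 1/4      # gain for the variation estimate (RTTVAR)
K     = 4        # width of the "confidence interval"
G     = 0.001    # clock granularity in seconds (assumed value)

def init_rto(first_rtt):
    # First RTT sample of the connection (RFC 6298, section 2.2).
    srtt = first_rtt
    rttvar = first_rtt / 2
    rto = srtt + max(G, K * rttvar)
    return srtt, rttvar, max(rto, 1.0)

def update_rto(srtt, rttvar, rtt_sample):
    # Every further sample (RFC 6298, section 2.3); RTTVAR is
    # updated before SRTT, and the RTO is floored at 1 second.
    rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - rtt_sample)
    srtt = (1 - ALPHA) * srtt + ALPHA * rtt_sample
    rto = srtt + max(G, K * rttvar)
    return srtt, rttvar, max(rto, 1.0)

This is exactly the three-part "model" from above: an estimate, a
variation estimate, and a recipe for the confidence interval.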
BTW: Spurious timeouts and delaying a retransmission overly long are
basically the same problem. In each statistical test, you have two
kinds of errors.
Errors of the first kind: falsely rejecting a correct null hypothesis.
If your null hypothesis is "the packet is correctly delivered and
acknowledged", a spurious timeout is an error of the first kind.
Then there are errors of the second kind: falsely "accepting"
(precisely: "not rejecting", because a test only decides whether or not
to reject the null hypothesis) a wrong null hypothesis.
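To make the test analogy concrete: if one assumed (purely for
illustration, the RTT distribution is of course not really Gaussian)
that the RTT of a correctly delivered packet were normally distributed,
the probability of an error of the first kind is simply the tail mass
beyond the RTO:

from math import erf, sqrt

def spurious_timeout_prob(rto, rtt_mean, rtt_std):
    # P(RTT > RTO | packet delivered): the null hypothesis
    # "delivered and acknowledged" is true, but the timer fires
    # before the ACK arrives - an error of the first kind.
    z = (rto - rtt_mean) / rtt_std
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))   # Gaussian tail

# Invented numbers: SRTT 200 ms, deviation 50 ms, RTO = SRTT + 4*dev.
print(spurious_timeout_prob(0.400, 0.200, 0.050))   # about 3e-5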
Back to RTT estimators.
You have to rely on a certain time series. Depending at least on your
throughput, this series is restricted to a certain "sampling rate".
Consequently, the resolution of the estimator, i.e. its ability to
follow network property changes at their original bandwidth, is limited.
A concrete example: the properties of a UMTS channel may change
extremely quickly. The transport latency for a radio block may change
several times even _within_ one IP packet (which may be split into
several radio blocks for transmission). Thus the end-to-end latency
for a packet may change several times within one packet transmission.
It is obvious that an RTT estimate _cannot_ follow these changes,
independent of the chosen estimator.
(It is a very rough analogy, but I always think of Shannon's sampling
theorem here.)
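To illustrate the analogy (with a synthetic latency process and
invented numbers, not real UMTS measurements): feed a sinusoidal
latency into the standard estimator, once oscillating faster than the
sampling rate and once much slower. When the variation is too fast to
follow, it all ends up in RTTVAR, so the RTO is simply inflated
instead of tracking the latency:

import math

ALPHA, BETA, K = 1/8, 1/4, 4

def mean_rto_margin(rtts):
    # Average of (RTO - instantaneous RTT) over a trace of samples.
    srtt, rttvar, margins = rtts[0], rtts[0] / 2, []
    for r in rtts[1:]:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)
        srtt = (1 - ALPHA) * srtt + ALPHA * r
        margins.append(srtt + K * rttvar - r)
    return sum(margins) / len(margins)

base, amp, n = 0.200, 0.150, 400    # seconds / samples, invented values
fast = [base + amp * math.sin(2 * math.pi * i / 3)   for i in range(n)]
slow = [base + amp * math.sin(2 * math.pi * i / 150) for i in range(n)]
print("fast variation:", mean_rto_margin(fast))   # large, inflated RTO
print("slow variation:", mean_rto_margin(slow))   # small, RTO tracks

In this toy case the price of the fast variation is not a spurious
timeout but an RTO far above the instantaneous latency, i.e. the second
failure mode mentioned above: a retransmission delayed overly long.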
>
> Second point is that the RTO model now works in concert with other
> mechanisms. I.e. it used to be that we relied only on RTO to determine
> if we should retransmit. Now we have Fast Retransmit to catch certain
> types of loss.
>
...which of course raises other questions, e.g. whether packet
reordering is negligible or not.
However, for the moment I am not thinking about that.
The underlying question in fact is: if I could place a bandwidth
restriction upon network property changes (don't ask me how ;-), but
for the moment let's assume I could), which restriction would be enough
to allow the RTT and variation estimators to follow network properties
"quickly enough"? I.e. to keep the risk of spurious timeouts etc. at a
constant level?
Please note: I do not say _avoid_ here, because in a test the level of
significance _is_ the probability of an error of the first kind.
Particularly for spurious timeouts, that means these are not restricted
to wireless networks but are an inherent (and inevitable!) part of TCP
which is met on _all_ networks.
In other words: what (bandwidth) restrictions must be enforced on
network properties to maintain a "constant" level of significance for
the RTO test here?
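A toy way to look at that question numerically (all assumptions
invented, in particular the latency model of a base RTT plus occasional
delay spikes; this is not a claim about real UMTS): hold the sampling
rate fixed, vary how often the latency jumps, and count how often a
correctly delivered packet's RTT exceeds the RTO armed from the
previous samples, i.e. the empirical rate of errors of the first kind:

import random

ALPHA, BETA, K = 1/8, 1/4, 4

def spurious_rate(spike_prob, n=100000, base=0.2, spike=0.5, seed=1):
    rng = random.Random(seed)
    srtt, rttvar, spurious = base, base / 2, 0
    for _ in range(n):
        rto = srtt + K * rttvar
        extra = spike * rng.random() if rng.random() < spike_prob else 0.0
        rtt = base + extra
        if rtt > rto:
            spurious += 1    # the timer would fire, yet the ACK arrives
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - rtt)
        srtt = (1 - ALPHA) * srtt + ALPHA * rtt
    return spurious / n

for p in (0.001, 0.01, 0.1):
    print("spike probability", p, "-> spurious timeout rate", spurious_rate(p))

Of course such a toy tells us nothing about a real channel; it only
makes the question quantitative.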
I have been thinking about this for weeks now, and sometimes I fear
that I will have to rely only on simulations for this one. And I must
reveal a secret here: I hate simulations. Not only can simulations
"prove" everything and nothing - sometimes I fear that NS2 is for
networks what Google is for reality.....
(Not to be misunderstood: a well done simulation may provide useful
insight. However, it does not replace a thorough rationale for proposed
mechanisms.)
Detlef Bosau
--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937