[e2e] Expected latency for a single hop
Detlef Bosau
detlef.bosau at web.de
Thu Aug 4 11:27:27 PDT 2005
David P. Reed wrote:
> Detlef - Though it seems simple, your statement is about as complex as a
> problem can be.
> This is the kind of problem statement that creates the definitional trap
> I was referring to in earlier discussions. By construing the "latency"
> as being a propery of the "link" rather than of the network as a whole,
> the statement acquires a misleading simplicity
>
I know. However, the rationale behind my question is quite obvious: If
you place a TCP sender at n1 and the according receiver at n2, the
adaptive RTO mechanism in TCP exactly relys upon estimated mean and
variance of (t2-t1).
If I had written: "Can we provide an adaptive RTO for a single hop TCP
connection?" I surely had been directed to the relevant literature.
Perhaps, one had considered it a stupid question.
Thus, I thought it might be useful to state the same problem(sic!) which
TCP claims to solve (even for n hops!) in somewhat different words >:-)
Honestly, I believe, if we cannot estimate mean and variance for a
_single_ hop, it´s perhaps not that much easier to do the job for an
arbitrary number of hops (which of course includes the nasty case of a
single hop).
> The latency only is well defined for real packets that actually arrive
> and traverse the link. Expectation and variance are properties of
> distributions, not packets.
>
Yes. The intention of EWMA filtering used in TCP is an attempt to do a
parameterless forecast of mean and variance the actual latency distribution.
(Some weeks ago someone refered me to this well known saying by Niels
Bohr: "Prediction is hard, especially of the future.")
Thus, of course we estimate properties of an unknown (!) distribution
and in turn derive an RTO by application of an inequality similar to
Chebyshev´s inequality.
However, the basic assumption is that we can provide estimates for mean
and variance of a packets round trip time.
> There is no random process at all on the link itself (at least in the
> common case - there are links where the link itself has a random delay,
> but that usually arises where the link's physical characteristics vary
> faster and larger than the queue management and link pacing
My assumptions on l are tough. I totally agree with Craig here. In
general, we do know nothing about l. It may be a tunnel, it may be a
mobile wireless link. E.g. for mobile wireless links, I do not even know
whether a finite variance for the link´s latency distribution exists.
> mechanisms). The random process is the network environment that
> provides competing packets. So the latency is everywhere but the link
> itself.
>
> The other issue is that prediction is more reliable over a collection of
> packets, but a sufficient collection cannot happen in an instant.
>
I did not make any assumptions here, especially I did not assume that
the estimation should be based upon the observation of the one packet.
Perhaps my formulation was somewhat misleading here.
We could use testpackets sent from n1 to n2 or observe traffic from
several flows. If n1 and n2 are routers and we can observe a large
number of flows, the job should be much easier than if done at a TCP
flows source which has to rely on a _very_ rough sample.
My favourate expample is always TCP including a 2400 bps link. (Nowadays
forgotten, two years ago known from GSM - and we all know it from the
good old modem times.) Depending on MSS, the sender gets a sample every
one second or so. For wirebound systems in between one second is _ages_.
In one of my posts, I claimed (please correct me, if I´m wrong) that in
contemporary networks, even link bandwidths cover a range of eight
orders of magnitude. When our single packet crawls along a GSM link,
a Tier 1 backbone link may convey the whole Encyclopedia Britannica
within the same period of time.
However, in a scenario
Sender----Tier1/Enc.Brit. Link--------router----GSM---receiver
the sender estimates mean and variance of the round trip time using EWMA
filters and the extremely rough time series gained from the ACK packets.
Question: Is there a justification for doing so?
I looked at Edge´s paper, esecially for the assumptions for the
observation variables, i.e. the time series Tn (Tn: stocastic variables,
tn: instances.)
One sufficient assumption is that all Tn share the same mean and variance.
Some "drift" is accepted as well as are "occasional steps" (put in my
own words).
When I look at my "Britannica example" and consider "sane mean and
variance", I do not feel comfortable with these assumptions.
> The first order predictor is the queue size at the entry to the link.
> That's a very reliable predictor of latency for the next event. But it
> provides very little input about variance (which depends entirely on
> packets arriving from elsewhere at "light speed").
>
> I think there might be a much better (i.e. less complex to state)
> approach in NOT trying to start with the link and go by induction to the
> multilink case. Instead, perhaps start with an end-to-end flow (over a
> path) and reason about what happens as you add flows that superpose
> themselves on the existing paths.
>
Is this really that more promising? Admittedly, this a rhetorical
question. Basically, this is already being done. So I sharpened it a
little bit by omitting n-1 links in the n link case ;-)
Or is it a matter of how coarse or fine grained we look at the problem?
I´m thinking about this problem - and at the same time, I use TCP and
everything seems just to be fine :-) Then I read Raj Jain´s paper about
the divergence of RTO estimators and Lixia Zhang´s paper on TCP timers.
And I understand that we already addressed a number, perhaps nearly all,
issues in these papers. But one issue which I do not yet understand is
the use of EWMA filters.
- Do they hold for arbitrary TCP connections? Can we reasonable assume
the necesseary conditions given by Edge? Or alternative ones?
- Do they converge fast enough in case of a sudden step in latency? Do
they follow drifting latencies? How must we set the gain? I sometimes
here something about "agility" and "stability". Basically, we should
minimize the forecast error by proper choice of the gain. Can we use the
same gain for all flows?
- Is the temporal resolution of an ACK clocked TCP flow sufficient to
provide reasonable estimates? Or is the time series´ resolution obtained
from that too coarse? (Nilsson, Martin and Rhee do so claim in there
paper on lateny change / congestion correlation in June, 2003. One
central point there was that the temporal resolution of observed round
trip times in most cases is by far too coarse to derive reasonable
conclusions concerning path properties.)
I get no "feeling" for this situation. I see lots of scenarios and
individual papers there, but I don´t see the big picture yet.
Detlef
--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937
More information about the end2end-interest
mailing list