[e2e] Spurious Timeouts, Fact or Fake?
Detlef Bosau
detlef.bosau at web.de
Wed Aug 3 07:08:54 PDT 2011
In the recent past, this list has seen quite a few posts regarding TCP
RTT measurement.
Now, first of all, I was interested in how often RTT measurements should
be taken and how they can be taken. A particular concern is Karn's
algorithm because, to my understanding, its consequence is that RTT
measurements obtained with a single RTT timer can only be taken when the
sender has no retransmitted segments outstanding.
Perhaps, I'm wrong here.
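To make my reading concrete, here is a rough sketch in Python of the
Karn & Partridge sampling rule for a timer-based sampler (field names
such as seg.retransmitted are made up for illustration):

    def rtt_sample_on_ack(seg, now):
        # Karn & Partridge: an ACK for a segment that was transmitted more
        # than once is ambiguous (it may acknowledge the original or the
        # retransmission), so it yields no RTT sample.
        if seg.retransmitted:
            return None
        return now - seg.first_send_time

If I read the paper correctly, the backed-off RTO is also retained until
a sample from a never-retransmitted segment has been obtained.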
However, from what I have read so far, it is not yet completely clear
how often RTT measurements should be made. The alternatives discussed so
far are:
- once per round trip,
- once per packet.
The latter appears appealing to me, particularly when implemented with
timestamps (RFC 1323), which overcome the problem discussed by Karn &
Partridge of packets being sent more than once; however, some literature
indicates problems with the SRTT estimator when timestamps are in use.
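As far as I understand the timestamp option, the echoed TSecr value
makes the sample unambiguous even after a retransmission, so every ACK
can contribute a sample. Roughly, again only a sketch with made-up field
names:

    def rtt_sample_with_timestamps(ack, now):
        # RFC 1323: the receiver echoes the sender's timestamp (TSecr), so
        # the measured time refers to a concrete transmission and samples
        # can be taken from every ACK, retransmitted segments included.
        return now - ack.tsecr

Please correct me if this misrepresents the RTTM mechanism.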
Now, the whole discussion is somewhat confusing to me.
For one thing, spurious timeouts confuse me, because spurious timeouts
(i.e. a segment is delivered successfully, but its ACK does not reach
the sender before the retransmission timer expires) are basically
expected by Edge's paper and the literature based upon it. However,
there are papers around which question the mere existence of spurious
timeouts, e.g.
author = "Francesco Vacirca and Thomas Ziegler and Eduard Hasenleithner",
title="{TCP Spurious Timeout estimation in
an operational GPRS/UMTS network}",
month="May",
year="2005",
journal = "Forschungszentrum Telekommunikation Wien
Technical Report
FTW-TR-2005-008"
}
, while others give detailed recommendations how to deal with spurious
timeouts in practical implementations, e.g.
http://tools.ietf.org/search/draft-allman-rto-backoff-02
However, to me the problem seems closely coupled to the underlying
question of whether or not we can estimate the expectation and variance
of the RTT in a TCP session. Edge requires the underlying stochastic
process to be weakly stationary. In other words: in a TCP session, once
it has started and run for some settling time, the observed RTTs shall
be, at least roughly, identically distributed, and this distribution
should be subject to only very slow and very rare change, if any.
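Just to make sure we mean the same thing, by "weakly stationary" I take
the textbook conditions (with RTT_t the t-th observed RTT):

    E[RTT_t] = mu                        for all t   (constant mean)
    Cov(RTT_t, RTT_{t+h}) = gamma(h)     for all t   (depends only on the lag h)

i.e. in particular a constant, finite variance gamma(0).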
And according to RFC 2988, we can obtain SRTT and RTTVAR from RTT
samples using the well-known EWMA estimators.
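For reference, my reading of the RFC 2988 computation as a small Python
sketch (the variable names are mine; g stands for the clock granularity):

    def update_rto(srtt, rttvar, r, g=0.001):
        # RFC 2988 estimators with alpha = 1/8, beta = 1/4, K = 4.
        # srtt/rttvar are None before the first sample; r is a new RTT sample.
        if srtt is None:
            srtt, rttvar = r, r / 2.0
        else:
            # RTTVAR is updated first, using the old SRTT (the order matters).
            rttvar = 0.75 * rttvar + 0.25 * abs(srtt - r)
            srtt = 0.875 * srtt + 0.125 * r
        rto = max(srtt + max(g, 4.0 * rttvar), 1.0)  # RFC 2988: RTO >= 1 second
        return srtt, rttvar, rto

Whether these fixed gains are generic enough is exactly what question 3
below asks.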
So, my questions are:
1.: How often should RTT measurements be made?
2.: Is it reasonable to assume "weakly stationary" RTTs as done by Edge?
3.: Are the EWMA filters from RFC 2988 satisfactory? In particular, are
they sufficiently generic to yield reasonable results for an arbitrary
TCP session?
One could summarize these in a single question: do we obtain the RTO in
a reasonable way? And when we talk about spurious timeouts, are we
really talking about spurious timeouts, or are we talking about
shortcomings of the SRTT and RTTVAR estimators?
I'm somewhat confused here at the moment. And I would appreciate any
enlightenment ;-)
Detlef
--
------------------------------------------------------------------
Detlef Bosau
Galileistraße 30
70565 Stuttgart Tel.: +49 711 5208031
mobile: +49 172 6819937
skype: detlef.bosau
ICQ: 566129673
detlef.bosau at web.de http://www.detlef-bosau.de
------------------------------------------------------------------