[e2e] Open the floodgate
David P. Reed
dpreed at reed.com
Thu Apr 22 06:26:50 PDT 2004
Alex - your note reflects a tremendous misunderstanding of TCP. Is TCP
supposed to correct for failing hardware anywhere on the path? The answer
is no. TCP is a protocol that provides end-to-end error control - which
ensures error-freeness over best efforts networks.
Does TCP obviate the need for local error control, because it does it on an
end-to-end basis? No - it was never supposed to.
The end-to-end analysis applies here:
1. End-to-end reliability cannot be provided at the link level. Thus we
must provide it on an end-to-end basis.
2. A vast improvement in the operating point can be achieved by
doing local error recovery at the link (or within an AS) level - local error
recovery allows for tighter control loops, and should be done without
adding to the end-to-end delay. Thus it is appropriate to optimize
(improve) the performance of links by retransmission.
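To put rough numbers on the two points above, here is a minimal sketch under a deliberately simple model that is an assumption of this example, not something from the original argument: every hop loses a frame independently with the same probability, and a link may retry locally a fixed number of times. The function names and parameters are hypothetical. It shows that local retries shrink the number of end-to-end attempts sharply, yet the residual loss never reaches zero, so the end-to-end check is still required.

```python
def e2e_success_probability(hops, link_loss, link_retries=0):
    """Probability that one end-to-end attempt crosses every hop.

    Assumed model: each hop loses a frame independently with probability
    link_loss and may retransmit locally up to link_retries times.
    """
    # A hop fails only if the first send and every local retry are lost.
    residual_loss = link_loss ** (link_retries + 1)
    return (1.0 - residual_loss) ** hops


def expected_e2e_attempts(hops, link_loss, link_retries=0):
    """Mean number of end-to-end sends until one succeeds (geometric)."""
    return 1.0 / e2e_success_probability(hops, link_loss, link_retries)


if __name__ == "__main__":
    for retries in (0, 1, 3):
        attempts = expected_e2e_attempts(hops=10, link_loss=0.05,
                                         link_retries=retries)
        print(f"link retries={retries}: "
              f"expected end-to-end attempts={attempts:.4f}")
```

The model covers only loss on links. It says nothing about errors introduced inside hosts or routers, which is another reason the end-to-end check cannot be delegated to the links.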
The second point also captures what Bob Kahn crystallized in creating the
Internet - the concept of "best efforts". The word "best" clearly does
not mean "no effort". What it means is a subset of the end-to-end
argument - do what you can where what you do is unambiguously helpful, but
don't take on the impossible burden of assuring high-level properties with
low-level mechanisms.
The worst botch I have ever seen in my consulting to commercial network
installations was a Fortune 500 company that really misunderstood
this. They had been convinced to put in frame relay links between all
their sites, and to use frame relay's "perfect end-to-end" delivery mode
between their locations. That's not a "best efforts" link if you think
about it - it's a stranded soldier maintaining fanatical adherence to duty
20 years after the war is over.
What happened? If any link downstream failed (turned off), the frame
relay link started filling buffers in every underlying switch. It took
many seconds to fill up, then when the downstream link came back up, it
dumped many seconds' worth of completely useless traffic into the destinations.
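The failure mode can be reproduced with a toy queue. This is only a sketch under assumed simplifications (one packet arrives per tick, the outage starts at tick 0, and a recovered link drains the whole queue in a single tick); the function and parameter names are hypothetical and do not model any real frame relay switch. It compares holding traffic during the outage with simply dropping it.

```python
from collections import deque


def simulate_outage(outage_ticks, total_ticks, buffer_limit, hold_during_outage):
    """Return the age in ticks of every packet delivered downstream.

    Assumptions: one packet arrives per tick; the downstream link is down
    for the first outage_ticks ticks; once up, it drains the queue fully.
    """
    queue = deque()          # arrival times of buffered packets
    delivered_ages = []
    for now in range(total_ticks):
        link_up = now >= outage_ticks
        if link_up or hold_during_outage:
            if len(queue) < buffer_limit:
                queue.append(now)
        # Otherwise: best-efforts behaviour, the packet is dropped.
        if link_up:
            while queue:
                delivered_ages.append(now - queue.popleft())
    return delivered_ages


if __name__ == "__main__":
    print("hold:", simulate_outage(5, 8, 100, hold_during_outage=True))
    print("drop:", simulate_outage(5, 8, 100, hold_during_outage=False))
```

With holding enabled, the first thing the destination sees after recovery is a burst of packets that are already several ticks old, which the endpoints have most likely retransmitted or given up on. With dropping, only fresh traffic arrives.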
The frame-relay sales engineer just could not understand why turning off
his low-level reliability made his customer happier. In fact, he kept
trying to get them to turn it back on - saying that the problem must have
been with the routers.
Ultimately, this is the all-too-human problem of perseverating based on an
incorrect theory of the world. There's nothing wrong with theories, but
their utility depends on matching their assumptions to reality. The
reality of the Internet is not the reality of traditional control theory.
Control theory
- in the presence of competing and evolving goals at the user level (no
single objective function to maximize, but instead a need to develop the
most flexibility - that is the most diverse set of stable operating points
in control theoretic terms) and
- in the presence of highly coupled interactions with the clients (the WWW
invented caching, which changed the operating point in a completely
unpredictable way, without consulting the network planners) and
- in the presence of an evolving set of underlying communications technologies
is now a new science. This is partly because of people like John Doyle and
Sally Floyd who took on the challenge of constructing a new control theory
to match the requirements of the Internet. Yes, everyone involved in
developing TCP knows control theory. But few of them have the illusion
that the world exists to fit that theory.