[e2e] ECN, RED, dropping packets, etc
Simon Leinen
simon at limmat.switch.ch
Tue Apr 27 16:47:52 PDT 2004
RJ Atkinson writes:
> On Apr 27, 2004, at 10:05, David P. Reed wrote:
>> [...] If the router guys would just stop trying to "help" by
>> squirrelling packets away in buffers until the burden becomes
>> intolerable, and instead drop early, set ECN where possible,
>> etc. they would probably find that from their worm's eye view of
>> the world, it just gets better and better.
>>
>> To the extent that endpoint implementers are lazy, any stacks that
>> still don't support ECN should be motivated by smoother and higher
>> performance.
> David,
> Generally speaking, the router implementers did their job a few
> years back. Most routers (both the largish ones and the relatively
> lower cost "layer-3 switch" variety) support Random Early Drop, and
> other similar techniques -- and have for a few years now. Certainly
> we do. However, our experience is that most operators do not enable
> those features (unclear to me why, possibly just conservative in
> what they deploy, others here might be better informed as to why).
Ran,
your description of the situation sounds correct (except that pedants
might say that RED actually stands for Random Early *Detection* :-).
Many/most routers have some form of RED, with varying degrees of
tunability, sometimes per-class. I don't know how many high-speed
routers support ECN, and have certainly never seen it in a data sheet.
Cisco does have a RED implementation that supports ECN, but I don't
think it works on high-speed routers like the 12000 GSR or the
Catalyst 6500/7600 OSR (where it would have to be done "in hardware").
And yes, most operators don't enable RED. As you suspect,
conservatism is part of this. If router vendors don't enable RED by
default, there must be a reason... (if there are routers where RED or
some other active queue management scheme is on by default, I'd like
to hear about it!).
It is considered hard to configure the original RED "right" for a
given situation - this is probably one reason why router vendors don't
make it the default. There are newer variants of active queue
management (Gentle RED, REM, Blue, etc.) that claim to outperform the
original RED, to be easier to tune or to need no tuning at all, to solve
world hunger etc. So some operators may think they should wait for
the research community to converge on the perfect AQM scheme, and for
router vendors to implement it.
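To make the tuning problem concrete, here is a minimal sketch of the
classic RED drop decision. It is a simplification, not any vendor's
implementation: the class name, the parameter names and the default
values are illustrative assumptions, and it leaves out the
count-based spacing of drops and the idle-time correction of the
original algorithm. The four knobs (min_th, max_th, max_p, weight)
are what an operator would have to get "right".

```python
import random


class RedQueue:
    """Simplified RED: EWMA of the queue length plus a linear drop ramp.

    All names and default values are illustrative assumptions.
    """

    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002, rng=None):
        self.min_th = min_th      # below this average: never drop
        self.max_th = max_th      # at or above this average: always drop
        self.max_p = max_p        # drop probability just below max_th
        self.weight = weight      # EWMA weight given to the newest sample
        self.avg = 0.0            # smoothed queue length, in packets
        self.rng = rng or random.Random()

    def update_average(self, queue_len):
        """Fold the instantaneous queue length into the moving average."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self):
        """Map the current average queue length to a drop probability."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        ramp = (self.avg - self.min_th) / (self.max_th - self.min_th)
        return self.max_p * ramp

    def should_drop(self, queue_len):
        """Decide the fate of an arriving packet (True means drop it)."""
        self.update_average(queue_len)
        return self.rng.random() < self.drop_probability()
```

An ECN-capable variant would mark the packet instead of dropping it
while the average sits between the two thresholds, which is the
combination David is asking for.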
I'm an operator. I love RED. In the bad old days, we used to
provision our EXPENSIVE transatlantic links according to the following
rule of thumb: An upgrade (additional E1 link) was scheduled so that
when it arrived, the loss rate during peak hours would be around 4-5%
- this was what we still considered tolerable for our users. The load
graphs for these links were flat at 100% from 9AM to 8PM or so. On
these links, turning on RED certainly had a very beneficial effect,
bringing the transatlantic RTTs down from 150-200 ms to 120 ms (with
some tuning), with no perceptible change in link utilization.
Unfortunately I was so happy with the results that I never switched
back to tail-drop in order to do a proper comparison... did I mention
that I had forgotten to do methodologically sound measurements of how much
our network had sucked before? I guess this is why these kinds of
success stories don't tend to get published (with the exception of
Sean Doran's 2 Mb/s access line utilization graphs before and after
RED), while there are of course many papers that tell you how RED is
hard to control and can perform worse than tail-drop.
In today's generation of our backbone, we thankfully don't have
congested links anymore, so we never enabled RED on our new routers.
But I'm sure that RED would be useful for millions of "broadband"
Internet users who fill their access link with bulk traffic, but would
like to run interactive applications at the same time. It is in this
space, if anywhere, that I could imagine active queue management
starting to become "on by default".
> And, by the way, most "core WAN backbone routers" (e.g. the kit that
> Juniper make) do tend to have (trans-Pacific RTT * interface speed)
> milliseconds of buffer on each interface. That buffer is *designed*
> to operate empty -- so that it can handle transients without
> dropping any packets inside the WAN core. I have not myself seen
> any data indicating that such packet buffer is operating in any
> other way in practice. Measured data from some WAN operator's core
> routers would likely help focus the discussion on that tangent.
> [A quick note on terminology might be in order here. Most
> "enterprise core routers" only have {Gig, 10 Gig} Ethernet
> interfaces and are typically built out of what marketing folks call a
> "layer-3 switch". These latter are very different beasts than a
> "core WAN backbone router". These "layer-3" switches are generally
> thin on per-interface packet buffering (for cost reasons; enterprise
> end users won't pay the extra cost to get deep buffering on each
> interface) and many of them (though not all, sigh) are designed to
> have non-blocking switch fabrics and I/O configurations.]
Right, we use these kinds of "glorified campus switch" boxes for our
backbone - they are much cheaper than "real" core routers, especially
if you can run over GbE/10GE (LAN PHY!). Our bet is that in this way,
we can run all links at low enough utilization so that the buffer
limitations never matter. And should those buffers overflow one day,
that will be a good opportunity to try RED on these boxes...
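For a feel of the numbers behind the "RTT * interface speed" rule of
thumb quoted above, here is a small sketch of the arithmetic, reading
the rule as a buffer size in bytes. The function names are made up
for illustration, and any RTT or link speed plugged in is an assumed
figure, not a measurement from a real network.

```python
def buffer_bytes(rtt_ms, link_bps):
    """Buffer size in bytes that holds one RTT's worth of line-rate traffic."""
    return rtt_ms / 1000.0 * link_bps / 8.0


def drain_time_ms(buffer_size_bytes, link_bps):
    """Time in milliseconds to drain a full buffer at line rate.

    This is also the worst-case queueing delay the buffer can add.
    """
    return buffer_size_bytes * 8.0 / link_bps * 1000.0
```

With an assumed 150 ms RTT and a 10 Gb/s interface,
buffer_bytes(150, 10e9) comes to about 187.5 MB per interface, and a
full buffer of that size adds another 150 ms of delay. That is why
such a buffer should normally run empty, and why a box with much
shallower buffers only works if the links stay lightly loaded.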
> I'm a bit out of touch with the latest facts on most widely deployed
> host TCP stack, so I won't comment on the state of end systems' TCP
> implementation in this note.
Linux 2.4 and later(?) supports ECN. It used to be on by default, but
in later kernel releases I think the default changed to "off" because
of issues with evil middleboxes that blackhole ECN (because some
firewall or load balancer hates unknown TCP header flags, see the "ECN Hall of
Shame" at http://urchin.earth.li/ecn/).
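To see what a given Linux box is actually doing, one can read the
net.ipv4.tcp_ecn sysctl. The sketch below assumes the usual /proc
path and the value meanings from the kernel's ip-sysctl
documentation (0 and 1 are long-standing, 2 only exists on newer
kernels); the function name and the wording of the messages are my
own.

```python
from pathlib import Path

# Usual location of the sysctl on Linux; other systems will not have it.
TCP_ECN_SYSCTL = Path("/proc/sys/net/ipv4/tcp_ecn")

# Meanings as described in the kernel's ip-sysctl documentation.
MEANINGS = {
    0: "ECN disabled",
    1: "ECN requested on outgoing connections and accepted on incoming ones",
    2: "ECN accepted on incoming connections but not requested on outgoing ones",
}


def describe_tcp_ecn(path=TCP_ECN_SYSCTL):
    """Return a human-readable description of the tcp_ecn setting."""
    try:
        raw = Path(path).read_text().strip()
    except OSError as exc:
        return f"cannot read {path}: {exc.strerror}"
    try:
        value = int(raw)
    except ValueError:
        return f"unexpected content {raw!r}"
    return MEANINGS.get(value, f"unknown setting {value}")


if __name__ == "__main__":
    print(describe_tcp_ecn())
```

Setting the value to 1 is what exposes a host to the middlebox
blackholes mentioned above, since only then does it put the ECN
flags into its outgoing SYNs.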
Solaris 9 supports ECN, defaults to off ("passive", actually).
There is a suggested modification (RFC3168) to make ECN
blackhole-resistant, but I'm pretty sure both Linux and Solaris 9
still implement the earlier RFC2481 which doesn't handle blackholes.
For other OSes, there's some information on
http://www.icir.org/floyd/ecn.html - no mention of Windows or MacOS
though.
--
Simon.