[e2e] Question the other way round:
Jon Crowcroft
jon.crowcroft at cl.cam.ac.uk
Fri Nov 22 01:06:24 PST 2013
actually, tcp does precisely that (in the absence of smart virtual queues +
ecn), and the deployment of DCTCP in data centers (which adds ECN and
virtual queues, and modifies the congestion window evolution of TCP) is
precisely because incast (and other problems in big data center
computations) are not rare events, but are provoked almost pathologically
by applications' traffic patterns...
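for the window-evolution part, the DCTCP sender keeps a running estimate of
the fraction of ECN-marked packets per window and backs off in proportion to
it, rather than halving on any sign of congestion; a minimal sketch of that
update (the real algorithm is specified in RFC 8257 - the class and the
per-window callback below are purely illustrative):

    # minimal sketch of a DCTCP-style sender update (see RFC 8257 for the real
    # algorithm); the class and the per-window callback are illustrative only
    G = 1.0 / 16                  # gain for the marking estimate (RFC 8257 default)

    class DctcpSender:
        def __init__(self, cwnd=10.0):
            self.cwnd = cwnd      # congestion window, in packets
            self.alpha = 0.0      # running estimate of the fraction of marked packets

        def on_window_acked(self, acked, marked):
            """called once per window: `acked` packets acked, `marked` of them ECN-marked"""
            frac = marked / acked if acked else 0.0
            # estimate the *extent* of congestion, not just its presence
            self.alpha = (1 - G) * self.alpha + G * frac
            if marked:
                # back off in proportion to how much of the window was marked...
                self.cwnd = max(1.0, self.cwnd * (1 - self.alpha / 2))
            else:
                # ...otherwise grow roughly like ordinary congestion avoidance
                self.cwnd += 1.0

the proportional back-off keeps switch queues short, which is what leaves
enough headroom in shallow data-center buffers for incast bursts.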
yes, we would like to design networks to make such things rare and a lot of
the topology and capacity planning in networks tries to do this, but a) the
evolution of applications is faster than even the best laid(*)
update/upgrade plans of the best data center and intranet planners (and
that was the POINT of making the internet an open platform for fast
innovation) and b) the internet-at-large is not planned - it's an evolved
thing of a wibbly wobbly organic kind (cue 50th anniversary dr who music)
having spent the last couple of years staring at a few real-world data
center traffic traces, I wonder if they aren't also unplanned (slightly
pregnant pause)....after all "azure" == blue sky right ? :-)
cheers
jon
to paraphrase spike milligna, "a plan so cunningly laid that no matter
where you stood, it got under your feet"
On Thu, Nov 21, 2013 at 4:21 PM, <dpreed at reed.com> wrote:
> (please forward, Joe, if this is OK)
>
>
>
> We don't actually cause congestion to discover the rate, Jon. Typically,
> we try to build networks that have adequate capacity (factors of 10 or 100
> are needed for things like the "Mother's Day" effect, or a 9/11-scale
> community need to spread and filter news quickly).
>
>
>
> We encounter congestion rarely - and we fix it by building in "factors of
> safety" in every portion of an underlying network.
>
>
>
> Only Ph.D. theses spend an enormous amount of effort on the totally
> congested "corner cases". It's like a little puzzle that is easy to state,
> easy to solve, and makes the solver work hard. It's kind of like a "rite
> of passage", so that is good, I guess.
>
>
>
> But if you are building a datacenter (AWS) or an access network or a
> transport network, you build for the worst case, and expect it to happen
> rarely. The systems that depend on the network to actually work for
> people's needs never want a congested network, and don't actually want the
> network to operate at its local minimum cost/bit/sec. They want the
> network to never be in the way, and the cost they really care about is the
> cost of getting congested for the wrong reasons.
>
>
> On Thursday, November 21, 2013 2:29am, "Jon Crowcroft" <
> Jon.Crowcroft at cl.cam.ac.uk> said:
>
> > i think we're mixing up two discussions here
> >
> > 1. congestion was the original cause of the cwnd mech in tcp, BUT the
> > rate adaptation using feedback as a way to do distributed resource
> > allocation is the solution of the optimisation problem of net + user
> > addressed by several researchers (kelly/voice et al, also folks at
> > caltech) - these aren't the same thing - they got conflated in
> > protocols in practice because we couldn't get ECN out there completely
> > (yet) - ECN (when implemented with some decent queue (see 3 below)) can
> > be part of an efficient decentralised rate allocation
> >
> > congestion is bad - avoiding it is good
> >
> > distributed rate allocation for flows
> > that have increasing utility for higher rate transfer
> > is also good (actually it's betterer:)
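a toy sketch of that optimisation view, kelly primal-algorithm style: each
source adjusts its rate using nothing but a per-path "price" fed back from
the links (think ECN marking rate), and the fixed point approximates a
proportionally fair split; the two-link topology, log utilities, price
function and step size below are invented for illustration, not taken from
any of the papers above:

    # toy kelly-style primal algorithm: each source maximises w*log(x) using only
    # the price fed back along its own path; topology and constants are made up
    CAPACITY = {"l1": 10.0, "l2": 10.0}
    ROUTES = {"A": ["l1"], "B": ["l2"], "C": ["l1", "l2"]}   # flow C crosses both links

    def price(load, cap):
        # link "price" (think ECN marking rate) ramping up as load nears capacity
        return 5.0 * max(0.0, load / cap - 0.8)

    def run(steps=2000, kappa=0.05, w=1.0):
        x = {f: 1.0 for f in ROUTES}                         # per-flow rates
        for _ in range(steps):
            load = {l: sum(x[f] for f in ROUTES if l in ROUTES[f]) for l in CAPACITY}
            for f, links in ROUTES.items():
                q = sum(price(load[l], CAPACITY[l]) for l in links)
                # kelly primal step: dx/dt = kappa * (w - x * path_price)
                x[f] = max(0.01, x[f] + kappa * (w - q * x[f]))
        return x

    print(run())   # the two-hop flow C ends up with less than A or B: proportional fairness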
> >
> > 2. for flows that have an a priori known rate, distributed rate
> > allocation is a daft idea, a priori - so some sort of admission
> > control for the flow seems better (but you can do probe/measurement
> > based admission control if you like, and are allergic to complex
> > signaling protocols)
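a minimal sketch of the measurement-based flavour: admit a new fixed-rate
flow only if a smoothed estimate of current load plus the flow's declared
rate still fits under some headroom; the EWMA gain and the 90% headroom are
arbitrary illustrative choices, not any particular published scheme:

    # toy measurement-based admission control for fixed-rate flows; the admission
    # test and the EWMA smoothing are illustrative, not a specific published scheme
    class Link:
        def __init__(self, capacity_mbps, headroom=0.9, gain=0.25):
            self.capacity = capacity_mbps
            self.headroom = headroom        # keep 10% slack rather than run flat out
            self.gain = gain
            self.measured_load = 0.0        # EWMA of recently observed load (Mb/s)

        def observe(self, sampled_load_mbps):
            """fold a fresh load measurement into the running estimate"""
            self.measured_load = ((1 - self.gain) * self.measured_load
                                  + self.gain * sampled_load_mbps)

        def admit(self, declared_rate_mbps):
            """admit a new fixed-rate flow only if measured load + its rate fits"""
            return (self.measured_load + declared_rate_mbps
                    <= self.headroom * self.capacity)

    link = Link(capacity_mbps=100)
    for sample in (20, 25, 30, 28):
        link.observe(sample)
    print(link.admit(10))   # True  - fits comfortably
    print(link.admit(80))   # False - would push the link past its headroom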
> >
> > 3. orthogonal to both 1&2 is policing and fairness - flow state means you
> > can do somewhat better in fairness for 1 (e.g. do fair queueing, a la
> > keshav), and a lot better for policing for 2...
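as one concrete flavour of per-flow fairness, a toy deficit round robin
sketch (Shreedhar & Varghese's scheme rather than keshav's fair queueing,
chosen only because it fits in a few lines): each backlogged flow gets a
quantum of credit per round, so a thin flow is not starved by a greedy one:

    # toy deficit round robin scheduler; flow names and quantum are illustrative
    from collections import deque

    class DRRScheduler:
        def __init__(self, quantum=1500):
            self.quantum = quantum            # bytes of credit added per flow per round
            self.queues = {}                  # flow id -> deque of packet sizes (bytes)
            self.deficit = {}                 # flow id -> accumulated sending credit

        def enqueue(self, flow, size):
            self.queues.setdefault(flow, deque()).append(size)
            self.deficit.setdefault(flow, 0)

        def dequeue_round(self):
            """one round-robin pass: each backlogged flow may send up to its deficit"""
            sent = []
            for flow, q in self.queues.items():
                if not q:
                    continue
                self.deficit[flow] += self.quantum
                while q and q[0] <= self.deficit[flow]:
                    self.deficit[flow] -= q.popleft()
                    sent.append(flow)
                if not q:
                    self.deficit[flow] = 0    # idle flows don't hoard credit
            return sent

    s = DRRScheduler()
    for _ in range(4):
        s.enqueue("bulk", 1500)               # a greedy flow with a full queue
    s.enqueue("thin", 100)                    # a small flow still gets served each round
    print(s.dequeue_round())                  # ['bulk', 'thin']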
> >
> > but then we've been round the best effort, integrated service,
> > differentiated service, core stateless fair queueing, probe-based
> > admission control, ecn, pcn loop about 6 times since this list
> > existed:)
> >
> > yes, to detlef's original point, causing congestion (and buffer
> > overrun) to find out the rate is a bit of a sad story...
> >
> > In missive <528CFE15.7070808 at isi.edu>, Joe Touch typed:
> >
> > >>
> > >>
> > >>On 11/19/2013 7:14 PM, Ted Faber wrote:
> > >>> On 11/19/2013 10:15, Joe Touch wrote:
> > >>>> On 11/19/2013 10:09 AM, Dave Crocker wrote:
> > >>>>> Given the complete generality of the question that was asked, is
> > >>>>> there something fundamentally deficient in the answer in:
> > >>>>>
> > >>>>> http://en.wikipedia.org/wiki/Congestion_control#Congestion_control
> > >>>>>
> > >>>>> ?
> > >>>>>
> > >>>>> In particular, I think its opening sentence is quite reasonable.
> > >>>>
> > >>>> I agree, but it jumps in assuming packets. Given packets, it's easy
> > >>>> to assume that oversubscription is the natural consequence of
> > >>>> avoiding congestion.
> > >>>
> > >>> Unless someone's edited it, you should read the first sentence
> > >>> again. I see:
> > >>>
> > >>>> Congestion control concerns controlling traffic entry into a
> > >>>> telecommunications network, so as to avoid congestive collapse by
> > >>>> attempting to avoid oversubscription of any of the processing or
> > >>>> link capabilities of the intermediate nodes and networks and taking
> > >>>> resource reducing steps, such as reducing the rate of sending
> > >>>> packets.
> > >>>
> > >>> I read the reference to packets as an example.
> > >>
> > >>Me too.
> > >>
> > >>But circuits don't have a collapse or oversubscription. They simply
> > >>reject calls that aren't compatible with available capacity.
> > >>
> > >>I'm not disagreeing with the definition; I'm disagreeing with the
> > >>assumption that having a network implies congestion and thus the need
> > >>for congestion control.
> > >>
> > >>There are a variety of mechanisms that avoid congestion, typically by
> > >>a priori reservation (circuits), or by limiting resource use implicitly
> > >>(e.g., ischemic control). These are a kind of proactive control that
> > >>avoids congestion in the first place.
> > >>
> > >>That's not to say whether these mechanisms are scalable or efficient
> > >>compared to the resource sharing afforded by packet multiplexing.
> > >>
> > >>Joe
> >
> > cheers
> >
> > jon
> >
> >
>