[e2e] Agility of RTO Estimates, stability, vulnerabilities
Sireen Habib Malik
s.malik at tuhh.de
Tue Jul 26 06:58:16 PDT 2005
Hi,
A technical discussion on heavy-tailed distributions is perhaps not
relevant to this list; however, one gets the impression that these
distributions are being dismissed as irrelevant or unsuitable from the
Internet's point of view.
> "Note that the load distribution cannot be characterized by a stable a
> priori description, because load is itself responsive at all
> timescales to behavior of humans (users, app designers, cable plant
> investors, pricing specialists, arbitrage experts, criminal hackers,
> terrorists, network installers, e-commerce sites,
> etc.)".............................and......."you are fooling yourself
> if you start with a simple a priori model, even if that model passes
> so-called "peer review" (also called a folie a deux - mutually
> reinforcing hallucinations about reality) and becomes the common
> problem statement for a generation of graduate students doing network
> theory. In my era, the theorists all assumed that Poisson arrival
> processes were sufficient. These days, "heavy tails" are assumed to
> be correct. Beware - there's much truth and value, but also a deep
> and profound lie, in such assertions and conventional wisdoms. "
From a network's point of view, the users (of all kinds) generate data
- let us call this their on-phase. After downloading/generating data,
they go into a thinking- or reading-phase - the off-phase. Each user
remains in the on- and off-phases for randomly distributed times and
cycles through this on-off behavior. This is the starting point of at
least one way of modeling Internet traffic.
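A minimal Python sketch of such an on-off source, assuming
Pareto-distributed on- and off-durations (the tail indices and all other
numbers are illustrative only, not taken from any measurement):

import random

def on_off_trace(total_time, alpha_on=1.4, alpha_off=1.4):
    """One user's on/off phases with Pareto-distributed durations."""
    t, state, trace = 0.0, "on", []
    while t < total_time:
        alpha = alpha_on if state == "on" else alpha_off
        duration = random.paretovariate(alpha)   # heavy-tailed phase length, >= 1
        trace.append((t, state, duration))
        t += duration
        state = "off" if state == "on" else "on"
    return trace

if __name__ == "__main__":
    for start, state, dur in on_off_trace(100.0)[:8]:
        print("t=%7.2f  %-3s for %.2f" % (start, state, dur))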
The Poisson arrival process assumption at the session level is still
okay, but the data - the files - that these arrivals cause to flow
through the net are heavy-tailed in size. This assumption is sound
because empirical studies have shown it, time and again.
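As a hedged sketch of that session-level picture (rate, tail index, and
mean file size below are made-up numbers): sessions arrive with
exponential inter-arrival times, each bringing a Pareto-sized file, so a
handful of huge files dominates the total volume.

import random

def sessions(n, rate=1.0, alpha=1.2, mean_file=10.0):
    """Poisson session arrivals, each carrying a Pareto-distributed file size."""
    xm = mean_file * (alpha - 1.0) / alpha         # scale so E[size] = mean_file (needs alpha > 1)
    t = 0.0
    for _ in range(n):
        t += random.expovariate(rate)              # Poisson arrival process
        yield t, xm * random.paretovariate(alpha)  # heavy-tailed file size

if __name__ == "__main__":
    sizes = sorted(s for _, s in sessions(100000))
    print("mean size  :", sum(sizes) / len(sizes))
    print("median size:", sizes[len(sizes) // 2])
    print("max size   :", sizes[-1])               # typically orders of magnitude above the median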
There is a proof showing that if either, or both, of the on- and
off-times of an on-off source are heavy-tailed, then the resulting
aggregate traffic is long-range dependent (LRD). Simply put, LRD is
tied to the large/infinite variance of heavy-tailed distributions.
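A rough numerical check of that result (when the on/off times are Pareto
with tail index alpha in (1,2), the limit theorem gives a Hurst
parameter of H = (3 - alpha)/2; the aggregated-variance estimator and
all parameters below are only illustrative):

import math, random

def on_off_activity(slots, alpha=1.4):
    """0/1 activity per time slot for one source with Pareto on/off durations."""
    out, state = [], 1
    while len(out) < slots:
        dur = int(math.ceil(random.paretovariate(alpha)))
        out.extend([state] * min(dur, slots - len(out)))
        state = 1 - state
    return out

def hurst_aggregated_variance(x, block_sizes=(1, 4, 16, 64, 256)):
    """Estimate H from the slope of log(var of block means) vs log(block size)."""
    pts = []
    for m in block_sizes:
        blocks = [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]
        mean = sum(blocks) / len(blocks)
        var = sum((b - mean) ** 2 for b in blocks) / len(blocks)
        pts.append((math.log(m), math.log(var)))
    n = len(pts)
    sx = sum(p[0] for p in pts); sy = sum(p[1] for p in pts)
    sxx = sum(p[0] ** 2 for p in pts); sxy = sum(p[0] * p[1] for p in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope = 2H - 2
    return 1.0 + slope / 2.0

if __name__ == "__main__":
    slots, sources, alpha = 20000, 50, 1.4
    traces = [on_off_activity(slots, alpha) for _ in range(sources)]
    aggregate = [sum(tr[i] for tr in traces) for i in range(slots)]
    print("estimated H ~ %.2f (theory: %.2f)" %
          (hurst_aggregated_variance(aggregate), (3 - alpha) / 2))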
Now the situation gets more complicated when we consider that the
packet generation process within the heavy-tailed on-phase is not
Poissonian, but rather controlled by TCP. The protocol introduces
additional burstiness at small time scales; this is also known as
multifractality.
Therefore, the two most significant factors from Internet traffic's
point of view are the heavy-tailed distribution of file sizes and the
congestion control mechanism of TCP.
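A crude toy (not a faithful TCP model) just to illustrate the
small-timescale effect: within an on-phase the sender emits packets in
per-RTT bursts whose size follows an additive-increase/multiplicative-
decrease window, rather than as a smooth stream. The loss probability
and window dynamics below are invented for illustration.

import random

def aimd_bursts(rtts=50, loss_prob=0.05, cwnd=1.0):
    """Packets sent per RTT under a toy AIMD congestion window."""
    sent = []
    for _ in range(rtts):
        sent.append(int(cwnd))             # burst of cwnd packets in this RTT
        if random.random() < loss_prob:
            cwnd = max(1.0, cwnd / 2.0)    # multiplicative decrease on loss
        else:
            cwnd += 1.0                    # additive increase otherwise
    return sent

if __name__ == "__main__":
    print(aimd_bursts())   # burst sizes vary strongly from one RTT to the next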
Please note: even if the variance in the real world is not infinite,
and the LRD is visible only over some range of time scales, the queue
performance is still significantly different from the one predicted by
the simple assumption of a Poissonian renewal arrival process (of
packets on the line).
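A quick-and-dirty discrete-time queue sketch of that point, feeding the
same mean load to the same unit-rate server once as light-tailed
(exponential) work per slot and once as occasional Pareto-sized file
arrivals; all numbers are illustrative and only the qualitative gap in
the backlog tail matters:

import random

def backlog_percentile(workload, pct=0.99, service=1.0):
    """High-percentile backlog of a work-conserving queue with unit service per slot."""
    q, hist = 0.0, []
    for w in workload:
        q = max(0.0, q + w - service)
        hist.append(q)
    hist.sort()
    return hist[int(pct * len(hist))]

def light_tailed_load(slots, load=0.8):
    # exponentially distributed work per slot, mean 'load' (light-tailed baseline)
    return [random.expovariate(1.0 / load) for _ in range(slots)]

def heavy_tailed_load(slots, load=0.8, alpha=1.3, p_arrival=0.05):
    # rare file arrivals with Pareto sizes, scaled to the same mean load per slot
    mean_size = load / p_arrival
    xm = mean_size * (alpha - 1.0) / alpha
    return [xm * random.paretovariate(alpha) if random.random() < p_arrival else 0.0
            for _ in range(slots)]

if __name__ == "__main__":
    slots = 200000
    print("99th pct backlog, light-tailed:", backlog_percentile(light_tailed_load(slots)))
    print("99th pct backlog, heavy-tailed:", backlog_percentile(heavy_tailed_load(slots)))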
Side note: about 90% of Internet traffic is carried by TCP. The
small-flows model holds for web traffic; the long-flows model is
relevant to P2P downloads. Traffic measurements show that P2P traffic
now makes up almost 50% (or perhaps more) of TCP traffic. See the
Sprint website for traffic traces and analysis.
> Those of you who understand the profound difference between Bayesian
> and Classical statistical inference will understand ...
!!!
Sireen Malik
Hamburg University of Technology, Germany