[e2e] local recovery or not local recovery, was: Re: Satellite networks latency and data corruption
Detlef Bosau
detlef.bosau at web.de
Tue Jul 5 05:58:38 PDT 2005
alok wrote:
> Hi,
>
> What I want to know is:
>
> (a)
> If the retransmission/ARQ is entirely offloaded to the end transmitter
> and receiver (say my PC and your PC if we are doing a peer to peer),
>
> versus
>
> (b)
> each transmitter and receiver pair on intermediate hop does the same,
This is the one-million-dollar question.
O.k. First of all, the standard reference on this matter is:
Saltzer, Reed, Clark: End-To-End Arguments in System Design. ACM
Transactions on Computer Systems 2(4), Nov. 1984, pp. 277-288.
(Hopefully the reference is correct; I think at least title and authors are.)
However, "commonly accepted general truths" are similar to the Bible.
Each believer tells you the Bible is true. Ask two believers and you
will hear three truths ;-)
Basically, your question meets _exactly_ the point of local recovery in
mobile wireless networks (you guessed correctly, it's me again ;-)),
however I'm not sure whether the resulting design decision is the same.
Some helpful criteria might be:
1.: User perspective: What is the _goodput_ from the user's point of view?
2.: Fairness perspective: Does a user unduly waste network resources?
3.: System perspective: Where can error recovery be done most cheaply?
Let's start with 3.: Where can error recovery be done most cheaply?
Let me give two propositions, and please correct me here because I do not
know the precise values. But as far as I remember, on
backbone routers / switching systems we have
- about 30000 (3e4) active TCP flows per 100 Mbit/s capacity,
- about 100 ns available processing time for a packet on a router. (A
colleague told me about this some years ago. I personally think this is
rather dated. I personally would expect 10 ns or even 1 ns.)
For these reasons, the general wisdom is to put complexity on the end
systems if possible.
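
To make the per-packet budget a bit more tangible, here is a rough
sketch. It assumes that a router has to keep up with line rate, so the
time available per packet is at most the time the packet occupies the
link. The link rates and packet sizes are my own illustrative
assumptions, not measured values, and packet_time_ns is a hypothetical
helper name.

```python
def packet_time_ns(link_rate_bps: float, packet_bytes: int) -> float:
    """Time one packet occupies a link of the given rate, in nanoseconds."""
    return packet_bytes * 8 / link_rate_bps * 1e9


# Hypothetical link rates and packet sizes, chosen only for illustration.
RATES = [(100e6, "100 Mbit/s"), (1e9, "1 Gbit/s"), (10e9, "10 Gbit/s")]
SIZES = (40, 1500)  # roughly a bare TCP ACK and a full Ethernet-sized packet

for rate_bps, label in RATES:
    for size in SIZES:
        budget = packet_time_ns(rate_bps, size)
        print(f"{label:>10}, {size:4d} bytes: {budget:10.1f} ns per packet")
```

The budget shrinks linearly with the link rate, which is why any
per-packet work beyond forwarding becomes expensive on fast links.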
This is perhaps a problem for CETEN, where a correct
implementation requires a floating point calculation for each IP datagram.
(Perhaps, one can improve the algorithm in this respect.)
To illustrate the importance of this matter, please consider the IPv6
header: Because there was no compelling reason for spending this
processing effort, the header checksum was _left_ _out_!
For 3G networks, my position is that the gateways between the Internet and
the mobile network are typically quite large computer systems, each one
serving a few hundred flows. In this case, the effort is acceptable.
In satellite networks: I don't know. In particular, the state variables
for ARQ in high bandwidth systems may turn out to be unacceptably large.
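
A rough sketch of why the ARQ state can grow large: a sender doing ARQ
has to keep every unacknowledged packet until the acknowledgement
arrives, so the buffer is about link rate times round trip time. The
round trip time of 0.5 s (roughly a geostationary link), the link rate
and the packet size below are assumptions for illustration only, and
the function names are hypothetical.

```python
import math


def arq_buffer_bytes(link_rate_bps: float, rtt_s: float) -> float:
    """Bytes in flight that an ARQ sender must hold until acknowledged."""
    return link_rate_bps * rtt_s / 8


def outstanding_packets(link_rate_bps: float, rtt_s: float, packet_bytes: int) -> int:
    """Number of packets (rounded up) for which per-packet state is needed."""
    return math.ceil(arq_buffer_bytes(link_rate_bps, rtt_s) / packet_bytes)


# Assumed values: 100 Mbit/s link, 0.5 s round trip, 1500 byte packets.
rate_bps, rtt_s, size = 100e6, 0.5, 1500
print(f"buffer:  {arq_buffer_bytes(rate_bps, rtt_s) / 1e6:.2f} MB")
print(f"packets: {outstanding_packets(rate_bps, rtt_s, size)}")
```

Both numbers scale linearly with the link rate, so an intermediate
system on a fast satellite link would have to hold considerably more
state than a 3G gateway.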
2.: Fairness:
If ARQ is placed on the end systems, the whole network path "enjoys"
the necessary retransmissions. In particular, when a packet must be sent 100
times or more to be successfully received at least once, it may increase
the network performance to place ARQ on intermediate systems.
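
The effect can be sketched with a simple model. The assumptions are
mine: packet losses are independent per hop, acknowledgements are
never lost, and every hop has a loss probability below 1. The path of
ten hops with one very lossy last hop is purely hypothetical.

```python
from math import prod


def e2e_link_transmissions(loss):
    """Expected transmissions on each hop when only the end systems do ARQ.

    A packet crosses hop i once for every attempt that survived the hops
    before it, and an attempt only ends the retransmissions if hop i and
    all hops behind it succeed.
    """
    return [1 / prod(1 - p for p in loss[i:]) for i in range(len(loss))]


def hop_link_transmissions(loss):
    """Expected transmissions on each hop when every hop does its own ARQ."""
    return [1 / (1 - p) for p in loss]


# Hypothetical path: nine loss-free wired hops, then one hop losing 99 %.
path = [0.0] * 9 + [0.99]
print(f"end-to-end ARQ: {sum(e2e_link_transmissions(path)):.0f} link transmissions")
print(f"hop-by-hop ARQ: {sum(hop_link_transmissions(path)):.0f} link transmissions")
```

With end-to-end ARQ every hop in front of the lossy one carries all
the retransmissions; with local recovery only the lossy hop does.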
Once again on 3G networks: Typically, 3G networks are only used as an
access line. So the major part of the path typically resides in the
wired Internet. Therefore, it makes sense not to bother the
Internet with retransmissions. Moreover, ARQ in 3G networks is done at
radio block level, which is more efficient than ARQ at packet level.
However, in satellite networks, I can imagine that the bottleneck is
really the satellite link itself. In that case, it would make only a
minor difference whether ARQ is placed on the intermediate systems (IS)
or on the end systems (ES).
1.: User perspective:
How long does it take for a packet to be delivered?
Again: On a 3G network, the major part of the transmission time is spent on the
Internet, in case a _RAW_ channel _WITHOUT_ ARQ/RLP is used.
Let's consider a latency of 50 ms and 100 transmissions; then a user will
see 5 s STT latency for a packet.
When the same packet could be successfully delivered via RLP and STT
would be increased by 100 ms for that reason, STT would be 150 ms. This
is less than 5 s, and this is preferable to the user.
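
The arithmetic of this example, as a minimal sketch. It deliberately
ignores retransmission timeouts and queueing, so the end-to-end figure
is optimistic if anything. The function names are hypothetical.

```python
def e2e_delivery_time_ms(path_latency_ms: int, transmissions: int) -> int:
    """Every attempt crosses the whole path again before one succeeds."""
    return path_latency_ms * transmissions


def local_recovery_delivery_time_ms(path_latency_ms: int, added_delay_ms: int) -> int:
    """The path is crossed once; the lossy link adds its local recovery delay."""
    return path_latency_ms + added_delay_ms


# Values from the example above: 50 ms latency, 100 transmissions,
# and 100 ms additional delay caused by RLP.
print(e2e_delivery_time_ms(50, 100), "ms without local recovery")
print(local_recovery_delivery_time_ms(50, 100), "ms with local recovery")
```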
Satellite networks: Here the major time is spent on the satellite link.
In summary, I'm not quite sure, but I can imagine that in satellite
networks error recovery is left to the end systems. I think the error
recovery effort for IS can turn out unduly high with not that much
benefit for fairness and user.
Basically, high costs (3.) are an argument for (a); utilization and good
user performance (2., 1.) are an argument for (b).
It is a tradeoff.
>
> How is (a) different from (b) in terms of effective utilization?
> Obviously it is true if an end point A is talking to B and C :
This is mainly covered by 2. Fairness.
Of course, the utilization of a link decreases if it is filled with
nothing but retransmissions.
I think the consideration can turn out quite different, depending on
the actual scenario: E.g. a satellite mobile phone could be attached to
the Internet. Or a satellite link could be used for Internet backbone
connections, perhaps weather-dependent, in combination with a fibre link.
As you see, I cannot offer a real "answer" here. My intention is to draw
attention to the question.
I've got the impression that there are typically strong objections
against doing local recovery in the TCP community. Although RLP has been
in practical use in mobile networks for about a decade or more now, I
frequently see the position that TCP should be run on raw e2e networks
without any local recovery support.
Perhaps, this impression is wrong. However: I think the decision is not
easy to make.
DB
--
Detlef Bosau
Galileistrasse 30
70565 Stuttgart
Mail: detlef.bosau at web.de
Web: http://www.detlef-bosau.de
Mobile: +49 172 681 9937