[e2e] Some thoughts on WLAN etc., was: Re: RES: Why Buffering?

Mon Jul 6 11:54:46 PDT 2009

David P. Reed wrote:
> I think you are confusing link troubles with "network" troubles.  (and 
> WLAN's are just multi-user links, pretty much).
No way. Did I use the word "trouble" even once? (No kidding.)

> Part of the architecture of some link layers is a "feature" that is 
> designed to declare the link "down" by some kind of measure.

O.k., then this feature is, at least in some sense, a "declaration".

>   Now this is clearly a compromise - the link in most cases is only 
> temporarily "down", depending on the tolerance of end-to-end apps for 
> delay.

For delay and, of course, for reliability. However, that's a minor point.

At least the term "temporarily" is quite sensitive here. When a link is 
"temporarily down", one would expect the link to be "up again" at some 
time. Particularly in the literature on opportunistic scheduling you
will always find the attitude (if unspoken): "When the link is bad -
we'll send it later, when the link is better." _Will_ a bad link ever be 
better?

I'm not thinking of troubles. Actually, this question reflects my 
personal life experience. How often did I waste years of my life just
waiting for something to become better? (Perhaps this kind of thinking
is some kind of midlife crisis... Maybe. But if the consequence is to
stop waiting and to seize the day instead, then this is the right
consequence.)

When there is evidence that bad link conditions will be overcome in
some realistic period of time, it makes sense to defer sending. No
discussion about that. But what should be done in cases where such
evidence doesn't exist?

When a wireline link is "down", this may mean different things:

1. The cabling is broken. No discussion, the cabling must be fixed and 
if possible, an alternative routing should be chosen until the problem 
is fixed.

2. Power supply, Router, Switch, Hub or any other malware made of steel, 
copper and plastic is broken. Same discussion as 1.

3. Referring to the sense of "temporarily down": Routing protocols may
assess load metrics or other cost metrics - and may declare a link
"down", i.e. not feasible, when the load becomes too high or the link
becomes too expensive.

However, I would ask the same question here: Is there any evidence that
the load / costs /... will drop again at some time?

Of course, these are two different scenarios. i) "Permanently down until
some technical problem is fixed." ii) "Temporarily down, until the link
becomes feasible again." In both cases, I silently assumed that an
alternative route exists. Which may not be the case e.g. in WWAN
scenarios.

> In an 802.11* LAN (using standard WiFi MAC protocols), there is one of
> these "declared down", whether you are using APs or Virtual LANs or
> AdHoc mode (so called) or even 802.11s mesh.    Since 802.11 doesn't 
> take input from the IETF, it has no notion of meeting the needs of 
> end-to-end protocols for a *useful* declaration.  Instead, by studying 
> their navels, the 802.11 guys wait a really long time (and therefore
> relatively useless in terms of semantics) before declaring a WLAN
> "link" down.  Of course that is a "win" if your goal is just managing 
> the link layer.

Hm. One thing I learned from my CS colleagues is that in CS we have a
quite strong "top down view", i.e. it would be helpful to respect the
application layer's needs for a "link down declaration". And these may
well depend on the scenario.
I well remember an RFC draft submitted by Martina Zitterbart et al. some 
years ago which proposed a "limited effort class" in the context of QoS 
and DiffServ. Just to mention an example that users don't necessarily 
require a high throughput link. So, in my opinion, a "useful 
declaration" for "link declared down" should respect the user's 
requirements.

Perhaps it would even be possible to use a wireless link for those
applications for which the minimum requirements for a link are met, and
not to use the same link for other applications.
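
As a rough sketch of this idea, one could compare each application's
minimum requirements against the currently measured link properties.
Everything here is hypothetical and not taken from any standard: the
class names, the two metrics (loss ratio and delay) and the example
numbers are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkState:
    loss_ratio: float  # measured packet corruption ratio, 0..1 (assumed metric)
    delay_s: float     # measured delay in seconds (assumed metric)


@dataclass
class AppRequirement:
    name: str
    max_loss_ratio: float  # worst loss ratio the application tolerates
    max_delay_s: float     # worst delay the application tolerates


def usable_for(link: LinkState, app: AppRequirement) -> bool:
    """True if the link meets this application's minimum requirements."""
    return (link.loss_ratio <= app.max_loss_ratio
            and link.delay_s <= app.max_delay_s)


def admitted_apps(link: LinkState, apps: list) -> list:
    """Names of the applications for which the link is still 'up'."""
    return [app.name for app in apps if usable_for(link, app)]


# Hypothetical example: a lossy, slow link is fine for bulk transfer
# (think of a "limited effort" class) but not for interactive voice.
link = LinkState(loss_ratio=0.05, delay_s=0.4)
apps = [
    AppRequirement("bulk", max_loss_ratio=0.2, max_delay_s=5.0),
    AppRequirement("voice", max_loss_ratio=0.01, max_delay_s=0.15),
]
print(admitted_apps(link, apps))
```

In this view the same link is "down" for one application and "up" for
another, so the declaration depends on the user's requirements rather
than on the link alone.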

>
> What would be useful to the end-to-end protocol is a meaningful 
> assessment of the likelihood that a packet will be deliverable over 
> that link as a function of time as it goes into the future.   

Indeed. However, I'm not quite sure if this is possible. I have been
looking for something like this for some years now; however, the more I
understand wireless networks, the less optimistic I become in this
respect.

> This would let the end-to-end protocol decide whether to tear down the 
> TCP circuit and inform the app, or just wait, if the app is not delay 
> sensitive in the time frame of interest.
>

Of course. However, this is one example of the common principle
"prediction is hard, especially of the future". (Credited to numerous
people.)

> Unfortunately, TCP's default is typically 30 seconds long - far too 
> long for a typical interactive app.  

In some GPRS standard, even latencies of 600 seconds were accepted. This 
may be quite long for quite a few apps.

However: 30 seconds, even 0.30 seconds, are "ages" in the context of
wireless networks. I don't know any reasonable CE guy who would make a
forecast for a wireless channel's quality, or a packet corruption ratio,
for 30 seconds, except in lab scenarios.

> And in some ways that's right: an app can implement a shorter-term "is 
> the link alive" by merely using an app layer handshake at a relevant 
> rate, and declaring the e2e circuit down if too many handshakes are 
> not delivered.  If you think about it, this is probably optimal, 
> because otherwise the end-to-end app will have to have a language to 
> express its desire to every possible link along the way, and also to 
> the "rerouting" algorithms that might preserve end-to-end connectivity 
> by "routing around" the slow or intermittent link.
>
> Recognize the "end to end argument" in that last paragraph?   It says: 
> we can't put the function of determining "app layer circuit down" into 
> the different kinds of elements that make up the Internet links.  

Absolutely.
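
Such an application layer "is the link alive" check might be sketched as
follows. This is only an illustration: the class name, the threshold of
three consecutive misses and the rule that one delivered handshake marks
the circuit as up again are assumptions, not part of any protocol.

```python
class HandshakeMonitor:
    """Declares an e2e circuit down after too many missed handshakes."""

    def __init__(self, max_missed: int = 3):
        self.max_missed = max_missed  # assumed threshold, chosen by the app
        self.missed = 0               # consecutive undelivered handshakes
        self.down = False

    def record(self, delivered: bool) -> bool:
        """Record one handshake outcome; return True if the circuit is down."""
        if delivered:
            # A delivered handshake resets the counter and marks the
            # circuit as usable again.
            self.missed = 0
            self.down = False
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.down = True
        return self.down


# Hypothetical sequence of handshake outcomes at the app's chosen rate.
monitor = HandshakeMonitor(max_missed=3)
outcomes = [True, False, False, True, False, False, False]
states = [monitor.record(ok) for ok in outcomes]
print(states)
```

Both the handshake rate and the threshold stay under the application's
control, so no link along the path needs to know about them.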

However, what I think about is: If there _is_ no alternative path, so
that rerouting will not solve the problem, how can we make an
application _live_ with the situation?

Sometimes, we get the advice: "Love it, change it or leave it." Now,
wireless channels may only offer the possibility to love them: we
cannot change them and, because there is no alternative, we cannot
leave them. So the only choice is: "take it - or leave it."

Basically, that is in short what we talked about before: You mentioned
that IEEE 802.11 does not take input from IETF. So, wireless networks
will perhaps not obey QoS requirements or the like from upper layers. CE
guys tell me that a forecast for wireless channel properties is more
than difficult and perhaps simply not possible - hence application
adaptation may remain a dream.

> Therefore we need to do an end-to-end link down determination.
And ideally some determination when a link will be up again....

> Thus the network shouldn't spend its time holding onto packets in 
> buffers.  Instead it should push the problem to the endpoints as 
> quickly as possible.  Unfortunately, the link layer designers, whether 
> of DOCSIS modems or 802.11 stacks, have it in their heads that 
> reliable delivery is more important than the cost to endpoints of deep 
> buffering.  DOCSIS 2 modems have multiple seconds of buffer, and many 
> WLANs will retransmit a packet up to 255 times before giving up!

I well remember your advice: "Not more than three attempts." And I
think I'm with you here, because I think that the overlong latencies
accepted by some standards are a consequence of demanding a far too low
packet corruption probability.
With respect to the GPRS standard mentioned above, the overlong
tolerated latencies appear in conjunction with a packet corruption ratio
of 10^-9, which is simply nonsense for many wireless links.

And that's one of the arguments I often have with colleagues: they do
not believe me that wireless links are typically _lossy_. And when
wireless links appear to be _not_ _lossy_, this is often a consequence
of a far too high number of retransmissions.

Hence, a small number of retransmissions (maximum 3) would result
exactly in what you propose: On a noisy link, packets will see a high
corruption ratio and hence, the end-to-end application sees a
significant packet loss and has to deal with this.
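
As a back-of-the-envelope sketch of this argument, assume that each
transmission attempt is corrupted independently with probability p.
That is a simplification (real wireless channels show bursty,
correlated errors), and the function names are hypothetical.

```python
def residual_loss(p_corrupt: float, max_attempts: int) -> float:
    """Probability that a packet is still lost after max_attempts tries.

    Assumes independent attempts, each corrupted with p_corrupt.
    """
    return p_corrupt ** max_attempts


def expected_attempts(p_corrupt: float, max_attempts: int) -> float:
    """Mean number of transmissions spent per packet, delivered or not."""
    # Attempt k+1 only takes place if the first k attempts were corrupted.
    return sum(p_corrupt ** k for k in range(max_attempts))


# Hypothetical noisy link: 90 % of the attempts are corrupted.
p = 0.9
for attempts in (3, 256):  # 256 = first try plus 255 retransmissions
    print(attempts, residual_loss(p, attempts), expected_attempts(p, attempts))
```

With three attempts the upper layers still see a loss ratio of about
0.73, so the link looks as lossy as it is. With 256 attempts the
residual loss drops to the order of 10^-12 and the link appears
"not lossy", but each packet costs about ten transmissions on average
and up to 256 in the worst case, which shows up as latency instead.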

And perhaps that's not so far from my way of thinking. I simply don't
want to declare a link "down" because of a high packet corruption ratio.
I would like to use the link with the packet corruption ratio it can
yield - and then, it's up to the application to deal with this.

>   These are not a useful operational platform for TCP.  It's not TCP 
> that's broken, but the attempt to maximize link capacity, rather than 
> letting routers and endpoints work to fix the problem at a higher level.
>
I did not say that TCP is broken. However, I think some algorithms in 
TCP could be made more robust against lossy channels in some scenarios.

Nevertheless, TCP is neither the Torah nor the Holy Bible nor the Quran. 
Perhaps, for some scenarios there may exist reasonable alternatives to 
TCP for reliable packet transfer.

Detlef

-- 
Detlef Bosau		Galileistraße 30	70565 Stuttgart
phone: +49 711 5208031	mobile: +49 172 6819937	skype: detlef.bosau	
ICQ: 566129673		http://detlef.bosau@web.de