Interesting People mailing list archives

Re: the undead urban myth of the LOC/EID split NOT AN EASY READ


From: David Farber <dave () farber net>
Date: Mon, 3 Nov 2008 19:03:35 -0500



Begin forwarded message:

From: Richard Bennett <richard () bennett com>
Date: November 3, 2008 6:33:21 PM EST
To: dave () farber net
Subject: Re: [IP] the undead urban myth of the LOC/EID split NOT AN EASY READ

Dave -

Feel free to share this with IP if you wish.

I read John's book this weekend, in electronic form from the Santa Clara County Library in Silicon Valley. Having read most of the books ever written on the Internet, both of the technical variety and the public policy primers, and having been involved in protocol standards from the 1980s to the present, I feel I can say with reasonable confidence that "Patterns in Network Architecture" is the most important book on network protocols in general and the Internet in particular ever written. As the passage below indicates, it's not easy going for the non-technical crowd, who will certainly find much of the discussion excessively detailed. But John places the protocols in their proper socio-historical context for the first time. Readers, even the uninitiated, should take away from the book an appreciation for the fact that network architecture is as much a political exercise as a technical one, and always has been.

At a time when public policy makers are literally inundated with opinion about the Internet's design and social implications, it's important to peel away the metaphors and analogies and take a look at how it really works, what it does, what it doesn't do, what it could do a lot better, and how it got the way it is. John Day blazes a trail to that kind of understanding. It's an excellent book, even though I may disagree with some of his analysis of the Early Wittgenstein and a few other things.

Regarding the discussion below, it may be easier to follow if we take the example of multi-homing or mobility and trace it through IP address assignment, path discovery, and transit, contrasting what we'd like to see with what we do see. In the present incarnation, we see the problem begins with IP address assignment to a MAC address, continues with DNS pointing to a location, continues with BGP advertising a route to a location, and ends with some sort of redirection. That's IP. In XNS, the process is a bit different, and that difference highlights the problem with IPv4 that is only exacerbated in IPv6.

RB

David Farber wrote:


Begin forwarded message:

From: John Day <jeanjour () comcast net>
Date: November 3, 2008 10:13:04 AM EST
To: David Farber <dave () farber net>, Jonathan Smith <jms () cis upenn edu>
Cc: day () bu edu, David Meyer <dmm () 1-4-5 net>
Subject: Re: [IP] Re:   the undead urban myth of the LOC/EID split

Possibly for the IP list. O'Dell thinks this is too much an "inside" account for the list. I will let you be the judge. It is not an easy topic. There is no simple explanation, especially between loc/id split and POA/node.

Would appreciate your thoughts.

John

Let me try to explain the addressing problem. I thought all of this was common knowledge at least among the old timers.

We first realized we had a problem with naming and addressing in the ARPANET in 1972, when Tinker AFB joined the Net. They wanted redundant IMP connections. I remember Grossman coming in one morning and telling me this. My first thought was, "Right, good idea!", and 2 seconds later, "O, *&@##, that isn't going to work!"

Host addresses were IMP port numbers, so with 2 interfaces on 2 different IMPs, Tinker would look like 2 hosts to the network, not one. Tinker's host would know it had two connections, but the network would think it was one connection to two different hosts. This is, of course, the multihoming problem.
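A minimal sketch of the situation just described may help (Python, not part of the original message). The host names, IMP numbers and port numbers are made up for illustration; the only point is that when the sole name is the point of attachment, one host on two IMPs is indistinguishable from two hosts.

```python
from collections import defaultdict

# Hypothetical attachments: (host as the end system sees it, (IMP number, port)).
attachments = [
    ("tinker-host", (5, 0)),
    ("tinker-host", (9, 2)),   # the redundant connection to a second IMP
    ("other-host", (5, 1)),
]

# What the network sees: only attachment addresses, one "host" per address.
hosts_seen_by_network = {addr for _, addr in attachments}

# What the end systems know: which connections belong to the same host.
connections_by_host = defaultdict(list)
for name, addr in attachments:
    connections_by_host[name].append(addr)

def deliver(dest_addr, links_up):
    """Delivery succeeds only if the addressed attachment itself is up.

    Nothing in the address says that (5, 0) and (9, 2) reach the same host,
    so the network cannot fall back to the other connection.
    """
    return dest_addr in links_up

links_up = {(9, 2), (5, 1)}  # the link to IMP 5, port 0 has failed

print(len(connections_by_host), "hosts really exist")
print(len(hosts_seen_by_network), "hosts as seen by the network")
print("traffic for (5, 0) delivered:", deliver((5, 0), links_up))
```

The redundant link is up the whole time, but traffic addressed to the failed attachment has nowhere to go, which is the multihoming problem in miniature.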

Had we blown it? No, there were a lot of things we didn't do in that first attempt! We had a lot more important problems on our plate. In those days, just moving data between very different computers was a major accomplishment. We knew the naming stuff was hard and this was an experiment. We could deal with that later. yea, yea, I know. Famous last words! ;-)

But the answer was obvious. We were all OS guys. We had seen this problem before. We needed a logical address space over the physical address space. And we also knew that we needed application names as well. Just as OSs require 3 levels of names, networks would too. This well-known socket business we had done was just a kludge so we could demonstrate the first 3 applications we had up and running. Multihoming was a symptom of a much more fundamental missing piece of the overall design. But we would get to it sooner or later. (Right, more famous last words.)

It didn't seem like a big deal. Certainly not enough to bother writing a paper on it. For some reason, it took 10 years before Jerry Saltzer wrote it up and published it, later circulated as RFC 1498. Jerry got it right except for one little piece, which hadn't happened yet. He describes three levels of names for different things at different layers in a network architecture:

Application names, which are location independent;
Node addresses, which are location dependent;
Point of attachment (POA) addresses, which may or may not be location dependent; and
mappings between them.

In general, the scope of the layers increases as you go up.

(Draw a picture or see the figures in my book. It will be easier to visualize what is coming. Don't label the layers, we don't care what they are called.)

We have called the function that maps between Application names and node addresses a directory function. (Not to be confused with X.500. The terminology was in use a decade or more before that.)

The mapping of node to POA is generally part of routing. In this scheme routes are sequences of node addresses. This we had understood since 1972. I say "we" meaning people I worked around. Clearly not everyone did. This is what you get for assuming it is obvious. ;-) (BTW, for the curmudgeons I am not claiming I came up with this before Saltzer. Quite the opposite, I am claiming that several of us saw the broad outlines of what was needed. It took Saltzer to make it concrete. Although I wish he had been a little more concrete about what a POA and node address were.)
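For readers who find code easier than pictures, here is a rough sketch of the three levels of names and the two mappings between them (Python, not from the original message; every name and value is hypothetical).

```python
# Level 1 -> level 2: the "directory" maps a location-independent
# application name to a location-dependent node address.
directory = {"mail-service": "node-A"}

# Level 2 -> level 3: the node-to-POA mapping, generally part of routing.
# A multihomed node simply has more than one point of attachment.
node_to_poa = {"node-A": ["poa-1", "poa-2"]}

def resolve(app_name, poas_up):
    """Return the node address for an application and its usable POAs."""
    node = directory[app_name]
    usable = [poa for poa in node_to_poa[node] if poa in poas_up]
    return node, usable

# Both attachments working, then one of them lost.
print(resolve("mail-service", {"poa-1", "poa-2"}))
print(resolve("mail-service", {"poa-2"}))
```

Because the application is bound to the node address and not to a POA, losing poa-1 changes only the lower mapping; the name the application is reached by does not change.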

So the problem with the ARPANET/Internet is that we name the Point of Attachment (twice), but nothing else. Why twice? The MAC address does the same thing. They both name the interface between the wire and the system. Until CIDR it was no harder to route on MAC addresses than on IP addresses, since they weren't addresses anyway, i.e. they weren't location-dependent. While we have something that is sort of an application name in URLs, it isn't really. There is too much "path" in a URL to be an application name. (More on this later.)

Around this time, we learned a few other things about the problem:

1) Addresses only had to be unambiguous within the scope of the layer in which they were used.
2) Naming the host was irrelevant to the naming and addressing problem as far as communications was concerned. A host name might be useful for network management problems but it was merely coincidental to the communications problem. For communications, one is at least naming the protocol state machine. Thinking of it as a host name implied constraints that would only get in the way.
3) Embedding a lower layer address in a higher layer address made it route dependent, which is what we needed to avoid (see below).

Many of us had always known that the ARPANET/Internet was incomplete. We didn't fix it with IPv4 because (I think) we felt that we didn't really have enough understanding of the whole naming and addressing problem yet (this was 1976 or so) and we didn't want to fix it the wrong way. Anyway, this was still mostly an experimental network. It wasn't meant to be in production. We could do that later.

This is why, starting around 1980, the small group in OSI who was doing connectionless insisted the network layer would name the node. It wasn't a phone company thing (clearly not!!), it was fixing something from the early ARPANET that we had not had an opportunity to fix yet. Mostly it was Internet people who understood and pushed it in OSI, not the Europeans. Several European positions wanted OSI to have well-known sockets and name the interface. I made sure it didn't creep into the Reference Model and Lyman, Oran, Piscatello, etc. made sure it wasn't in the protocol.

This, of course, was all thrown out the window by the IPng process, which insisted that we go ahead with half a naming and addressing architecture. (At the time, I don't think there were 2 dozen people in the IETF who understood naming and addressing. The failure of a University education.) I have never understood the IETF's reaction to these things. Rather than "you blew it, let us show you how to do that right," their reaction has been: if They did it, we won't, even if it means cutting off your nose to spite your face. The sociologists will probably explain it to us some day.

Once it was decided that the IPng would name the interface, we were pretty well stuck. On the road to where we are today. Not to put words in O'Dell's mouth, but I always thought 8+8 was an attempt at some sort of fix, even if it was a kludge, given that they wouldn't do it right and perhaps later we could move that closer to right. However, they wouldn't even do 8+8.

The early drafts of the OSI Model also made the error of building the (N)-address from the (N-1)-address, like embedding MAC addresses in v6. (This is one of those things that looks obvious on the surface and when you get into it, you realize is just plain wrong, a bit like Aristotelian physics: Seems like common sense until you test it.) We uncovered that problem around 82 doing the Naming and Addressing Addendum to the RM and fixed it. Why this is a problem in networks and not in OSs is also in the book. Suffice it to say here that this makes the address into a *pathname* through the stack. Path dependent just at the point it shouldn't be. Makes it into naming the interface even if you thought you weren't. (Now some of you will say, but I don't have to interpret it that way. It still will name the node. Correct. If *everyone* obeys the rules. But some hot-shot is going to assume he knows better and then complain like hell when his thing doesn't work somewhere. Best way to keep them honest is not let them be dishonest.)

Path dependence is the one thing you don't want in a network. It works in an OS because there is only one way to get anywhere. But in a network (even in a network stack) there may be more than one way to get somewhere. So addresses in different layers have to be completely independent to preserve path independence. Which brings us to the piece that was missing in Saltzer's analysis:

The missing piece that hadn't happened when Saltzer wrote was multi-path routing: more than one path to the next hop. This turns out to be one of those little things that opens up considerable insight. If we include this in his model, then we need the node to POA mapping for all NEAREST neighbors. So calculating a route is *logically*: calculate the route to the destination using the routing table information, find the next hop, then choose which path to get to the next hop.

Clearly you don't build it this way. You create a forwarding table and use it the way you do now. Although, there is no reason one might not do a forwarding table update that just changes the node to POA mapping without recalculating routes.
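The logical steps, and the cheap forwarding-table update, can be sketched roughly as follows (Python, not from the original message). The three-node topology, the costs and the interface names if0, if1 and if2 are assumptions made up for the example, and Dijkstra's algorithm stands in for whatever route calculation is actually used.

```python
import heapq

# Routes are computed over node addresses only (hypothetical topology, costs).
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1},
    "C": {"A": 4, "B": 1},
}

# Node-to-POA mapping for A's nearest neighbors: two paths to B, one to C.
neighbor_poas = {"B": ["if0", "if1"], "C": ["if2"]}

def next_hops(graph, source):
    """Dijkstra over node addresses; returns {destination: next-hop node}."""
    dist = {source: 0}
    first = {}
    heap = [(0, source, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                first[nbr] = nbr if node == source else hop
                heapq.heappush(heap, (nd, nbr, first[nbr]))
    return first

def forwarding_table(first_hop, neighbor_poas, poas_up):
    """Combine routes with the node-to-POA mapping: dest -> (next hop, POA)."""
    table = {}
    for dest, hop in first_hop.items():
        usable = [poa for poa in neighbor_poas[hop] if poa in poas_up]
        if usable:
            table[dest] = (hop, usable[0])
    return table

routes = next_hops(graph, "A")  # computed once
fib = forwarding_table(routes, neighbor_poas, {"if0", "if1", "if2"})

# if0 fails: only the node-to-POA choice is redone, routes are reused as-is.
fib_after = forwarding_table(routes, neighbor_poas, {"if1", "if2"})

print(fib)
print(fib_after)
```

The route to C still goes via node B after the failure; only the path used to reach B changes.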

But what is interesting is that this mapping (node to POA of nearest neighbors) is exactly the same as the application name to node address mapping, i.e. the directory. Those are all *nearest neighbors* at that layer too! The whole structure is relative. One layer's node address is the point of attachment for the layer above. And it repeats. (That is what AS numbers were trying to tell you.) Although not necessarily in the obvious way.

With a structure like this, mobility is nothing more than dynamic multihoming. And several other things fall out easily, again see the book.


So here we are. 15 years after IPng and v6 doesn't solve any of these problems. No surprise. It was purposely designed not to solve any of these problems.

Some have noted that the IPv6 group thought this was just a data plane problem and ignored the so-called control plane. (Sorry, but I balk at the use of this phone company terminology; it confuses issues.) What sheer incompetence! As Radia points out in the 2nd Edition of her book, if you don't like NATs, you should have adopted CLNP. It was already in the routers. In other words, we could have spent the last 15 years on transition instead of on a monumental waste of money, time, and effort. Anyone who tried to explain these problems to the IPv6 group was simply labeled a sore loser.

Throughout the late 80s and 90s, if there was discussion of addressing someone (usually from MIT) would say, you have to read Saltzer's paper. During the NSRG meetings in 2001-2 it was brought up frequently. Then suddenly it was dropped. Never mentioned. When I pressed Noel on it not long ago, he said "they had moved beyond it." Seemed strange since loc/id was clearly not an answer, not a step to a solution. At least Saltzer looked at the whole architecture, while loc/id only looked at Network/Transport.

It begins to seem that Loc/id split had been invented so they don't have to admit they were wrong and simply name the node and get on with it. They seem to have an inkling that they had missed something important with v6 and they were desperately trying to find a way to retrofit it before it was too late. The trouble is loc/id split isn't the whole problem. Loc/id split (as near as I can tell) still does not name the node, but some application-flow-endpoint. Whatever it is, a node address is necessary and it will need to be location-dependent and aggregateable, and it isn't.

So what is really wrong with loc/id split? Let's look at it. If the IP address (the loc) remains a POA on which we do routing and, giving them the benefit of the doubt, the id is a node address (in some papers the "id" seems to be more an application-connection-endpoint or something similar), then the loc is the provider dependent identifier and the id is the provider independent name. But it is flat. If multihoming is widespread it is likely that several endsystems in the same area will be using the same set of different providers for multihoming. Aren't the routers going to want to be able to aggregate the look ups for these to figure out where to send them? Not if the id is based on a flat name. Remember the relation of POA and node is relative. What is needed for one is going to be true for the other. Using a flat id assumes that it won't be needed much. But what we are seeing is that multihoming is becoming very widespread and I don't think we have seen anything near the end of it. The thing is that the node address (id) must be aggregateable as well. In any case, to build in an identifier at this level that does not facilitate scaling seems as short-sighted as v6 was to begin with.
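A toy comparison shows what aggregation buys (Python, not from the original message). The identifier formats, the area name and the provider names are all invented for the example, and no real loc/id proposal is being modeled; the only point is how the lookup table grows.

```python
# Hypothetical: 1000 end systems in one area, all multihomed to the
# same two providers.
providers = ("provider-X", "provider-Y")

# Flat ids carry no structure, so each one needs its own lookup entry.
flat_ids = [f"id-{i:04d}" for i in range(1000)]
flat_table = {eid: providers for eid in flat_ids}

# Location-dependent node addresses: (area, host-within-area).
node_addresses = [("area-7", i) for i in range(1000)]

# One entry per area is enough, because every node in the area is
# reached through the same providers.
aggregated_table = {}
for area, _host in node_addresses:
    aggregated_table[area] = providers

def lookup(node_address):
    """Find the providers for a node by consulting only its area prefix."""
    area, _host = node_address
    return aggregated_table[area]

print(len(flat_table), "entries with flat ids")
print(len(aggregated_table), "entry with aggregateable node addresses")
print(lookup(("area-7", 42)))
```

With flat ids the table grows with the number of multihomed end systems; with aggregateable node addresses it grows with the number of areas.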

But now, is it too late? At least for IPv6. The Internet architecture has been fundamentally flawed from the beginning. To be fair, it is a demo that never got finished. Basically this is like trying to build an OS for a huge set of applications with no virtual address space or application name space. Or as I say in the book, what we have is DOS, what we need is Multics, but we would settle for UNIX. The Internet architecture is equivalent to DOS.


I hope this helps.  The medium makes it a bit hard to explain.

Take care,
John




-------------------------------------------
Archives: https://www.listbox.com/member/archive/247/=now
RSS Feed: https://www.listbox.com/member/archive/rss/247/
Powered by Listbox: http://www.listbox.com

--
Richard Bennett




