Why Networks Should Be Designed Bottom-Up

Michael Lesk

There are lots of arguments about network design. We have stars vs. buses. We have token rings vs. CSMA/CD. We have left-first addressing vs. right-first addressing. But perhaps one of the most important from the practical point of view is the choice between centrally administered networks and locally accumulated networks. UUCP (Unix-to-Unix copy), of course, is the prime example of a locally grown network; there is no central administration at all and so nobody can even tell you how many sites there are on it. By contrast, most people who design networks believe in central control. In this talk I'm going to discuss the past of UUCP and the future of more general networks, to explain why I think bottom-up design is the best course.

UUCP began as a cheap kludge. Originally what I really wanted to do was remote software maintenance. I had been rather drowned in keeping things like -ms and tbl running on about 40 computers around Bell Laboratories, and I counted for a while and discovered that about 70% of the bug reports were fixed by simply giving the complainant the latest version of the software. So it seemed that some kind of automatic distribution would be a help. There were the usual questions ("I don't want anybody installing new software on my machine without checking with me"), but I thought I had an answer to that, based on a scheme of automatic regression testing. The more immediate problem, however, was how to get software from one machine to another. I wasn't anxious to do unnecessary work, so I inquired around Bell Labs for good networks that might be used to connect all the Unix systems. I found at least three different groups, each of which insisted that they had the solution and that in about a year they would begin installing their magic carpet on all the machines. Well, I didn't want to wait a year (I also had some skepticism about these groups, mitigated by the thought that there were so many of them). So I started looking for a cheap way to get things from one place to another. There was already a "cu" program (call-unix) that would dial up one system from another, so I thought I could try running it from a script. It quickly seemed to make more sense to write a C program; the bootstrap was only about 200 lines long. This was, of course, without any administration or anything else. So when people called with a bug report, I would ask for their userid and password (in those days, these were always given immediately) and then put their machine in the table. Pretty soon there were about 50 machines connected, and I could send files to any of them.

The job of keeping UUCP running then prevented further progress on the software maintenance front. During much of this time I was fighting a rear-guard action against the assaults of the managers who thought this program was an awful security hole and should be abolished. I lost, as you all know. (In the 17th century a man named John Hill attempted to establish a privately run penny post for England, but was stopped by the Cromwell regime which objected to both the loss of revenue and the loss of their ability to read letters.)

The good side of all this management interest was that UUCP was rewritten, with elaborate administrative checks. This work was done first by Dave Nowitz, and then again later by Brian Redman, Peter Honeyman, and Dave Nowitz. My thanks to them; they turned a research kludge into a fairly robust program. Some of the added administration is an advantage (you now get told when your mail can't be delivered); a lot of it has meant that UUCP is not a useful interface for closely linked communities, because it cannot deal directly with user files, always working through spool directories [Nowitz 1984, Kolstad 1984]. (It is also no longer sensible to bootstrap UUCP onto a new machine by sending a short, 200-line hook over with "cu"; the distribution is too large.)

Today, better networks have made much of the dial-up nature of UUCP obsolete. The performance of hardwired networks, and the general availability of links such as Ethernet or Datakit, mean that few people would want to use dial-up within a building. Without dial-up, the kind of queuing and administration in UUCP is no longer sensible. Numerous proposals have been made for replacement, redesign, and the like. One is even familiar to you, since it runs successfully here [Dick-Lauder 1984]. But I think some of the lessons from the program are useful and can be applied in future networks. For example, UUCP has had features providing a choice of access methods, the idea being that if there were both a high-speed and a low-speed access method, you'd try the high-speed route first and fall back to the low-speed route when the first choice failed. This turns out to be a bad idea: if people get accustomed to a high-speed route, they send so much material that you are better off waiting for it to become available again than using a low-speed route which will take very much longer. This lesson was learned between 1200 and 300 baud.
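
To make the lesson concrete, here is a small illustration of my own; the rates and the backlog figure are assumptions for the example, not measurements from any real site. Waiting for the fast link wins whenever the expected outage is shorter than the extra time the slow route would add.

    #include <stdio.h>

    /* Illustrative only: should a site wait for the fast link to come
     * back, or fall back to the slow one?  Rates are in characters per
     * second (1200 baud is roughly 120 cps, 300 baud roughly 30 cps). */
    static int
    better_to_wait(long backlog, double fast_cps, double slow_cps, double outage_sec)
    {
        double via_fast = outage_sec + backlog / fast_cps;
        double via_slow = backlog / slow_cps;
        return via_fast < via_slow;
    }

    int
    main(void)
    {
        long backlog = 500000L;    /* half a megabyte of queued traffic (assumed) */
        double outage = 3600.0;    /* fast link expected back in an hour (assumed) */

        printf("wait for the fast link? %s\n",
               better_to_wait(backlog, 120.0, 30.0, outage) ? "yes" : "no");
        return 0;
    }

With a large enough backlog, even an hour-long outage on the fast link is cheaper than committing the traffic to the slow one.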

The script-driven style of UUCP internals was also useful; the major difficulty has been the inability of such scripts to handle more complex choices and procedures. But the adaptability of UUCP to a variety of system procedures has been fairly good.
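
The scripts in question are expect-send dialogues for logging into a remote system. Here is a sketch of the idea, written by me as an illustration rather than taken from the UUCP sources; the login name and password are invented, and the matching is deliberately naive. The same small loop can cope with many different login sequences, but, as noted above, it cannot make more complex choices.

    #include <stdio.h>
    #include <string.h>

    /* expect() reads from the line until the given prompt fragment has
     * gone by.  The matching is naive and assumes the fragment has no
     * repeated prefix, which is fine for prompts like "ogin:". */
    static int
    expect(FILE *line, const char *pattern)
    {
        size_t matched = 0, len = strlen(pattern);
        int c;

        while ((c = getc(line)) != EOF) {
            if (c == (unsigned char)pattern[matched])
                matched++;
            else
                matched = (c == (unsigned char)pattern[0]) ? 1 : 0;
            if (matched == len)
                return 0;          /* saw the prompt */
        }
        return -1;                 /* line dropped before the prompt appeared */
    }

    /* Run the dialogue: wait for each expect string, then transmit the
     * corresponding send string followed by a carriage return. */
    static int
    converse(FILE *in, FILE *out, const char *script[][2], int nsteps)
    {
        int i;

        for (i = 0; i < nsteps; i++) {
            if (expect(in, script[i][0]) < 0)
                return -1;
            fprintf(out, "%s\r", script[i][1]);
            fflush(out);
        }
        return 0;
    }

    int
    main(void)
    {
        /* The login name and password are invented for the example. */
        const char *login[][2] = {
            { "ogin:",   "uucp" },
            { "ssword:", "mypassword" },
        };

        /* For the illustration, the "line" is just standard input and output. */
        return converse(stdin, stdout, login, 2) == 0 ? 0 : 1;
    }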

A more important message is the importance of being able to maintain the same software interface while switching the hardware connectivity underneath. UUCP originally started with dial-up connections but now the typical mail user has no idea what the actual link between sites is.

Another major lesson is the need to build networks which do not demand to "own" the systems they are connected to. Many designers of networks can think of schemes which require kernel modifications and effective control of the systems on the nets. This is, in practice, intolerable. Many users can't change their kernels; most don't want to. A network has to plug into a variety of systems with a minimum of disturbance. Bitnet, for example, follows this lesson. It has been very successful connecting a large group of complex systems, many of them running IBM MVS, which are not easily modified.

Electronic libraries are another example of this general problem. It is very tempting to design a system in which you own all the data, store it on your machine, and then offer various services to people who dial in. This is even efficient and effective. However, it doesn't match what the publishers want. In general, they are worried about the consequences of access to electronic data, and they respond to the fear by maintaining control over the data. Often they have particular systems of their own, and insist that the only access will be via this system. It is inconvenient to realize that different publishers will not put their databases together; it resembles the days when movie actors were under contract to particular studios and certain obvious pairs never acted together. Similarly, one may not be able to combine information from a dictionary and a thesaurus unless they come from the same publishing house.

What of the future? Computer networking today is rather complex [Quarterman 1986, Jennings 1986]. Many large installations have to employ gurus to deal with networks and remember odd pieces of information such as "the Netnorth-Bitnet gateway is at British Columbia" or "the Bitnet-Arpanet gateway is at CUNY". (For a while my mail to Toronto and Waterloo went from New Jersey to Ontario via Vancouver, a 6,000 mile path for a 400 mile trip. This proves bandwidth really is cheap today.) But the complexities of networks are a problem in making electronic mail ubiquitous. I did not expect fax growth to overtake email growth; part of this is the standardization of fax (another part is the formatting capability it allows). It would be better if we could make electronic mail compatible among the communities and systems that we all use.

How is universal electronic mail, similar to paper mail, going to come about? There are two general approaches, the centralized and the separated. The centralized approach is exemplified by ISDN. ISDN is an international standard, and is intended to be offered as a service by the telephone companies. It has been an awfully long time coming, and even after implementation starts it is likely to be quite a while before it reaches any kind of ubiquity. The alternative is the continued accumulation of local networks. Most large businesses today have some kind of internal electronic mail. The research community is connected through a linkage of these internal systems, either via the Internet/Bitnet systems or via dial-up connections similar to UUCP.

The centralized approach creates several difficulties. First, it often requires years of negotiation to overcome administrative difficulties. As computers get cheaper, they become more numerous, and it becomes impossible to suggest that the various owners get together and agree on anything. I cannot imagine, for example, trying to force the staff of Bellcore to agree on one word processing system. (I can imagine translating a few into a standard form, though.) You can get people to buy something new and add it to their system; you cannot get them to give up something they have been using and that they like. You can hardly get them to give up something they have been using and that works, even if they don't like it. Think of how long it took to get domain addressing throughout the Arpanet [Partridge 1986]. Of course, the need to remain outside the operating system delays really high-performance networks. We have to hope that the OSI model will let us provide these, though its complexity tends to discourage that hope.

Another difficulty of centralized systems is the tendency to multiplex at too low a level. In order to achieve high throughput, a network is going to want high-bandwidth channels; in order to minimize costs, these will be shared. But this is inefficient. Ideally, multiplexing for minimal cost should be done only once, by the transmission carrier which actually handles the messages. If it is presented with all the traffic, it can come up with an arrangement of circuits which will minimize cost or maximize reliability, or whatever the goal is. If only part of the traffic is presented, then only a partial optimization can be done. Present bureaucratic and regulatory constraints often encourage the use of high-cost pathways. I hope that the advent of optical fiber will lower the cost of transmission enough that optimization of link cost won't matter as much.
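
A toy computation of my own, with invented traffic figures, shows why only the carrier can do the full optimization. When each group rounds its own traffic up to whole circuits, the unfilled fractions are wasted; a carrier that sees all the traffic at once can combine them.

    #include <stdio.h>
    #include <math.h>

    /* Circuits needed if each group multiplexes its own traffic onto
     * leased circuits, versus pooling everything and multiplexing once.
     * Loads are fractions of one circuit's capacity (invented figures). */
    int
    main(void)
    {
        double loads[] = { 0.4, 0.4, 0.3 };
        int n = sizeof loads / sizeof loads[0];
        int separate = 0, i;
        double total = 0.0;

        for (i = 0; i < n; i++) {
            separate += (int)ceil(loads[i]);   /* each group rounds up on its own */
            total += loads[i];
        }
        printf("separately: %d circuits; pooled: %d circuits\n",
               separate, (int)ceil(total));
        return 0;
    }

Here three half-empty private circuits collapse into two shared ones; the same reasoning applies to whatever the carrier is actually trying to optimize.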

The most important of the difficulties with centralized networks, however, is the extension of human bureaucracy to electronic networks. We are all familiar with mail headers that are longer than the messages they carry, and with the enormous growth of mail programs. I can remember when Unix mail was about four pages of source code; today the source code is 1300 pages. This has, in effect, made the overhead of connecting to networks and maintaining a mail interface much higher. Worse, it is not a routine operation: various networks serve academic sites, sites connected to particular government agencies, sites using particular machine styles, and the rest. To join a net like Arpanet or Janet is not routine; you have to be a member of the approved community. This is causing electronic mail systems to fragment into systems serving particular groups, and frustrating those just outside the boundaries. For example, last year the British temporarily cut off transatlantic mail access to a particular set of users who were characterized by the fact that they DID pay their bills themselves. It all makes sense, sort of, if you know the whole story, but it is infuriating to those who lose service.

What's happening next? Distributed systems are spreading. We are beginning to see a continuum in the level of connection between systems. The advent of remote file systems, starting with libraries like the Newcastle Connection, has made some systems able to do the kind of remote copy that UUCP was originally intended for. There are transparent systems where the user doesn't even care which CPU is running what. More common today are systems, often connected by Ethernets, that allow remote copy and execution but keep track of who is running where. This makes security easier to enforce. We still don't have, in routine use, geographically distributed transparency; but we will soon, since this is now available in some experimental systems [Alberi 1987]. Soon, we should have distributed file systems connecting machines which do not run the same operating system, following the OSI models. Ideally, we can get a smooth connection between distributed systems and mail.

But more important to me is to have ubiquitous electronic mail. How is this to be achieved? I hope that UUCP will serve as a model for the development of connections between systems of different types, so that electronic mail can become as familiar as word processing and facsimile transmission. This involves some technical questions and some economic questions.

Technically, we need a "gateway" (as described by Judge Greene) which would interface all kinds of electronic transmission. It should support information services, mail, file transmission, and other services. It should not require kernel modifications to computer systems. Text-based electronic mail alone does not demand enormous bandwidth or extreme efficiency. How could such a system be built? I would suggest an idea of Robin Alston's (British Library): rather than try to standardize all systems, try to just get each to describe its format. That is, agree that any system, upon connection, will be prepared to describe its basic operations (in BNF, or some kind of object description language, or whatever formalism seems appropriate). Then each system could ask, upon connection, for a description of the expected format. What I cannot see working are isolated systems which hope that everyone will use them for electronic mail and nothing else (e.g., the initial version of British Telecom's Telecom Gold service).
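
One way to picture Alston's suggestion is the sketch below. It is my own illustration; the DESCRIBE request and the "." terminator are invented for the example and are not part of any existing protocol. On connection, a system asks its peer for a grammar of the messages the peer accepts, and only then begins sending.

    #include <stdio.h>
    #include <string.h>

    /* Before any mail moves, ask the peer to describe the message
     * format it accepts, and collect the description for a parser. */
    static int
    fetch_description(FILE *in, FILE *out, char *buf, size_t buflen)
    {
        char line[256];
        size_t used = 0;

        fprintf(out, "DESCRIBE\r\n");
        fflush(out);
        buf[0] = '\0';
        while (fgets(line, sizeof line, in) != NULL) {
            if (strcmp(line, ".\n") == 0 || strcmp(line, ".\r\n") == 0)
                return 0;                     /* end of the grammar */
            if (used + strlen(line) + 1 > buflen)
                return -1;                    /* description too large to hold */
            strcpy(buf + used, line);
            used += strlen(line);
        }
        return -1;                            /* connection dropped mid-description */
    }

    int
    main(void)
    {
        char grammar[4096];

        /* For the illustration, the peer's reply arrives on standard
         * input; a real gateway would hand the grammar to a parser and
         * use it to translate outgoing messages. */
        if (fetch_description(stdin, stdout, grammar, sizeof grammar) != 0)
            return 1;
        fputs(grammar, stderr);
        return 0;
    }

Whether the description is BNF or some object description language matters less than the fact that it travels with the connection, so no central registry of formats is needed.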

Another technical problem is security. Security precautions present difficulties to people trying to do work; an efficient precaution is one that presents much greater difficulties to an intruder than it does to a legitimate user. The problem is that ordinary users are willing to spend, on every transaction, much less effort than a thief or vandal is willing to spend to do just one transaction. In practice I believe that we should try two general approaches to security: (1) rely more on authentication, so that we know WHO did something, than on sorting legal from illegal acts; and (2) rely more on a variety of small barriers, rather than trying to find a single technical solution to security, since the greatest dangers often turn out to be human frailty rather than technical decipherment. I look forward to the ability of digital phone switches to do calling number identification, so that we can identify the people breaking into systems.

The administrative problems are more severe. Electronic mail today is rarely paid for by the message, and often is paid for in ways that make it difficult for people to join even if they are willing to pay the bill. We need networks that are on a straight fee basis; fortunately, systems like UUCP and Bitnet can be reasonably cheap. This should make it possible to have a system to which universities, corporations, and individuals can all connect. If connecting to an electronic mail system merely meant the payment of a fee and the installation of some software, rather than elaborate legal negotiations, I believe it would spread much faster. I do not think that a purely separate mail system will prosper, though, since people prefer to use a single computer system for most of their business, not to have separate machines for every purpose. For those who don't have computers at all, there are various systems (e.g., Dan Nachbar's at Bellcore) which can provide a low-cost, paper-printing electronic mail system.

What I do not believe will work is an imposition of a single system of electronic mail from the beginning. The telephone network, the postal system, the railways, the electric utilities, and the airlines all started as isolated instances. Here in Australia your railway gauges still show the remains of the independent planning, and yet the trains ran. In some countries, some of these utilities have been combined; others have been kept separate. But in all cases, standardization followed technical development. Technology may provide too many ways to do something, but planners will provide nothing.

Fortunately, the trend in general today is against centralized systems. European countries are thinking about breaking up their PTTs; the UK has separated British Telecom from the Royal Mail and is now separating long distance from local service. I hope that international electronic mail can prosper as a relatively uncontrolled service, rather than automatically attaching to any particular organization.

In the longer-range future, what would a "super-UUCP" look like? Well, it won't be a Unix program anymore; it will connect all kinds of systems. It will take from the Arpanet higher-speed services; it will take from the standard UUCP the idea of no administration. It will have some pricing for its services. It will have positive user identification for security (which will also mean that it could be used for commercial transactions). And it will be as ubiquitous as the other basic communications services: paper mail, voice mail, facsimile mail, and telephone calls. The historians of the future, instead of despairing because the evanescent telephone conversation has replaced the permanent written letter, will despair at the quantities of electronic mail saved in historical records (although computer searching tools may ameliorate the difficulty). And even average individuals may have the attachment to electronic mail, if not to netnews, that they felt for paper mail a generation ago:

       "None shall hear the postman's knock
       without a quickening of the heart;
       for who can bear
       to feel himself forgot?"
           -- W. H. Auden, "Night Mail"

References

[Alberi 1987]. J. L. Alberi and M. F. Pucci; "The DUNE Distributed Operating System," private memorandum (1987).

[Dick-Lauder 1984]. P. Dick-Lauder, R. Kummerfeld, and R. Elz; "ACSNET - The Australian Alternative to UUCP," Summer USENIX Conference, pp. 11-17, Salt Lake City, Utah (June 1984).

[Jennings 1986]. D. M. Jennings, L. H. Landweber, I. H. Fuchs, D. J. Farber, and W. R. Adrion; "Computer Networking for Scientists," Science 231, pp. 943-950 (1986).

[Kolstad 1984]. R. Kolstad and K. Summers-Horton; "Mapping the UUCP Network," Uniforum Conference, pp. 251-257, Washington DC (January 1984).

[Nowitz 1984]. D. A. Nowitz, P. Honeyman, and B. Redman; "Experimental Implementation of UUCP - Security Aspects," Uniforum Conference, pp. 246-250, Washington DC (January 1984).

[Partridge 1986]. Craig Partridge; "Mail Routing Using Domain Names: An Informal Tour," Summer USENIX Conference, pp. 366-376, Atlanta, Ga. (June 1986).

[Quarterman 1986]. J. S. Quarterman and J. C. Hoskins; "Notable Computer Networks," Comm. ACM 29, pp. 932-971 (1986).