Petter Henriksen
The Norwegian Publishers' Association's Committee
on Electronic Publishing
Hjemmets Bokforlag
0122 Oslo, Norway
Abstract: In recent years, hypertext has caught the fancy and interest of the information technology community. As time goes by, we see that hypertext is not the only sophisticated manner in which to structure information. This paper seeks to place hypertext in its rightful theoretical context, as one of many examples of the more general concept of information tectonics. This is a field that information handlers such as librarians and publishers have dealt with throughout their history, more or less consciously. The advent of information technology, however, has thrust information tectonics forward as a key discipline, and necessitates the development of a theoretical and terminological basis on which resourceful electronic information mediators can build.
My occupation and main interest in life is in structuring information, and I suspect I share this with many of you. However, I perhaps deal with information in another form than you do - I edit encyclopedias, whereas I gather many of you are librarians.
Indeed, as an encyclopedia editor, I feel very much akin to you as librarians. The classification of encyclopedia articles, so important for later retrieval and editing, greatly resembles the classification of books in a library. So much so that my publishing house has always collaborated successfully with professional librarians on this task. Thus I feel that we have a common interest - and a common gift - in the realm of organizing information.
And as the information boom grows, it looks as if this gift of ours is growing steadily more important. It would seem that we have to shoulder a very special responsibility: Preventing the information boom from running out of control and transmuting into that feared chaotic state of affairs - the information meltdown.
Information is said to be our most helpful servant, but remember,
information that is not treated with care quickly turns on its master.
2. INFORMATION TECTONICS
There is much to be said in favour of information technology, but the truth of the matter is that the emergence of the electronic database has done little to quell the danger of an information meltdown. I see this from the viewpoint of the electronic book publisher. The CD-ROM disk makes it possible to publish "books" with as much information as you find in about 1300 traditional books. Another kind of electronic book, online retrieval, greatly surpasses even this. From the PC at my office I have online access to material comprising around 200 million articles, I am told. This kind of lebensraum makes for a dizzying feeling of emancipation for people like me, used as we are to heart-rending discussions of the kind: Should one cut down on Zambia or Zaïre in order to get the encyclopedia contents to fit into the allotted number of volumes? And not all of us are up to placing the necessary restraint on ourselves in this situation. Now we can reach everything, we say gleefully, now we can have the whole information universe at our finger tips.
The other day I consulted a database that we have online access to. I was looking for a certain article I knew existed in the database, about sealfishing. I plugged away with key words like sealing, sealfishing and truncated seal, and I emerged from my search with 214 article suggestions, among them a charming bouquet dealing with the topic of sigillography - the field of stamps and seals. 214 articles richer, and 708 Norwegian crowns poorer (that is quite an amount, I can assure you), but not a trace of the article I was looking for. I was the victim of what we in my part of the world call a database massacre.
Later on I stumbled onto the explanation. The author of the article I was looking for, had unfailingly misspelled the word seal, with a double "e" like "seel". So that was that.
What use is there in having the whole world of information at your finger tips, if it stops there?
And my point is: The more information there is, the less value it will have, when it is stored in the raw. Information is doomed to stop at our finger tips as long as we refrain from structuring it according to the needs of the user. And this is what we call information tectonics.
Information tectonics is the organizing of information in structures. Good information tectonics is the organizing of information in structures adapted to the needs of the user.
Here are some examples of information tectonics:
2.1. Hypertext
Hypertext is exemplified in Figure 1, the map of a level in the Glasgow Online information system. This nebulous concept, which is the subject of so many conferences and symposiums, is one of several kinds of information tectonics, and, as a matter of fact, not even a new kind. Hypertext is just a trendy word for reference structure, and has been around as long as there have been references in libraries or in the Bible or in the Talmud. I will return to hypertext later on.
2.2. One-dimensional Information Tectonics
Figure 2 shows a classical one-dimensional information tectonics, such as the telephone directory: an alphabetical series of elements. It is easy to retrieve information from if you know the name of the person you are seeking and how to spell it; otherwise it is totally unfit for use!

2.3. A Hierarchical Information Structure
Figure 3 shows that the Bible is a hierarchical information structure, with two testaments, books with numbered chapters and verses, well suited for random access by anyone who is already acquainted with the contents (and enjoys dropping scripture references), rather hard on the rest of us. After the Bible dictionary was invented, with alphabetical hypertext links into the main text, things have taken a turn for the better.

2.4. Crossgraining
Figure 4 illustrates a library with books stored according to the Dewey system. This is an example of crossgraining - the superimposition of conflicting hierarchical structures on each other. The primary hierarchical structure here is the division by subjects (the white arrows). Thus you can find all books which deal with the subject of farming in the vicinity of each other. Within the topical divisions you have subdivision by other criteria (the shadowed arrows) according to supplementary tables, for example by geography (country). Thus you will not find all the books dealing with Italy in one place in the library; you can find them only by skipping through the whole library one topic at a time.
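The effect of crossgraining can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shelf records, their fields and the helper names are hypothetical and are not taken from the Dewey tables or from any real catalogue.

```python
# Hypothetical shelf records: (subject, country, title). Illustrative only.
books = [
    ("farming", "Italy", "Farming in Italy"),
    ("farming", "Norway", "Farming in Norway"),
    ("music", "Italy", "Music in Italy"),
    ("music", "Germany", "Music in Germany"),
    ("history", "Italy", "History of Italy"),
]

# A physical shelf allows only one ordering: subject first, then country.
shelf = sorted(books, key=lambda book: (book[0], book[1]))


def positions(shelf, field_index, value):
    """Return the shelf positions whose given field equals `value`."""
    return [i for i, book in enumerate(shelf) if book[field_index] == value]


def is_contiguous(pos):
    """True if the positions form one unbroken run on the shelf."""
    return bool(pos) and pos[-1] - pos[0] + 1 == len(pos)


farming = positions(shelf, 0, "farming")  # primary criterion: subject
italy = positions(shelf, 1, "Italy")      # secondary criterion: country

print("farming side by side:", is_contiguous(farming))
print("Italy side by side:  ", is_contiguous(italy))
```

In this sketch the farming books end up next to each other, whereas the books on Italy are scattered over the shelf, one per subject, which is the situation the reader of the physical library faces.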

2.5. User-Oriented Structure
The traditional newspaper (Figure 5) has a characteristic, user-oriented structure. My morning newspaper is organized by sections, with references from the front page both to whole sections and - through ingresses - to individual articles. Within the sections we meet the so-called bulletin board structure, with a loosely connected collection of articles in juxtaposition - articles that have relatively little in common. This structure is perfect for the semi-unconscious browsing through the newspaper practised by the major part of the population before the morning coffee cup has made its mark.

With these examples I hope to show that information tectonics is a characteristic of information in general, and not tethered to a specific medium. We have information tectonics both in traditional libraries and in electronic data banks. But as an expert discipline, information tectonics has attained critical importance with the advent of the electronic database. A physical library with faulty information architecture can be tolerated; you can probably find what you are looking for by browsing around a bit. But 550 megabytes of unsound information tectonics is a catastrophe. A discipline which throughout history has been leading an obscure existence in the back rooms of the information business is now thrust forward as a key discipline.
Up to now, I have discussed a major pitfall within the field of electronic databases - the fact that their enormous capacity so openly courts information chaos. On the other hand, the electronic database, of course, enables a radically more user-friendly information architecture than physical archives do. Physical archives, for all their handiness and appeal, impose a physical restriction, a straitjacket, on the information tectonics: We have to choose one succession, one criterion of ordering. Even though we base a library or an encyclopedia on an information model that is multi-dimensional, that model must finally be projected onto the one-dimensional axis that the physical archives in reality are. In the realm of book storage, we must choose one primary criterion of structure - the subject scheme of the Dewey system, for example.
With the electronic database we can have our cake and eat it. We have multi-dimensionality.
Multi-dimensionality, by some called orthogonality, is a key concept in the field of information tectonics. Here are some others, and many of you who work with information dissemination will recognize them, if not the actual terms, then at least the concepts being denoted.
Information model. In Figure 5 you see the simplified information model underlying the common newspaper. This figure demonstrates some of the fundamental symbols which can be used in portraying information models: nodes, both information-bearing nodes, symbolized in the figure by rectangles (in this case newspaper articles), and empty nodes (circles), being junctions for pointers (references, links), in the figure symbolized by arrows. These arrows can be one-way arrows, such as the ones designating references, or two-way arrows, such as is the case between neighboring articles on a page.
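The symbols of such an information model map naturally onto a small data structure. The following is a minimal sketch, assuming a hypothetical miniature newspaper; the class name, the helper functions and the sample nodes are illustrative assumptions, not part of any existing system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, so two-way links cannot recurse
class Node:
    name: str
    content: Optional[str] = None  # None marks an empty junction node (a circle)
    links: List["Node"] = field(default_factory=list)  # outgoing pointers (arrows)

    @property
    def is_empty(self) -> bool:
        return self.content is None


def link(src: Node, dst: Node, two_way: bool = False) -> None:
    """Add a pointer from src to dst; a two-way arrow adds the reverse as well."""
    src.links.append(dst)
    if two_way:
        dst.links.append(src)


def reachable_articles(start: Node) -> List[str]:
    """Follow pointers from `start` and collect the information-bearing nodes."""
    seen, stack, found = set(), [start], []
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        if not node.is_empty:
            found.append(node.name)
        stack.extend(node.links)
    return sorted(found)


# Hypothetical miniature newspaper.
front = Node("front page")       # empty node: a junction for references
sports = Node("sports section")  # empty node
match = Node("match report", content="(article text)")
transfer = Node("transfer news", content="(article text)")

link(front, sports)                  # one-way reference to a whole section
link(front, match)                   # one-way reference to an individual article
link(sports, match)
link(sports, transfer)
link(match, transfer, two_way=True)  # neighbouring articles on a page

print(reachable_articles(front))
```

Here the front page and the section are empty nodes that only carry pointers, while the two articles carry content and are joined by a two-way arrow.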
Criteria for ordering, that is rules for succession and proximity. Among the most common: alphabetical, thematical, geographical and chronological. In the encyclopedia field we traditionally are torn between alphabetical ordering, which offers easy retrieval, and thematical ordering, which offers a more natural and pedagogical presentation.
Reality simulation. One of the maxims of information communication is that the more true to nature an information model is, the better it functions for the user. If, for example, you are writing a textbook on the history of music, you wish to describe a host of more or less parallel lines of stylistic development. Within the eighteenth century you perhaps wish to write on innovations in music theory put forward by Rameau in France, then to skip down to Vivaldi's Italy, then up to Bach in Germany and Händel in England, then back to Germany's empfindsamer style and the roots of the symphony, and then a look at the opera buffa in Italy. Thus we slowly work our way up towards our own time, through a desultory, fragmentary presentation of things that actually have unfolded continuously on several fronts simultaneously. A more realistic information architecture gives us at least three dimensions to work with, one for the time axis and two geographical, as on a world map, and also preferably one thematical dimension, for the various music forms. This will enhance the end product in several ways. It will become easier to navigate in and retrieve information from, and it will become more logical and pedagogical.
Multimedia. While we are dealing with simulating the real world: Our world is, of course, a multimedia world. Information dissemination which ignores this, gives an impaired portrayal of reality. In the encyclopedia profession we struggle with this fact daily on paper. We describe a bird's song in words: fidiwit-fidiwit-fewy-fewy. We describe a film producer's style using a photograph. Paper simply lacks the channels, the sensual stimuli, to replicate the real world.
Electronic multimedia works bring us considerably closer. We can hear the song of a warbler, we can see a video sequence of vintage Fassbinder. We still have not reached our destination, however. I know of no multimedia work in which the bouquet of a Beaujolais does not still have to be described by an adjective, "fruity", or the skin of a rhinoceros as "callous".
Multimedia technology presents a very special challenge to publishers: how to knit together all these diverse media - text, sound, illustrations, video - in a technically feasible and still easily accessible way. The problem, in principle, is of course not exactly of recent date. The classical challenge in printed books has been the positioning of text in relation to illustrations, the latter of which make special demands on the production process. In the old days, pictures were gathered on special pages, which were bound in between the text signatures. We subsequently have worked our way through 4-1 and 4-2 printing, with colour pictures on every other spread. Now we at least usually can place the colour pictures on the same spread as the corresponding text.
In electronic books the possibility for relevant illustration location is even greater - and the unsolved problems many, many more.
Hypertext. Put simply, hypertext is just another word for reference network. Hypertext is in principle free and unstructured. All links are permitted, following the whimsies of the author.
Hypertext is often extolled as a kind of emancipation from centuries of slavery under the yoke of paper, but bear in mind that this kind of free, unregulated architecture does not necessarily lead to salvation, even though it has a strong appeal to the most free-swinging information pioneers among us.
Free hypertext is wonderful for implementing associations which aren't covered by semantical systems, and there will be a great many of these in any information base. But it is also a treacherous cushion for the information organizer. Your information tectonics are very poor indeed if the only links you have are of the kind that lead from the art of France to the Gothic period to the composer Perotinus to the composer Landini to Italy to the world championship in football. At the basis of every collection of information you must find a solid system of lucid, logical, thematically stringent structures, which you can put your data objects into neatly and conventionally: All French artists in one compartment, all Italian composers in another. And then you can add your ad hoc hypertext connections for all the relationships that don't fit into the system.
Unrestrained hypertext leads straight to information meltdown!
Degree of access. Do we wish direct or indirect access along a link? References in encyclopedias traditionally are indirect: you have to pull out another volume and search. Clicking in hypertext programs usually gives direct access. The object you seek pops up on the screen at once. The traditional footnotes in books give a kind of semi-direct access. They are placed at the bottom of the page, near the spot from which the reference has been made. Nevertheless, you have to make a jump in reading.
In many electronic books you also have to put up with indirect access. You have to search in an entry word list for the word you are looking for, and then click.
Modular size. What is the right length for your data elements? Should there be many small or few large articles in an electronic information base - so-called centralized or decentralized editing? Here we, ironically enough, often meet with a restriction in electronic media in comparison with printed matter - owing to the size of the data screen. HyperCard articles have an uncanny tendency never to exceed the length of one card. Beware! We are courting danger, the ultimate prevalence of the tabloid syndrome.
Micro-structure. Up to now, I have spoken on relationships between data elements. A fascinating world is revealed also on the level below, in the structure within elements, for example encyclopedia articles. Favorable micro-structure is necessary if you want your database to have simulated intelligence, if you for example wish to answer questions like "what is the longest suspension bridge in Europe?" To achieve this, all articles on bridges have to state, by recognizable criteria, what kind of bridge it is, and its length. If these conditions are satisfied, you also automatically can extract statistical information from your database, for example a table showing all suspension bridges in Europe, sorted by length.
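The bridge question can be sketched as follows, assuming that every bridge article carries the same recognizable fields. The records, field names and lengths below are purely hypothetical placeholders, not real bridge data.

```python
# Hypothetical article records with a uniform micro-structure.
# Names and lengths are placeholders, not real bridges.
bridges = [
    {"name": "Bridge A", "kind": "suspension", "continent": "Europe", "length_m": 1200},
    {"name": "Bridge B", "kind": "suspension", "continent": "Europe", "length_m": 900},
    {"name": "Bridge C", "kind": "arch", "continent": "Europe", "length_m": 1500},
    {"name": "Bridge D", "kind": "suspension", "continent": "Asia", "length_m": 2000},
]


def select(articles, **criteria):
    """Return the articles whose fields match every given criterion."""
    return [
        article
        for article in articles
        if all(article.get(key) == value for key, value in criteria.items())
    ]


european_suspension = select(bridges, kind="suspension", continent="Europe")

# "What is the longest suspension bridge in Europe?"
longest = max(european_suspension, key=lambda article: article["length_m"])

# A table of all suspension bridges in Europe, sorted by length.
table = sorted(european_suspension, key=lambda article: article["length_m"], reverse=True)

print(longest["name"])
for row in table:
    print(row["name"], row["length_m"])
```

The point of the sketch is that neither the answer nor the table requires any understanding of the article text; both follow from the fields alone, which is why they have to be filled in consistently.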
Free text searching, that is searching for certain words or letter combinations in a text corpus. In other words, it is the tool you usually have at your disposal when you are trying to find your way in an online database, and the taximeter is running. I have put off the discussion of this concept to the last part of my paper, because - let us keep this vividly in mind - free text searching is not information tectonics. It is an aid to be used when information tectonics have not been applied; it is a surrogate for information tectonics. Free text searching is an invaluable aid in groping your way through large, existing, unstructured corpuses, but nothing more. Never fall for the temptation to believe that free text searching can replace information tectonics!
On the other hand, we do see ever more sophisticated examples of smart free text searching, or second generation free text searching. These are systems equipped with word lists with lemmatization and synonyms, so you - if you search for the word "country" - also are offered occurrences of words like nation, nation's, land and lands. They also lend you a helping hand through the minefield of spelling errors, euphemistically dubbed fuzzy spelling. If you try to look up the composer Mr. Betofen, you will get your man - although perhaps also an indulgent reprimand.
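A rough sketch of such second generation searching is given below. It assumes tiny hand-made word lists and uses the similarity matching in Python's standard `difflib` module as a stand-in for fuzzy spelling. The word lists, function names and sample documents are hypothetical, and real systems will use far larger lists and quite possibly other matching methods.

```python
from difflib import get_close_matches

# Hypothetical word lists; a real system would use far larger ones.
LEMMAS = {"nations": "nation", "nation's": "nation", "lands": "land", "countries": "country"}
SYNONYMS = {"country": {"country", "nation", "land"}}


def lemma(word):
    """Reduce a word form to its dictionary form, if the list knows it."""
    word = word.lower()
    return LEMMAS.get(word, word)


def expand(query):
    """Return the set of lemmas that a query word should match."""
    base = lemma(query)
    return SYNONYMS.get(base, {base})


def search(query, documents):
    """Return titles of documents containing the query word or a synonym."""
    wanted = expand(query)
    return [
        title
        for title, text in documents.items()
        if any(lemma(word) in wanted for word in text.split())
    ]


def fuzzy_lookup(query, vocabulary, cutoff=0.6):
    """Return the vocabulary entry closest in spelling to the query, or None."""
    matches = get_close_matches(query.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical miniature corpus and vocabularies.
documents = {
    "budget": "the nation's budget",
    "soil": "fertile lands",
    "ships": "sealing ships",
}

print(search("country", documents))
print(fuzzy_lookup("Betofen", ["bach", "beethoven", "vivaldi"]))
print(fuzzy_lookup("seal", ["seel", "sigillography"]))
```

The last call shows how fuzzy matching against the words actually present in the index could have led from the correctly spelled query to the misspelled "seel" of the earlier anecdote.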
Finally, I would like to offer a comment on the work that has been done up to now in the field of electronic databases. So far much invaluable effort has gone into the technological aspect of the products - hardware, software, technical quality. The Grolier Multimedia Encyclopedia on CD-I for example offers a total of 4 alternative audio qualities, really a tantalizer for acoustical connoisseurs. But at the humanistic end of the scale, where the contents of the products are dealt with, the domain of information disseminators like us, not that much has been done. A common procedure in my field - the field of publishing - has been to transfer an existing book product, with its one-dimensional architecture, right onto an electronic medium, perhaps freshening it up with a free text search option. In other words a rather apprehensive approach to IT, but, of course, a practical and economical way of getting exposed in the market place.
Happily, some exciting new products have emerged on the market recently, exploiting the possibilities of multi-dimensionality, and making the jump into what we might call the realm of true electronic publishing. A noteworthy example is the Compton's Multimedia Encyclopedia on CD-ROM, issued by the Britannica Corporation. In this work, the elements of the database - the encyclopedia articles - can be accessed by no fewer than 6 independent structures - alphabetically, thematically, geographically, chronologically, by free text index and by picture index. I have tried to show this in Figure 6, but typically for advanced electronic tectonics, it just does not project well onto a two-dimensional sheet. I can assure you that this is a work where you can have your cake and eat it - all articles dealing with farming are to be found side by side, as are all articles dealing with Italy - also those on farming. Let this work stand forth as a guide post for our future work on electronic information bases.
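How several independent access structures can coexist over one set of articles may be sketched as below. This is an assumption-laden illustration with hypothetical articles and field names; it says nothing about how the Compton's product is actually built, and it shows only four of the six structures mentioned.

```python
from collections import defaultdict

# Hypothetical articles; titles, themes, countries and years are placeholders.
articles = [
    {"title": "Farming in Italy", "theme": "farming", "country": "Italy", "year": 1950},
    {"title": "Farming in Norway", "theme": "farming", "country": "Norway", "year": 1900},
    {"title": "Opera in Italy", "theme": "music", "country": "Italy", "year": 1750},
]


def build_index(articles, key):
    """Map each value of `key` to the titles of the articles carrying it."""
    index = defaultdict(list)
    for article in articles:
        index[article[key]].append(article["title"])
    return index


# Each structure is independent; all of them point at the same articles.
alphabetical = sorted(article["title"] for article in articles)
chronological = [a["title"] for a in sorted(articles, key=lambda a: a["year"])]
by_theme = build_index(articles, "theme")
by_country = build_index(articles, "country")

print(by_theme["farming"])   # all farming articles side by side
print(by_country["Italy"])   # all articles on Italy side by side
```

Because the indexes are kept apart from the articles themselves, the article on farming in Italy appears both among the farming articles and among the articles on Italy, without any single ordering having to be chosen.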

3. CONCLUSION
I hope that I have made clear to you the fact that information tectonics - although new as a key discipline - is a field that we as information disseminators have been dealing with throughout our history, that we have forged it into a specialty, and that we also should be dealing with it when electronic databases become even more common than they are today. This we cannot take for granted, at least not in the publishing business. Franklin, with their hand-held dictionaries and encyclopedias, reportedly sell more dictionaries than anybody else, and they aren't even considered to be publishers.
Let us therefore join forces within the humanistic disciplines to retain our strength in this field, and to keep electronic information dissemination focused on what really means something to the user - the contents of the database and not the packaging.