IMPLICATIONS OF THE NEW TECHNOLOGIES
FOR LIBRARIES IN DEVELOPING COUNTRIES

Implicaciones de las Nuevas Tecnologías para
las Bibliotecas en Países en Desarrollo

Robert M. Hayes

Graduate School of Library & Information Science
University of California, Los Angeles
Los Angeles, CA 90024, USA

Keywords: Technology Applications, Information Technology, Imaging Technologies, Developing Countries, International Development, Images, Digitized Images.

Abstract: This talk reviews current developments with respect to information technologies, both hardware and software, with special emphasis on those related to digitized images. It discusses the sources and potential applications of those technologies in libraries and other information agencies in developing countries -- in education, medicine, water use and land use, agriculture, and similar areas of special importance to such peoples and governments. It identifies barriers to the effective use of the technologies and suggests possible answers to them. Particular attention is given to those barriers that can be removed through education, and to roles for cooperation between educational institutions in providing opportunities for such experience.

Resumen: Esta ponencia reseña los avances y el desarrollo de tecnologías pertinentes al área del manejo de información, incluyendo equipo y programación, y enfatiza de entre estos avances la producción de imágenes digitalizadas. Discute las fuentes y los potenciales de estas tecnologías en bibliotecas y centros de información en los países en desarrollo -- educación, medicina, manejo de recursos acuíferos y terrenos, agricultura y áreas de similar importancia para el pueblo y el gobierno. Identifica las barreras para el uso efectivo de estas tecnologías y sugiere posibles soluciones. Dedica atención particular a aquellas barreras cuya solución media a través de la educación, y también a aquellas que pueden eliminarse mediante la cooperación institucional.
 

 
1. INTRODUCTION

Without question, digitized images already have become vitally important to industry, government, medicine, and academic institutions. This form of data poses both tremendous potential and monumental problems. The potential, of course, derives from the new means for conceptualization of problems that are created by this kind of technology. The problems arise from the fact that the amount of data that can be generated in this form -- from satellites, instrumentation, publication, conversion -- exceeds by orders of magnitude that normally considered as a library problem. Yet all of the functional needs, for storage, cataloging, access, and processing must be dealt with.

It is evident that digitized images will be of increasing importance to instruction and research throughout every university and college. They apply to problem areas at the cutting edge of research; they will play a critical role in instruction. The work done by Professor Ching-chih Chen is a clear example of what can be accomplished, as well as being one of the pioneering efforts that have made it a reality. Technological developments now provide the necessary supporting capabilities: the scanners are available and are inexpensive; the storage capacities are now consistent with the needs; the transmission capacity needed for communication is operational; the processing power is now widespread, with supercomputers increasingly available and even microcomputers appearing with the requisite capabilities; the software development is well in hand.

For the librarian responsible for management of information resources, digitized images raise problems of special strategic significance. Primary among them is the overriding concern with the management of the files themselves; here there needs to be fundamental research on the organization of such files and on the means for retrieval from them. Almost equally important are administrative concerns about access to facilities, resources, and equipment needed to acquire such data and to analyze it -- spectral analyzers, monitors, supercomputers and communication lines to them. Indeed, primary among the supporting equipment are supercomputers, needed because the volume of processing required for use of digitized images exceeds the capacity of even very large mainframes.

In this talk I will try to provide an overview of the technologies, the applications, the librarian's contribution, and the potential barriers with which the librarian must deal. It will follow this frame of reference:

• Imaging technologies

• Sources of digitized images

• Uses of digitized images

• Contributions of the librarians

• The potential barriers

2. IMAGING TECHNOLOGIES

The technologies that are important to the creation, management, and use of digitized images will be discussed in the obvious categories: Hardware and Software.

2.1. Hardware

• Scanners

Scanners are the means for creating digitized images from sources external to the computer itself. At the simplest level, we today have inexpensive page scanners (costing, in the United States, on the order of $200 to $300) that can provide resolutions of 200 to 400 lines per inch. At the more complex level, as I will briefly discuss later, the scanners (such as those contained in satellites and in medical equipment) are exceptionally expensive.

• Storage

Storage of scanned data or of images generated by algorithms is of course necessary for all subsequent processing. A typical 8-1/2" by 11" page, scanned at, say, 300 lines per inch, requires about a megabyte of storage for a pure black and white image. For typical text pages, the data are so redundant (with large areas of white space, for example) that they can be compressed, usually by a factor of ten; for gray-scale or color, however, the storage requirements are substantially greater. Even with compression, though, the means for storage of any significant number of images must have very high capacities. Fortunately, both magnetic and optical means for storage are now readily available with capacities completely consistent with the needs in image storage. CD-ROM optical disks in particular provide capacities between 500 and 1000 megabytes, sufficient to store as many as 10,000 compressed images.
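The storage figures quoted above can be checked with a little arithmetic. The following is a minimal sketch in Python; the page size, resolution, compression factor, and disk capacities come from the text, while the function names and the 8-bits-per-pixel gray-scale figure are assumptions introduced only for illustration.

```python
# Back-of-the-envelope storage estimate for scanned pages.
PAGE_WIDTH_IN = 8.5       # page width in inches (from the text)
PAGE_HEIGHT_IN = 11       # page height in inches (from the text)
DPI = 300                 # "lines per inch" (from the text)
COMPRESSION_FACTOR = 10   # typical for text pages (from the text)


def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of one scanned page, in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


def pages_per_disk(disk_megabytes, page_bytes):
    """How many pages of the given size fit on a disk of the given capacity."""
    return (disk_megabytes * 1_000_000) // page_bytes


# Pure black and white: one bit per pixel.
bw_raw = raw_page_bytes(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, DPI, bits_per_pixel=1)
bw_compressed = bw_raw // COMPRESSION_FACTOR

# Gray-scale at 8 bits per pixel is an assumed figure, not one from the text.
gray_raw = raw_page_bytes(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, DPI, bits_per_pixel=8)

if __name__ == "__main__":
    print("black-and-white page, raw bytes:       ", bw_raw)
    print("black-and-white page, compressed bytes:", bw_compressed)
    print("gray-scale page, raw bytes:            ", gray_raw)
    for capacity in (500, 1000):
        print(capacity, "MB disk holds", pages_per_disk(capacity, bw_compressed),
              "compressed pages")
```

Under these assumptions a black and white page comes to a little over one million bytes, and a disk of 500 to 1000 megabytes holds roughly 4,700 to 9,500 compressed pages, which is consistent with the figure of "as many as 10,000".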

• Processing

Processing of image data requires computing capacities of a high order. Indeed, primary among the rationales for development of supercomputers was the need for that level of computing power. However, we are beginning to see speeds, internal storage capacities, and functional capabilities at the microcomputer level that make it possible to consider use of them for a large range of image processing applications.

• Display

Display of image data requires exceptional resolution, but again the technology has advanced so far and so rapidly that today we have screens that provide a quality approaching that of the printed page, with truly remarkable mixes of color.

• Communication

Finally, communication of digitized images, given the number of bytes required and the frequency with which images may be generated, requires truly exceptional bandwidth. Unfortunately, while the technology is now available to provide that kind of capacity, making it available requires an infrastructure that many developing countries and even some highly industrialized countries do not have. In fact, of all the technologies needed for utilizing digitized images, communication may well be the one which poses the greatest problems, since it requires a national investment, whereas all of the others require only local institutional investment.
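To give a rough sense of why bandwidth matters, the following sketch estimates the ideal time needed to send one scanned page over lines of different speeds. The page sizes follow the storage discussion above (about a megabyte raw, a tenth of that compressed); the three line rates are illustrative assumptions, not figures from this talk, and protocol overhead is ignored.

```python
# Illustrative line rates in bits per second (assumptions, not from the text).
LINE_RATES = {
    "9.6 kbit/s dial-up modem": 9_600,
    "64 kbit/s digital line": 64_000,
    "1.544 Mbit/s leased line": 1_544_000,
}

RAW_PAGE_BYTES = 1_000_000        # "about a megabyte" per black and white page
COMPRESSED_PAGE_BYTES = 100_000   # after compression by a factor of ten


def transmission_seconds(size_bytes, bits_per_second):
    """Ideal transfer time, ignoring protocol overhead and line errors."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    for label, rate in LINE_RATES.items():
        raw = transmission_seconds(RAW_PAGE_BYTES, rate)
        packed = transmission_seconds(COMPRESSED_PAGE_BYTES, rate)
        print(f"{label}: raw {raw:.0f} s, compressed {packed:.0f} s")
```

On the slowest of these assumed lines a single uncompressed page would take on the order of fourteen minutes, and well over a minute even when compressed, which suggests why a collection of thousands of images cannot be shared without substantial communication capacity.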

2.2. Software

Now, let's turn to the software:

• Interfaces

Interfaces are the means by which the user may interact with the computer in the use of digitized images. Increasingly, we have seen the use of GUIs (graphical user interfaces), pioneered by Apple Computer but now embodied in Windows for use with IBM personal computers; GUIs exemplify the role of interfaces at the simplest level. At more complex levels, we see capabilities to manipulate images, combine them, and navigate through them.

• Algorithmic production

Algorithmic production, exemplified by CAD/CAM, is software that provides the means to generate images from specifications.

• Scene analysis

Scene analysis is the process of working with a succession of images that together provide a continuous representation of an object or an event. The classical example, of course, would be a "scene" in a motion picture -- a succession of frames taken from a single camera setting; others might arise in the images generated in a "CAT-scan" of the human body or in the frames in a succession of satellite images. The tasks in scene analysis are, first, to identify when a succession of images is related as a scene and, second, to manipulate the set of images constituting a scene as an entity.

• Spectral analysis

Spectral analysis is a set of processes for deriving characterizing parameters for an individual image; it serves as the starting point for identifying components of the image -- the first stage in pattern recognition.
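The first task in scene analysis noted above, deciding when successive images belong to the same scene, might be sketched as follows. This is only one simple approach, assumed here for illustration: frames are small grids of gray values (0 to 255), consecutive frames are compared by their mean pixel difference, and a new scene is started whenever that difference exceeds a threshold. The function names and the threshold value are hypothetical.

```python
def mean_abs_difference(frame_a, frame_b):
    """Average absolute gray-level difference between two same-sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def split_into_scenes(frames, threshold=50):
    """Group frame indices into scenes; a large jump starts a new scene."""
    if not frames:
        return []
    scenes = [[0]]
    for i in range(1, len(frames)):
        if mean_abs_difference(frames[i - 1], frames[i]) > threshold:
            scenes.append([i])       # abrupt change: begin a new scene
        else:
            scenes[-1].append(i)     # similar to previous frame: same scene
    return scenes


if __name__ == "__main__":
    dark_1 = [[10, 10], [10, 10]]
    dark_2 = [[12, 11], [10, 9]]
    bright_1 = [[200, 210], [205, 199]]
    bright_2 = [[201, 208], [206, 200]]
    print(split_into_scenes([dark_1, dark_2, bright_1, bright_2]))
```

Once the frames are grouped in this way, each group can be handled as a single entity, which is the second task described above.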

• Image enhancement

Image enhancement is an example of pattern recognition, in which areas of the image are examined and modified to bring out details, correct for problems (in exposure, for example), and identify and highlight patterns.
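As one small illustration of correcting a problem in exposure, the sketch below applies a linear contrast stretch to a gray-scale image. This is a common elementary technique chosen here as an assumption; it is not claimed to be the method of any particular enhancement package. The image is represented as a grid of gray values, and the function name is hypothetical.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly rescale gray values so they span the full output range."""
    pixels = [p for row in image for p in row]
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch; return a copy.
        return [row[:] for row in image]
    span = out_max - out_min
    return [
        [out_min + round((p - lo) * span / (hi - lo)) for p in row]
        for row in image
    ]


if __name__ == "__main__":
    # An under-exposed image: all values crowded into a narrow band.
    dull = [[100, 110], [120, 150]]
    print(stretch_contrast(dull))
```

The darkest value in the input is mapped to black and the brightest to white, so details that were crowded into a narrow band of grays become easier to see.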

• Optical character recognition

Optical character recognition is the best-developed capability for pattern recognition, permitting the computer to identify typewritten and printed characters. Today, OCR software is available at prices ranging from a few hundred dollars to $20,000 and more. For a limited set of fonts, there is not much difference in the performance, with even very inexpensive OCR software providing accuracy at the 98% - 99% level. For dealing with a wide range of fonts, though, the more expensive software clearly is necessary.
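The talk does not describe how OCR software works internally. One simple approach, consistent with the later remark that OCR "matches images against standard patterns", is template matching: a scanned character is compared, pixel by pixel, with a stored pattern for each known character, and the closest pattern wins. The sketch below assumes tiny 3-by-3 black and white patterns invented for illustration; real fonts and real OCR products are far more elaborate.

```python
# Hypothetical 3x3 templates; "1" is a black pixel, "0" is white.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def mismatches(glyph, template):
    """Count the pixels at which a scanned glyph differs from a template."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph, templates=TEMPLATES):
    """Return the name of the template that differs least from the glyph."""
    return min(templates, key=lambda name: mismatches(glyph, templates[name]))


if __name__ == "__main__":
    noisy_l = ["100", "110", "111"]   # an "L" with one stray black pixel
    print(recognize(noisy_l))
```

A scheme of this kind works best when the scanned characters closely resemble the stored patterns, which may help explain why a limited set of fonts is so much easier to handle than a wide range of them.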
 

3. SOURCES & USES OF DIGITIZED IMAGES

I turn now to a brief review of the sources (Figure 1) and uses (Figure 2) of digitized images.

3.1. Sources of Digitized Images

At least four categories of digitized image data can be identified, as shown in Figure 1. The first results from conversion of source data to digitized image form; this arises in FAX transmission, in the preservation of brittle books by conversion to optical disk images, in the conversion of motion picture film to optical disks. The second derives from algorithmic production of images, as in computer aided design, architectural design, or cartooning. The third arises from the monitoring or observation of physical processes; this is illustrated by data from satellites scanning the earth and other planets; it includes data from scanning of persons, as in radiology and neurology. The fourth arises from use of digitized image storage for administrative data files.

Figure 1. Sources of Digitized Images

_____________________________________________________________________

Scanning of Source Materials

• Document Pages • Microforms • Motion Picture Films • Slides • Original Art • Video

Algorithmic Production

• CAD/CAM • Cartooning • Architecture

Observation of Experiments

• Physics • Chemistry • Biology

Scanning of Natural Phenomena

• Human Body • Satellites • Geologic


____________________________________________________________________

The first source is of special importance to libraries -- the scanning of source materials, of document pages, of microforms, of motion picture films, of slides, of original art, of video recordings. In a moment, I will comment on the specific importance to libraries, so here let me illustrate with examples from scholarship -- the analysis of digitized images of the Shroud of Turin and the analysis of the painting in the Sistine Chapel.

I have already commented on software for algorithmic production. It arises specifically to provide means for generating images from specifications in either parametric form or image form. Examples include CAD/CAM (computer aided engineering design and manufacturing), architectural design, and cartooning. Once a digitized image has been created in this way, it may well be stored for later use or for manipulation independent of the source program.

The generation of images from the observation of experiments has become a crucial means for presenting otherwise overwhelming amounts of data in compact, visual form -- observation of collisions of high-energy particles in an accelerator, of the progress of a chemical reaction, of a succession of functions in an experimental animal.

The scanning of natural phenomena -- of the human body, of the earth and the other planets by satellites, of geological structures by a variety of means for observation -- is the set of sources that can generate images at truly mind-boggling rates. They differ from the observation of experiments in the fact that the things being observed are not being controlled, so the range of phenomena observed is dramatically greater, and in the fact that they can occur continuously.

3.2. Uses of Digitized Images

Most important among the uses (Figure 2), of course, are those reflecting the needs that led to creation of them in the first place -- the needs in engineering, architecture, cartooning; the needs in experiment; the needs in health, geological exploration, land-use planning, agriculture. Closely related to them are the potential uses in research and education -- as aids to developing concepts, as means for comparison and analysis of phenomena.

Figure 2. The Uses of Digitized Images

__________________________________________

Uses Specific to Source Objectives

• Algorithmic Production

• Experiment

• Natural Phenomena

Uses for Research & Education

• Conceptualization

• Comparison

• Analysis

Library Related Uses

• Preservation

• Document Delivery

• On-Demand Publication

• Convert Records using OCR

Administrative Uses

• FAX Communication

• Manage Administrative Files

• Convert Records using OCR


__________________________________________

Libraries have specific needs that may well best be met by the use of digitized images. The needs in preservation, document delivery, and on-demand publication are all of this kind; indeed, document delivery, using FAX, may well require it. It is also important to recognize the role of digitized images in the conversion of operating records, such as catalogs and administrative files, to computer-processible form.

In fact, the administrative applications of digitized images may well become the most widespread in our society. Scanned administrative documents can be stored in electronic form, with automated work assignment and scheduling, automated work-flow control, and automated indexing and file searching. The benefits arise from increased staff productivity, reduced costs for paper handling, improved control, and better response to needs.

4. THE LIBRARIAN'S CONTRIBUTION

What then is the significance of all of this to the librarian? Aren't these all of concern only to those who generate and use this form of data? Don't they all depend upon technical knowledge of the field of application and of the technologies? In my view, the answer to those questions is that the librarian has many contributions to make and should play an increasingly important role in the management of digitized image files in ways parallel to those involved in management of print files and data bases. Indeed, my objective in this talk is to identify those contributions.

4.1. Mediation, Consultation, Training

• Management of personal files of images

• Use of hardware and software

• Accessibility of image data files

• Means for conversion of image formats

The first contribution is in mediation, consultation, and training. While researchers, students, engineers, and executives may know their own needs in use of any information resource, they frequently will need help in the management of personal files, and in the use of hardware and software. The librarian can serve as a consultant in such cases, bringing to bear experience as well as technical expertise. Of special importance, though, is providing assistance in gaining access to available image data files that a given user may need and in conversion of those data to the form needed.

4.2. Selection & Acquisition

• Identification of sources

• National bibliography of digitized images

• Selection criteria

• Issues of intellectual property rights

• Costs & funding

Perhaps the most important contribution is directly related to gaining access to materials. Selecting and acquiring information materials is the fundamental role of the librarian. To fill that role, the librarian needs to know the available sources. Perhaps we need the creation of national or even international bibliographies of available digitized images. We surely will need counterparts for selection criteria. And we will face problems with respect to costs, sources of funding, and rights in use.

4.3. Cataloging

• MARC formats

• Sharing of catalog data

The other traditional function of the librarian is cataloging -- providing the means for identifying and controlling holdings. Here we surely need to have MARC formats adequate to the requirements and the means for sharing of cataloging data so as to avoid unnecessary duplication.

4.4. Collection Management

• Access allocation decisions

• Physical organization

• Storage

• Preservation

As I have pointed out, digitized image files raise the spectre of files of truly awesome size -- sizes that make even the largest libraries of the world appear small. As a result there are even more complex problems in determining where the files will be stored, how they will be organized, and how they will be preserved.

4.5. Content Indexing & Abstracting

• Identification of image content elements

• Thesaurus of image "icons"

• Individual frame analysis

• Scene analysis

• Quantitative parameters

• Retrieval query structures

Perhaps the greatest intellectual challenge arises from the needs in "content indexing and abstracting". How does one retrieve images that contain something desired? However interesting the problems in "full-text retrieval" may have been, they are dull compared to those in image retrieval. Lest I appear to be posing an irrational challenge, please consider the means by which OCR software matches images against standard patterns; consider the means available for identifying a scene, with the potential for use of one frame from a scene as a retrieval surrogate -- an "abstract"; and consider the possibility of using characterizing quantitative parameters as means for retrieval. In other words, the tools are already here.
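The last of these possibilities, retrieval by characterizing quantitative parameters, can be illustrated with a very small sketch. It assumes, purely for illustration, that each image is a grid of gray values and that two parameters (mean gray level and the range between darkest and brightest pixel) are enough to characterize it; the parameter choice, the function names, and the sample collection are all hypothetical.

```python
def characterize(image):
    """Reduce an image to two illustrative parameters: (mean level, contrast)."""
    pixels = [p for row in image for p in row]
    mean_level = sum(pixels) / len(pixels)
    contrast = max(pixels) - min(pixels)
    return (mean_level, contrast)


def rank_by_similarity(query_image, collection):
    """Return image names ordered from most to least similar to the query."""
    query_mean, query_contrast = characterize(query_image)

    def distance(item):
        _name, image = item
        mean_level, contrast = characterize(image)
        return abs(mean_level - query_mean) + abs(contrast - query_contrast)

    return [name for name, _image in sorted(collection.items(), key=distance)]


if __name__ == "__main__":
    collection = {
        "night scene": [[10, 20], [15, 5]],
        "noon scene": [[200, 250], [220, 240]],
    }
    query = [[12, 18], [14, 8]]
    print(rank_by_similarity(query, collection))
```

In practice the parameters would be stored in the catalog alongside the image, so that a query could be answered by comparing parameters alone, without examining every image in the collection.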

4.6. Document Delivery

• Source forms and formats

• Conversion processes

• Communication processes

• Delivery processes

When we consider document delivery, we are directly in the domain of the librarian. It is clear that digitized images will play an increasingly important role in this area.

4.7. Sharing of Digitized Image Resources

• Need for an international inventory

• Means for access & delivery

• Forms & formats for digitized images

In that context, then, the sharing of digitized image resources will be a vital component. This further emphasizes the importance of a national and even international inventory to serve as the basis for such sharing.
 
 

5. BARRIERS

There are a number of barriers to the implementation of means for meeting these needs. I will discuss them in five categories:

• Generic barriers

• Technological barriers

• Governmental barriers

• Individual barriers

• Library barriers

Generic Barriers - The most significant generic barrier is uncertainty when the technology is rapidly changing and when there is such a diversity of sources for the digitized image files. The result almost certainly is a "wait and see" attitude. Of course, economics is perhaps the most evident generic barrier.

Technological Barriers - In the developing countries, the lack of an adequate infrastructure -- communications, logistical support, availability of consultation services -- is an almost insurmountable barrier. Failure to conform to standards or, in some cases, a lack of them is a technological barrier. Failure to meet stated goals in technological development, with respect to both time of delivery and functional capabilities, has been a continuing problem. Clearly it sets a barrier to future development that depends upon availability of a technology.

Governmental Barriers - Import restrictions, taxation policies, bureaucratic procedures -- each of these is an example of a governmental barrier. Many of these are subject to the winds of politics, but others -- such as import restrictions -- are built into not only governmental policy but national economic structures.

Individual Barriers - A critical barrier is the differential use made of any kind of information resource. Some faculty, recognizing the value of information to their research and teaching, will be heavy users of libraries, computers, and other kinds of information sources; other faculty will make no use of them, depending solely on their own work. Inertia will always be a barrier for the individual. The resource may be there, but it takes effort to use it, and for many that effort will not be made. The technologies are complicated to use, and require knowledge and facility in use. That facility, even once gained, is easy to lose unless there is continual use; for most faculty, usage will be sporadic, with the result that the technology must be repeatedly re-learned.

Library Barriers - The development of expertise in the management and use of these technologies is a most significant barrier, though it is one that the profession can deal with. We need to develop components of our professional curricula and of continuing education programs that will provide the librarian with the skills required to make the contributions I have identified. Of all of the barriers, this is the one most easily removed, provided we recognize the need to move forward.

6. CONCLUSION

In conclusion, I will simply state that digitized images are a medium of communication of information that is now of critical importance to research, scholarship, education, engineering, industrial management, and governmental operations; as time goes on it will be of increasing importance to users in these fields. They should see the library as a means for meeting their needs in access to and use of this medium, and the librarian should be ready and able to provide assistance to them. That means that library education should begin now to provide the technical tools that librarians will need to fulfill their obligations.