APPLYING VISUALIZATION TO INFORMATION RETRIEVAL

Scott C. Deerwester

Department of Computer Science
Hong Kong University of Science & Technology
Clearwater Bay, Kowloon, Hong Kong
E-Mail: SCOTT@UXMAIL.UST.HK

The value of graphics for displaying statistical data is now widely accepted, both for exploratory data analysis and for the effective communication of its results \cite {XXXXX}. Such transformation of data into images is called {XXXXX}. A considerable body of experience and supporting research on visualization has grown over the past several years, along with insight into how best to apply these techniques \cite {XXXXX}.

People understand pictures differently from, and more quickly than, words. Visualization techniques exploit the ability of the human visual system to recognize spatial structure and patterns \cite {XXXXX}. They have been used particularly effectively in scientific and engineering computation \cite {XXXXX}.

Although there have been some efforts to integrate visualization into IR, much of this effort has focused on one of two things:

• Representation of each document (or other retrievable object) as a point, with much of the effort devoted to placing a large number of points in a projection of a 3D space so as to indicate the relationships between them.

• Semiotic representation of retrievable objects, either with an icon or some other pictorial representation, generally with a more limited number of objects on the screen.

The present work has a somewhat different basis. Here, individual objects are represented entirely abstractly, with particular visual attributes mapped to particular associative attributes. Objects are placed in a visual space according to other associative attributes.

In a sense, whereas previous approaches have relied on human visual processing either to understand the associative structure of a set of objects as a whole, or to understand individual objects, the present approach seeks to accomplish both in one context.

Many of these ideas were implemented in a prototype information visualization system called {\em Marbles}, which is a graphical spatial browser that runs under NeXTstep on the NeXT workstation. {\em Marbles} represents objects as colored marbles on the screen, and incorporates features for displaying and interacting with a three-dimensional information space, as described elsewhere in this paper.

The use of visualization as a means of interrogating collections of information was first explored in M.I.T.'s Spatial Data Management System \cite {XXXXX}. SDMS explored two aspects of spatially presented data: proximity as a metaphor for similarity, and human spatial memory. A particularly successful application of SDMS was a database of naval ships. In this application, ships were displayed on a world map. Ships were represented with icons based on ship type. All information about ships was presented textually.

The use of visualization in Bellcore's Telesophy project was somewhat more sophisticated, and closer in spirit to the present work. Telesophy relied on database browsing with a graphical representation of the database called an {\em information space} (a term coined in connection with SDMS) as the primary means of retrieval. Objects were placed in a space, with relative position indicating relatedness, but were otherwise distinguished from each other only by textual labels.

Other visual approaches to IR have focused primarily on indicating clusters or groups of documents. Information retrieval textbooks abound with such usage, although normally as explanatory examples. See, for example, \cite{XXXXX} and \cite{XXXXX} for usage of visualization for illustrative purposes, and \cite {XXXXX} for an example of a working prototype system.

One way of viewing an information retrieval system is as a computer program that allows its users to explore the relationship between two sets: one set of associative attributes and another set of retrievable objects. In a traditional keyword-based retrieval system, the associative attributes are keywords and the retrievable objects are documents. We should note two things about such systems. First, the set of associative attributes is very large. Second, the "resolution" of such attributes is very small, i.e. a given attribute doesn't have a very wide range of usefully distinguishable values for a specific document. In many systems, in fact, a keyword is only counted as being present or not present in a document. The resolution of the keyword in such cases is a single bit.

Here, we wish to consider a third set: a set of {\em visual} attributes. Our eventual goal is to link each of a set of associative attributes to a particular visual attribute. In the following section we discuss particular visual attributes, after which we discuss strategies for creating much smaller sets of associative attributes with much higher resolution.

Our goal is to render individual objects in such a way that similar objects look similar. Some work has been done on discovering how effective people are at evaluating similarity with respect to particular visual attributes \cite{XXXXX}, but to date a very small subset of the range of possible visual attributes has been explored.

The use of color is the obvious first possibility. Colors can be considered as a combination of either hue, saturation and brightness or red, green and blue. People seem to be able to see similarity with respect to each of these three independently, suggesting that color can be used to represent three associative attributes.

As we discuss later, however, brightness is particularly appealing for use in representing relevance for retrieval. Saturation -- the "intensity" of the color -- seems natural as a visual metaphor for "amount of associative information", i.e. the more intense the color, the more the color "means." This suggests the need for further research to determine the most effective mappings of color to associative (and other) attributes for information retrieval.
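One possible mapping of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `marble_color`, the choice of which attribute drives hue, and the normalization of every attribute to the range [0, 1] are hypothetical and are not taken from any existing system.

```python
import colorsys

def marble_color(hue_attr, information, relevance):
    """Map three attribute values in [0, 1] to an 8-bit (R, G, B) triple.

    hue_attr    -> hue        (a hypothetical associative attribute)
    information -> saturation ("amount of associative information")
    relevance   -> brightness (relevance to the current query)
    """
    def clamp(v):
        # Keep out-of-range inputs inside the valid HSV interval.
        return max(0.0, min(1.0, v))

    r, g, b = colorsys.hsv_to_rgb(clamp(hue_attr), clamp(information), clamp(relevance))
    return tuple(round(c * 255) for c in (r, g, b))

# A fully saturated, fully bright object at hue 0 is pure red.
print(marble_color(0.0, 1.0, 1.0))
# An irrelevant object (brightness 0) is black whatever its hue.
print(marble_color(0.3, 0.8, 0.0))
```

Keeping hue, saturation and brightness as separate inputs reflects the observation above that people seem able to judge similarity along each of the three independently.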

Other attributes related to color include reflectance and opacity. Opaque, highly reflective objects look metallic; translucent reflective objects look like colored glass; slightly reflective, opaque objects look like plastic, etc.

We may also consider the structure of the objects as holding information. We could consider the regularity of the object at at least three levels.

• On a very fine-grained scale, the regularity of an object is visually interpreted as texture.

The use of fractals is an obvious way to construct images with varying degrees of regularity at different scales, especially using interactive fractal-generating techniques that construct features at progressively finer scale. See, for example, the {XXXXX MidPointFM2D} algorithm in \cite [page 100] {XXXXX}. Fractal images have the additional feature of a "natural" appearance. Robertson suggests that using such images maximizes the ability of humans to understand relationships between them.

Finally, the size of the graphic object should also represent the size of the retrievable object. Graphic objects are interpreted as three-dimensional, raising the question of whether to construct an object whose apparent volume, surface area or radius is proportional to the size of the retrievable object (where the latter is measured in number of words or some equivalent metric). Empirical evidence suggests that when people are asked to choose an object on a computer screen that is "twice as big" as some other object (or bigger by some other small factor), they consistently estimate size by apparent area rather than volume or radius, for objects whose shape is very roughly spherical. Thus, the radius of the graphic objects should be proportional to the square root of the size of the retrievable objects.
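The square-root rule follows because apparent area grows with the square of the radius. A minimal sketch, assuming size is measured in words and using a hypothetical `scale` constant to convert to screen units:

```python
import math

def marble_radius(size_in_words, scale=1.0):
    """Radius whose apparent (disc) area is proportional to document size."""
    if size_in_words < 0:
        raise ValueError("size must be non-negative")
    return scale * math.sqrt(size_in_words)

small = marble_radius(100)   # 10.0
large = marble_radius(400)   # 20.0

# A document four times as long gets twice the radius,
# and therefore four times the apparent area.
area_ratio = (math.pi * large ** 2) / (math.pi * small ** 2)
print(small, large, area_ratio)
```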

Although we might hope to extend the above set of visual attributes, it seems unlikely the number of independent visual attributes available to us will exceed fifty -- although combinations of visual attributes certainly could, if we include particular shapes. The primary problem is thus to reduce the set of associative attributes to a smaller set. There are two options for doing so:

• We can reduce the entire set of associative attributes, replacing them with a different set of "better" associative attributes. This is called global reduction.

• In contrast, for objects that are similar in several attributes, but different in others, we can ignore those attributes that are shared, and consider only the most important distinguishing attributes. This form of reduction is called local reduction.

There are several available methods that accomplish global reduction. Two worthy of special consideration are multidimensional scaling and clustering. In the case of multidimensional scaling, objects are placed into a space of a specified dimensionality, and then moved until all objects are maximally close to those objects that they most "resemble" (in the sense of sharing associative attributes), and furthest from those from which they differ. Clustering is a similar process, by which documents are grouped according to their similarity, again in terms of shared associative attributes. In some clustering applications, documents may be in more than one cluster. Both approaches accomplish global reduction by replacing a set of keywords with a set of other attributes -- vector components in the case of multidimensional scaling, and cluster identifiers in the case of clustering.
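The "place, then move" description of multidimensional scaling can be illustrated with a small gradient-descent sketch. It is one simple variant (minimizing squared differences between layout distances and target distances), not necessarily the algorithm used in any system discussed here. The function names, step size, iteration count and the four-object example are all assumptions chosen for illustration.

```python
import math

def stress(positions, target):
    """Sum of squared differences between layout and target distances."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(positions[i], positions[j]) - target[i][j]) ** 2
    return total

def mds_layout(positions, target, step=0.05, iterations=200):
    """Move points downhill on the stress function and return the new layout."""
    pts = [list(p) for p in positions]
    n, dims = len(pts), len(pts[0])
    for _ in range(iterations):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(pts[i], pts[j]) or 1e-9  # avoid division by zero
                coeff = 2.0 * (d - target[i][j]) / d
                for k in range(dims):
                    grads[i][k] += coeff * (pts[i][k] - pts[j][k])
        # Apply all moves together so every gradient uses the same snapshot.
        for i in range(n):
            for k in range(dims):
                pts[i][k] -= step * grads[i][k]
    return [tuple(p) for p in pts]

# Objects 0 and 1 resemble each other, as do 2 and 3; the pairs differ.
target = [
    [0.0, 0.2, 1.0, 1.0],
    [0.2, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.2],
    [1.0, 1.0, 0.2, 0.0],
]
start = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
layout = mds_layout(start, target)
print(stress(start, target), stress(layout, target))
```

After the iterations, each similar pair sits close together while the two pairs remain apart, which is the reduction of many shared keywords to a few coordinates that the paragraph describes.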

A more recent approach is latent semantic indexing, which applies a technique called factor analysis to construct a set of artificial associative attributes called {\em factors}. The factors are uncorrelated, and each successive factor captures the maximum amount of covariance of a set of keywords and documents not explained by preceding factors \cite {XXXXX}. The set of attributes generated by latent semantic indexing is guaranteed to be the smallest number with the greatest resolution.

Recall that our goal is that objects that are similar must look similar. The converse of this is that objects that are different should look different. This is a very strong criterion. To give an example, if two objects are completely different, then they may not share a value in {\em any} visual attribute. They could not, for example, both have the same amount of red, if we use red, green and blue as attributes. In fact, a much weaker condition is satisfactory if we recognize that only objects that will ever be compared with one another need to differ. To give a completely non-technical example, if two people attend a party wearing the same clothes, it is a cause for comment. On the other hand, it is utterly irrelevant if one person in Boston and another in New York wear the same clothing at the same time.

Thus, to further reduce the number of associative attributes, we are entirely free to apply any of the {\em global} reduction techniques -- multidimensional scaling, in particular, because it is computationally relatively inexpensive -- to only the objects in a particular local region. This requires that we be able to identify such regions, a subject that we address in the following section.

People can see three dimensions, but a computer screen is two-dimensional. If we place objects in a three-dimensional space -- i.e. if we map three associative attributes to three spatial coordinates -- then we must {\em project} this space onto the available dimensions. The essential approach is to use two spatial dimensions plus motion to represent three dimensions.

Given that a computer screen has only two dimensions, we must use motion to represent a third dimension. This necessity is potentially the most costly in terms of computation. One can render a large number of objects in advance, but moving them -- especially if movement requires re-rendering them -- can be very costly.

The most natural form of motion, from a human point of view, would be achieved by projection of three dimensions to two, with full perspective, from a particular viewpoint within the three-space. This is also extremely costly, since objects must (technically) be redrawn every time the viewpoint moves. Work with the author's {\em Marbles} program and related work suggests other options that are computationally much more efficient, and which still offer a convincing representation of a three-space.

The {\em Marbles} program places spheres within a certain radius of the center of a three-space, and accomplishes movement by simple rotation, up to ten times per second. The algorithm re-renders the whole space at each rotation, by computing a screen location that corresponds to the three-dimensional location of each sphere, where the whole space has rotated by a small angle $\Delta \theta$ and the space is tilted with respect to the viewer by an angle $\phi$ which ranges from -45 to 45 degrees. The terms of the projection that depend only on these angles must be computed only once per rotation, so there is a minimum of floating point computation. The spheres are placed at the location $(x_1, y_1)$ on the screen, in order of increasing $z_1$, i.e. from "back" to "front". This makes it unnecessary to do any hidden surface removal, since rendering each sphere covers up the portions of any sphere behind it that it hides.
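The projection formula itself is not reproduced here. A minimal sketch of one projection consistent with the description might look like the following. It assumes rotation about the vertical axis by an angle `theta`, a tilt about the horizontal screen axis by `phi`, and an orthographic view with no perspective. The axis conventions and the names `project` and `back_to_front` are assumptions, not the {\em Marbles} implementation.

```python
import math

def project(points, theta, phi):
    """Rotate (x, y, z) points by theta about the vertical axis, tilt by phi,
    and return (x1, y1, z1): screen position plus depth."""
    # The trigonometric terms depend only on the angles, so they are
    # computed once per frame rather than once per sphere.
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    projected = []
    for x, y, z in points:
        xr = x * cos_t + z * sin_t        # rotation about the vertical axis
        zr = -x * sin_t + z * cos_t
        y1 = y * cos_p - zr * sin_p       # tilt toward or away from the viewer
        z1 = y * sin_p + zr * cos_p
        projected.append((xr, y1, z1))
    return projected

def back_to_front(projected):
    """Order spheres by increasing depth so nearer ones are drawn last."""
    return sorted(projected, key=lambda p: p[2])

spheres = [(1.0, 0.0, 0.0), (0.0, 0.5, 1.0), (-1.0, -0.5, 0.0)]
frame = back_to_front(project(spheres, math.radians(10), math.radians(30)))
for x1, y1, z1 in frame:
    print(round(x1, 3), round(y1, 3), round(z1, 3))
```

Drawing in this order is the painter's algorithm: each sphere simply overwrites whatever lies behind it, which is why no separate hidden surface removal is needed.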

Given this performance with no special optimization, it should be possible to achieve update speeds on a standard workstation that appear smooth (i.e. 30 updates per second).

Something like full perspective can be gained by limiting movement to a particular plane, and simply distorting the objects, rather than using full perspective. Effectively, the object is "sliced" into vertical strips, each of which can be pixel-mapped onto the display with an appropriate amount of enlargement or reduction. Astonishingly convincing renditions of a three-space have been achieved using this technique in computer games using images with resolution as low as 64$\times$64 pixels. Wolfenstein 3D is an excellent example of such a program. Similar techniques might be adapted for use in information visualization.

Users should not have to look at more items on the screen than they can comfortably handle. Our own evidence suggests that this limit is in the vicinity of 100 objects. Further, even if people were comfortable with seeing 100,000 objects at once, we could devote at most a few pixels to each object if we did so, drastically limiting the amount of information available to the user about each object. We must therefore have some reasonable way to limit the number of visible objects without making it hard to understand how the visible objects fit into the associative structure -- i.e. how they relate to all of the objects as a whole, and to objects that may be similar, but are not visible.

In the real world, objects that are far enough away are not visible. This can be for two reasons:

• If items must be "illuminated" to be seen, then objects that are sufficiently far from a light source are too dark.

• Items that are far enough away so that their apparent size to the viewer is sufficiently small cannot be seen.

Either can be applied here, but the second possibility depends on perspective, which is costly, and the first has the unpleasant connotation of hunting around in the dark with a flashlight. Nonetheless, applying thresholds to objects that are far from the viewer is a useful technique, especially when used with other techniques.

The most appealing means of hiding objects corresponds nicely to the real world. If objects can be grouped into large groups, then only the groups need be visible to the user at the outset. Further, nothing precludes placing a given object in more than one group. This method corresponds nicely to the hierarchy embodied in a library, a floor of the library, a set of bookshelves, a particular shelf, a book on the shelf, etc.

This is particularly appealing since it reduces the number of objects that must be compared at once, thereby making it possible to apply local reduction to the attributes associated with these objects prior to rendering them visually. Further, it opens the possibility of different sorts of reduction at different levels, perhaps applying latent semantic indexing to the set as a whole, then clustering based on the factors, then multidimensional scaling within clusters, etc.

Objects that are associated should be "close". Identical objects should be adjacent. How far away should unrelated objects be? In a sense, the answer is that it doesn't matter, as long as they are "far enough". Although people are very good at reasoning in terms of small, comparable distances, they are very bad at reasoning in terms of very large distances. The problem of placing points at locations is aggravated by the fact that we begin, not with distance, but with similarities, which may be thought of as ranging from zero to one. We cannot simply set the distance between two points to the inverse of the similarity, because no two points can be infinitely far away. Rather, a similarity of zero (which is the most common value in most collections of objects) means "far".
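To make the difficulty concrete, here is a sketch of one bounded mapping from similarity to distance. The capped-inverse form, the function name `similarity_to_distance` and the cutoff `far` are illustrative assumptions. The text does not say which mappings were actually tried.

```python
def similarity_to_distance(similarity, far=10.0):
    """Map a similarity in [0, 1] to a finite distance.

    Identical objects (similarity 1) get distance 0. Unrelated objects
    (similarity 0) get the finite distance `far` instead of infinity.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    if similarity == 0.0:
        return far
    return min(far, 1.0 / similarity - 1.0)

for s in (1.0, 0.5, 0.1, 0.05, 0.0):
    print(s, similarity_to_distance(s))
```

Every similarity below about 0.09 collapses to the same value `far`, which captures the idea that unrelated objects need only be "far enough". It also shows why such distances are not comparable with one another.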

The real problem is that not all distances are comparable, but most algorithms for constructing a space treat them as though they were. Experiments with different mappings of similarity to distance, using {\em Marbles}, confirm that the problem of comparable distance must be considered within the algorithm for placing objects, not simply while constructing input for the algorithm.

How can such a system be used for retrieval? Work with {\em Marbles} suggests that "lighting up" objects with a brightness proportional to the similarity between the object and a query is very effective. People are able to spot relevant objects out of more than 100 other objects in a fraction of a second. The most appealing possibility for implementing retrieval is suggested in \ref {hier} -- by presenting a set of objects as a nested hierarchy, an object at the top level corresponds to a significant fraction of the whole collection. Selecting this object with a mouse is interpreted by the system as a request to reinterpret the query for the objects "within" this top level aggregate. Visually, the user clicks on an object and "goes inside of it", with appropriate visual cues (zooming, etc.) to indicate that this is what is happening.
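The "lighting up" step can be sketched as follows. The similarity measure is not specified in the text, so cosine similarity between attribute vectors is used here purely as an assumption, and the names `cosine_similarity` and `brightness` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two attribute vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def brightness(query, obj):
    """Brightness in [0, 1], proportional to query-object similarity."""
    return max(0.0, min(1.0, cosine_similarity(query, obj)))

query = [1.0, 1.0, 0.0]
objects = {"a": [1.0, 1.0, 0.0], "b": [1.0, 0.0, 0.0], "c": [0.0, 0.0, 1.0]}
for name, vector in objects.items():
    print(name, round(brightness(query, vector), 3))
```

The resulting value could be fed directly into the brightness channel of whatever color model is used for rendering.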

An analysis of several large collections of information\footnote {The HK University of Science and Technology Library catalog and three months of Dow Jones Newswire articles were used as example collections.} showed that three levels of hierarchy should be adequate for any collection with about one million or fewer objects.
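The arithmetic behind this estimate is simple if each view is limited to roughly 100 objects, the comfortable limit suggested earlier: three nested levels can then address 100 x 100 x 100 = 1,000,000 objects. A small sketch, with the hypothetical function name `levels_needed` and a perfectly balanced hierarchy assumed:

```python
def levels_needed(collection_size, per_view=100):
    """Smallest number of nested levels so that per_view ** levels
    covers the whole collection."""
    if collection_size < 1 or per_view < 2:
        raise ValueError("need at least one object and two items per view")
    levels, capacity = 1, per_view
    while capacity < collection_size:
        capacity *= per_view
        levels += 1
    return levels

for size in (100, 10_000, 1_000_000, 1_000_001):
    print(size, levels_needed(size))
```

Real collections are unlikely to divide this evenly, so the result is a lower bound rather than a design rule.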

One problem with a completely abstract representation of objects is exactly that it {\em is} abstract -- there is no way to tell from just looking at such a graphic object which retrievable object it represents. Several techniques for solving this problem are suggested.

Systems for displaying pictorial representations of molecules based on chemical formulae employ a simple technique that we might borrow. Many such systems simply associate a label, or tag, with each object (atoms, in this case), and give the users an easy way to turn labels on and off.

Similar to the above, but more selective and interactive, the system can provide a "magnifying glass" that users can move back and forth over the visual display, through which they can read text associated with each object. The Apple Macintosh operating system (release 7.0) provides a similar sort of function in the form of "balloon help", wherein a cartoon-style conversation balloon appears wherever the user moves the mouse cursor, explaining whatever the user is pointing at.

Finally, there is no particular reason to insist that the {\em only} means of presenting a set of objects be visual. Nothing precludes the possibility of allowing a user to ask for a textual list of the objects on the screen, perhaps even with a very small picture of the objects next to the text.

Clicking the mouse on an object that represents an individual document results in opening a window with the text of that document.

Visualization applied to information retrieval opens some exciting possibilities, and also raises new research issues. Early experience with a very abstract information visualization system suggests that it is a promising approach, enabling users to understand and use the system's responses to their queries both well and quickly.

The use of visualization in this way depends very strongly on a reductional approach to information retrieval to reduce the number of associative attributes. This need for reduction, as well as limitations on the amount of screen space and limits on how many objects humans can comfortably handle at once, suggests a visual hierarchy of subspaces with on the order of 100 items in each subspace, as an appropriate way of structuring a large collection.

This work has served primarily to reveal some of the research questions that must be addressed before this technique can be applied in a full-scale retrieval system. Some of the issues raised here are:

• There is a need for a taxonomy of visual attributes, and a methodology for combining them.

• We do not know how to use color most effectively to help users understand associative structure and relevance to queries. Specifically, we currently have no basis to choose between a hue, saturation and brightness model, with saturation and brightness representing non-associative attributes (e.g. how much the system knows, and how relevant the item is), a red, green, blue model, with each color dimension representing a different associative attribute, or some other mapping altogether of color to attributes.

• The need to explore fractal imagery as an information visualization technique is clearly indicated by our early experience.

• Even though a hierarchical structure is suggested on several bases, we do not have any real methodology or experience that suggests how to combine the various available structuring methods to construct such a hierarchy.

• Rotation without perspective seems to be an appropriate means of mapping three dimensions to two dimensions plus motion. However, limited perspective projections allow us to put motion under the control of the user, and have been used effectively elsewhere. This suggests the need for research exploring this and other techniques that provide an effective illusion of three dimensions.

A final issue, which is more a matter of development than of research, is how to incorporate this style of interface into a real system, so that the visualization helps the user find and understand things without getting in the user's way when a textual presentation might be more appropriate. We hope that more experience will result in a better design.

In conclusion, applying very abstract visualization to the problem of finding information offers hope of a retrieval system that passes Alan Kay's test: "If a kid can understand and use it, its design principles are probably sound enough that an adult will, too."

Beck, R.N. (1991). "Imaging science in the 1990s," In Proceedings of the SPIE - The International Society for Optical Engineering, 1396: 688-695.

Biggerstaff, T.; Matsumura, K.; Prieto-Diaz, R. & Schaefer, W. (May 1991). "Software Reuse: Is it Delivering?" In Proceedings of 13th International Conference on Software Engineering. Austin, TX: IEEE Comput. Soc. Press.

Brand, S. (1988). The Media Lab: Inventing the Future at M.I.T. Penguin Books.

Campbell, M.K. (October 1991). "Visualization: Picturing what we want to know and understand,"

Caplinger, M. (Summer-Fall, 1986). "Graphical Database Browsing," SIGOIS Bulletin, 7 (2-3): 113-121.

Chang, S.K. (March 1990). "Visual Reasoning for Information Retrieval from Very Large

Chang, S.K. (October 1990). "A Visual Language Compiler for Information Retrieval by Visual Reasoning," IEEE Transactions on Software Engineering, 16 (10): 1136--1149.

Crouch, D.B. (1986). "The Visual Display of Information in an Information Retrieval Environment," in Proceedings of ACM Conference on Research and Development in Information Retrieval,

Deerwester, S.C.; Dumais, S.T.; Furnas, G.W.; Landauer, T.K. & Harshman, R.A. (September 1990). "Indexing by Latent Structure Analysis," Journal of the American Society for Information Science, 41 (6): 391--407.

Deerwester, S.C.; Dumais, S.T.; Landauer, T.K.; Furnas, G.W. & Beck, L. (October 1988). "Improving Information Retrieval with Latent Semantic Indexing," In Proceedings of the ASIS Annual Meeting.

Feiner, S. & Beshers, C. (March 1990). "Visualizing N-Dimensional Virtual Worlds with N-Vision," Computer Graphics, 23 (2): 37-38.

Fenton, N.E. (1991). Software Metrics: A Rigorous Approach. Chapman and Hall, 1991.

Fraser, S.D. & Duran, J.M. (1989). "Software Indexing for Reuse," In Proceedings of IEEE International Conference on Systems, Man and Cybernetics. Cambridge, MA, 1989.

Furnas, G.W.; Landauer, T.K.; Gomez, L.M. & Dumais, S.T. (November 1987). "The Vocabulary Problem in Human-System Communication," Communications of the ACM, 30 (11): 964-971.

Helm, R. & Maarek, Y. (1991). "Integrating Information Retrieval and Domain Specific Approaches for Browsing and Retrieval in Object-Oriented Class Libraries," In Object-Oriented Programming: Systems, Languages, and Applications. pp. 47-61.

Henninger, S. (1991). "Retrieving Software Objects in an Example-Based Programming

Khoshgoftaar, T.M. & Munson, J.C. (February 1990). "Predicting Software Development Errors Using Software Complexity Metrics," IEEE Journal on Selected Areas in Communications, 8 (2): 253-261.

Korfhage, R. (1991). "To See, or Not to See--Is That the Query?" In Proceedings of the 14th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, 1991. pp. 134-141.

Limoges, S.; Ware, C. & Knights, W. (1989). "Displaying Correlations Using Position, Motion, Point Size or Point Colour," In Graphics Interface '89. London, Ontario, Canada, 1989.

Lohse, J.; Rueter, H.; Biolsi, K. & Walker, N. (October 1990). "Classifying Visual Knowledge Representations: A Foundation for Visualization Research," In Proceedings of the First IEEE Conference on Visualization. San Francisco, CA: IEEE Computer Society Press, 1990.

Maarek, Y.; Berry, D.M. & Kaiser, G.E. (August 1991). "An Information Retrieval Approach for Automatically Constructing Software Libraries," IEEE Transactions on Software Engineering, 17 (8): 800-813.

Martin, L. & Payet, J. (1987). "Factor Analysis and Classification Methods: An Application to the Study of Corporate Custom," European Journal of Operational Research, 30: 68-76.

Meyer, B. (1991). "Tools for the New Culture: Lessons from the Design of the Eiffel Class Libraries," Communications of the ACM, 33 (9): 69-86.

Munson, J.C. & Khoshgoftaar, T.M. (1982). "An Interactive Query Language for External Data Bases," In Proceedings of the 8th International Conference on Very Large Data Bases, Mexico City, September 1982.

Munson, J.C. & Khoshgoftaar, T.M. (May 1989). "The Dimensionality of Program Complexity," In Proceedings of the 11th International Conference on Software Engineering. Pittsburgh, PA: IEEE Computer Society Press, 1989.

Nielson, G.M. (September 1991). "Visualization in Scientific and Engineering Computation,"

Peitgen, H.-O. & Saupe, D.S., eds. (1988). The Science of Fractal Images. New York: Springer-Verlag, 1988.

Perry, D.E. & Kaiser, G.E. (January 1990). "Adequate Testing and Object-Oriented Programming," Journal of Object-Oriented Programming, 2 (5): 13-19.

Prieto-Diaz, R. (1990). "Implementing Faceted Classification for Software Reuse," IEEE, pp. 300-304.

Robertson, P.K. (May 1991). "A Methodology for Choosing Data Representations," IEEE Computer Graphics and Applications, 11 (3): 56-67.

Salton, G. & McGill, M.J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill computer science series.

Tarumi, H.; Agusa, K. & Ohno, Y. (April 1988). "A Programming Environment Supporting Reuse of Object-Oriented Software," In Proceedings of the 10th International Conference on

Tsichritzis, D. & Gibbs, S. (January 1992). "Towards Software Communities and Software Clearing Houses," Computing and Control Engineering Journal, 3 (1): 35-42.

Tufte, E.R. (1990). Envisioning Information. Cheshire, CT: Graphics Press, 1990.

Ware, C. & Beatty, J.C. (1988). "Using Color Dimensions to Display Data Dimensions," Human Factors, 30: 127-142.

Wegner, P. (August 1990). "Concepts and Paradigms of Object-Oriented Programming," OOPS 11 (1): 8-87.

Wood, M. & Sommerville, I. (September 1988). "An Information Retrieval System for Software Components," Software Engineering Journal, 3 (5): 198-207.