

Graphical Representations of Electronic Search Patterns

Xia Lin, Peter Liebscher, and Gary Marchionini

College of Library and Information Services, University of Maryland, College Park, MD 20740-4345

Increasing numbers of researchers and developers are collecting data that trace the processes of human-computer interactions. Although a variety of methods are used to collect such data, all researchers share common problems in representing and analyzing continuous streams of data. A set of methods and tools for graphically representing the search patterns of users interacting with a variety of databases and hypertexts is under development as part of ongoing research in electronic information seeking. The methodology described in this article involves defining state spaces of user options within a system, capturing keystrokes or mouse moves as raw data, coding these data into the state spaces, and analyzing user moves through the state space as evidence of goal-directed information-seeking strategies and patterns. A tool for displaying user moves through a state space has been developed and is described. The tool graphically represents data in a variety of formats, allowing human pattern recognition capabilities to be applied for data analysis. Graphs can be represented for individual searches or for multiple searches done by either a single searcher or by several searchers working on the same search task.

Introduction

Researchers have devoted much time and effort to the study of search behavior in electronic environments. Studies have addressed variables such as individual searcher characteristics or performance, system characteristics or performance, task characteristics, and setting by assessing outcome measures such as time, precision, recall, utility, or the number of cycles or moves. The high levels of interactivity that today’s systems offer have led to the study of the processes and patterns of interaction rather than only the products of such interactions (Blackshaw & Fischhoff, 1988; Campagnoni & Ehrlich, 1989; Liebscher & Marchionini, 1988; Marchionini, 1989a). To these ends, masses of data are typically collected, such as keystroke logs, verbal protocols, or observer notes. A major problem for researchers is to represent these data in forms that allow patterns to be easily recognized and analyzed.

Received March 26, 1990; revised May 31, 1990; accepted June 5, 1990.

© 1991 by John Wiley & Sons, Inc.

There are a number of forms in which search patterns can be represented, and this article describes one method and an emerging tool for graphically representing such patterns. The method involves defining a state space of the possible conceptual moves a user can make with a particular system, capturing user keystrokes or mouseclicks during searches, coding these raw data into the state space, and analyzing the information-seeking process as a series of conscious moves through the state space. The tool described here is a program that generates a variety of graphs, each of which represents a trace of an information search or group of searches through the state space and which highlights various patterns exhibited by these searches.

The purpose of developing this tool was to allow information researchers, system designers, information service providers, and other related information workers to manipulate their data in order to easily identify search patterns and to easily display such patterns in reports and presentations. Each search performed by an individual is unique, and it requires a great deal of abstraction to transform searches from the masses of raw data collected into recognizable patterns. A good search pattern representation tool should reveal the pattern for an individual search as well as patterns across a number of searches. The tool should produce representations that highlight salient trends and patterns yet lose little pertinent information; it should be sufficiently flexible to allow different forms of representations for individual searches and to make comparisons across representations convenient.

Graphical Representations

Scientific visualization techniques have made significant contributions to our understanding of complex scientific phenomena in a variety of fields (McCormick, DeFanti, & Brown, 1987). The field has become sufficiently important to generate a journal entirely dedicated to this topic, Pixel: The Magazine for Scientific Visualization, the first issue of which appeared in January of 1990. Pictorial representations are both effective and efficient means for representing the world. According to Bertin (1983), “. . . graphics is one of the major ‘languages’ applicable to information processing” (p. 2). The presentation of research results in graphical form (e.g., histograms, pie charts, etc.) is widely accepted by researchers in the sciences and social sciences on the principle that summaries of large amounts of data are more easily assimilated by readers in graphical rather than in textual or numeric form. “Even for small sets of data, there are many patterns and relationships that are considerably easier to discern in graphical displays than by any other data analytic method” (Chambers et al., 1983, p. 1). Bertin (1983) views this in terms of efficiency, where a more efficient construction requires a shorter period of perception than a less efficient one. He cast this in terms of “. . . Zipf’s notion of ‘mental cost,’ applied to visual systems” (p. 9).

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 42(7):469-478, 1991 CCC 0002-8231/91/070469-10$04.00

Journalists routinely use bar charts, line charts, pie charts, and other graphs to present numerical data, because they believe that these representations are effective in attracting reader attention and in making statistical information more palatable (Tankard, 1987). Educators have found that picture-word diagrams are more effective than text alone for the comprehension of science concepts (Winn, 1980), and that graphical representations direct learners’ attention to the importance of concepts, hence increasing learning (Reynolds & Baker, 1987). Psychologists suggest that visual imagery is often superior to verbal presentations in aiding memory and thinking (Phillips, 1986).

Larkin & Simon (1987) offer a structural distinction between sentential and diagrammatic representations. Accordingly, sentential representations are linear structures in which each element is adjacent only to the next element in the list, while diagrams represent information by location in a plane, where each element may be adjacent to any number of other elements, and many elements may share the same location. The Larkin and Simon model has been applied successfully to demonstrate several advantages of graphical over verbal representations:

(1) graphical representations can group large amounts of information that are used together;

(2) graphical representations explicitly preserve the information about the types of relations among entities;

(3) graphical representations support a large number of perceptual inferences which are extremely easy for humans.

A graphical representation of search data can be produced through a transformation of keystroke data or coded verbal protocols. These transformed data “can be better representations not because they contain more information but because the indexing of this information can support extremely useful and efficient computational processes” (Larkin & Simon, 1987, p. 99).

These data support statistical analyses as well as human visual processes. Given a computer system that has both graphical and database capabilities, coded keystroke data or verbal protocols can be transformed into graphs. Researchers can view and compare these graphs and generate hypotheses. These hypotheses can then be tested through post hoc statistical analyses or may lead to testing through new studies.

Representations of Electronic Information Search Patterns

In terms of search patterns, graphical representations have, until recently, had only limited use. Blackshaw and Fischhoff’s (1988) study of decision making in online searching is a recent example of the use of both graphical and verbal presentation of data. They believe that graphical representations “can provide useful insights into subjects’ decision-making behavior. They [the representations] offer sufficient detail to allow underlying processes to emerge, but in a sufficiently standardized way to allow comparisons and statistical summaries” (p. 383).

Representations of electronic searches can be made mainly by defining states for the search system being used, collecting data (e.g., keystroke logs, verbal protocols, etc.), mapping data to search states, and displaying sequences of search states. This article describes methods for defining states, mapping data to states, and displaying the results in the form of various graphical representations.
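The four steps just listed (define states, collect data, map data to states, display the state sequence) can be illustrated with a minimal sketch. It is not the authors' tool: the event names, the coding rules, and the three-state map are hypothetical and serve only to show the shape of the process.

```python
# Hypothetical state map: conceptual state name -> numeric code.
STATE_MAP = {"query": 1, "titles": 2, "text": 5}

# Hypothetical coding rules: raw logged event -> conceptual state name.
# Events without a rule (e.g., cursor movement) carry no conceptual move.
EVENT_TO_STATE = {
    "ENTER_TITLE": "query",
    "SHOW_TITLE_LIST": "titles",
    "OPEN_ARTICLE": "text",
}


def code_events(events):
    """Map raw events to state codes, skipping events with no coding rule."""
    coded = []
    for event in events:
        state = EVENT_TO_STATE.get(event)
        if state is not None:
            coded.append(STATE_MAP[state])
    return coded


def display_sequence(codes):
    """Render the coded state sequence as a simple text trace."""
    return " -> ".join(str(code) for code in codes)


raw_log = ["ENTER_TITLE", "SHOW_TITLE_LIST", "CURSOR_DOWN", "OPEN_ARTICLE"]
codes = code_events(raw_log)
print(display_sequence(codes))  # prints "1 -> 2 -> 5"
```

Keeping the coding rules in a table separate from the state map reflects the point made in the text that both depend on the system and the research questions and may need to be redefined for each study.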

Search State Spaces

A trend in electronic search analysis is to define a state space that accounts for the search process, i.e., each move by the searcher results in a transformation from one system state to another. A state space can be defined for a system by conducting a task analysis to generate a mutually exclusive and exhaustive set of options the system offers to users.* From the system’s point of view, a state may be a type of screen display. From the user’s point of view, a state may be a situation that requires a decision or action. States may be defined at physical (e.g., screen display or keypress) or conceptual (e.g., decision type or component of a model of the search process) levels. Moreover, state spaces can be defined at coarse or fine levels of granularity. Nickerson (1969) pointed out that choice of states is one of the greatest problems in describing the search process. For our state spaces, we have typically defined states as the options users may select, thus mapping user moves as states. In most cases these options are conceptual, such as “use an index” or “examine text”.

*It is important to note that although the options a user has at any given time must be mutually exclusive (each option has a unique physical command or selection attached to it), the results of applying those options need not be; i.e., several options can yield the same results (the mapping of the state space onto itself is not a one-to-one function). This is particularly so in highly flexible systems such as HyperCard, which provide most or all possible options to the user at all times.

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, August 1991

Penniman (1975) reviewed five other state space definitions of online searching before describing his own. His approach was to select a set of states (Table 1) that reflect the user actions with an associated system function implied. This approach has influenced other researchers who have studied electronic searching (Chapman, 1981; Marchionini, 1989a; Tolle & Hah, 1985).

In our work, we define states from the user’s perspective, each state representing a possible conceptual move (decision) the searcher may take. The power of the state space approach lies in the explicit definition not only of all possible states but also of all operators that allow state transitions to take place. Users select from operators to move from state to state according to some currently active goal or subgoal. This view of human cognition (Newell & Simon, 1972) has resulted in formal models for explaining human performance with complex systems (e.g., the GOMS model, Card, Moran, & Newell, 1983; Kieras, 1988).

A state space can be more or less detailed depending on the interests of the researcher. In simplest form, states can be defined by unique screen displays; in more typical form, states represent conceptual situations such as assessing relevance of an information unit. Likewise, operators can be mapped at simple levels such as keystroke, or at more meaningful levels such as narrowing of a query statement. Because there are no standards as to what states are of interest for electronic search systems and how these should be coded, state spaces must be hand-crafted for each system and each set of research questions. Two examples of state space mappings are described and discussed below. The list of states used for a particular system and data set is termed a state map.

TABLE 1. Penniman’s state map for the BASIS system.

Penniman’s States for Online Searching

State   Description
A       Begin session
B       Request data base
C       Search index
D       Logic formation
E       Off-line print
F       Online display
G       Review search
H       Review commands
I       Set parameters (output format)
J       Exit data base
K       End session

Table 2 shows a state map for Grolier’s Electronic Encyclopedia. This system offers users the choice of searching by title using a title index (Codes 1-7) or doing a Boolean word search (Codes 8-37). Note that there is some redundancy since Codes 2-7 are repeated as Codes 32-37. This was done because it was believed that title searches are strategically different from Boolean word searches and that the graphs and analyses should highlight these differences. In principle, if the graphs that are generated according to these state maps are to have maximum visual impact, anticipated important effects should be separated as widely as possible in the coding scheme. Such coding is an example of state maps reflecting the research questions and hypotheses of the researcher for a particular study. Another study, even if the same system is used, might well necessitate a different state map.

In keeping with our research interests in this particular study (Marchionini & Liebscher, in preparation), this state map is very fine grained. We were interested in diagramming query formulations over a number of cycles and sessions to determine user patterns of broadening or narrowing in response to a set of retrieved titles (Codes 11-30). If other aspects of the search process were under investigation in this study, the granularity of the state map might have been much coarser. We may, for example, have been interested only in the query/examine sequence, e.g., the extent of text examination between query formulations. If so, then state codes 11-30 could be collapsed to a single state.
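Collapsing a range of codes into a single state can be sketched as follows. This is an illustration only: the function name, the choice of 11 as the collapsed code, the decision to merge consecutive repeats, and the sample sequence are all assumptions, not part of the study.

```python
def coarsen(codes, low=11, high=30, collapsed=11):
    """Replace every code in [low, high] with one collapsed code.

    Consecutive repeats are merged (an assumption of this sketch), so each
    visit to the collapsed state appears once in the coarser sequence.
    """
    result = []
    for code in codes:
        new_code = collapsed if low <= code <= high else code
        if not result or result[-1] != new_code:
            result.append(new_code)
    return result


# Hypothetical fine-grained sequence of state codes from one search.
fine_grained = [8, 12, 31, 32, 35, 9, 17, 18, 31, 32]
print(coarsen(fine_grained))  # prints [8, 11, 31, 32, 35, 9, 11, 31, 32]
```

Because the coarser sequence is derived from the fine-grained one, the same logged data can be regraphed at either granularity without recoding the raw log.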

Table 3 is a state map for the HyperCard version of “Hypertext on Hypertext” (ACM, 1988) that was used for a study on how different types of knowledge affect searching behaviors (Marchionini, Lin, & Dwiggins, 1990). This state map is much coarser than the previous one, focusing on user navigational actions. The system supports buttons (links) for three general information-seeking strategies: index searching, full-text word searching, and browsing. The system provides two types of index, text (index button) and graphic (map button), supports word search through a “find” button, allows linear text scrolling for linear browsing, and supplies iconic link markers in the text for nonlinear browsing. In this state map these buttons are coded and ordered into four groups (querying, indexing, linear browsing, and nonlinear browsing) because the research questions led us to look for patterns in the strategies used by different sets of users.

In the Marchionini, Lin, and Dwiggins study, search data were graphed according to the state map in Table 3 to augment findings from mouseclick statistics and verbal data. For example, the graphs showed that little nonlinear browsing was attempted, which then was verified by frequency count data. More interestingly, the graphs indicated that the searchers often changed from one search strategy to another. This led us to carefully examine the audiotaped verbal reports for explanations of why such changes were made.

It seems that graphic displays based on state spaces such as those described offer the right granularity between primitive fine-grained keystroke/mouseclick data and the richness of verbal reports. Keystroke/mouseclick data describe a search syntactically at best, graphic data represent a search concisely and abstractly, and verbal data reveal the semantics of the search process. Processing these multiple kinds of data allows researchers to triangulate (Williams, Rice, & Rogers, 1988) multiple data sets that are all focused on the same research questions.

TABLE 2. State map for the Grolier Electronic Encyclopedia.

Code  State         Explanation
1     Query         System invites user to enter a title
2     Titles        List of titles (first screen)
3     Next screen   Next screen of titles
4     Back screen   Previous screen of titles
5     Text          First screen of article
6     Next screen   Next screen of article
7     Back screen   Previous screen of article
      Begin search
8     New query     System invites user to enter new query
9     Edit query    System invites user to edit existing query
10    No change     Existing query is kept intact
11                  One word or phrase query, no truncation
12                  One word or phrase query, truncation
13                  Multiple word query, terms ORed, no truncation
14                  Multiple word query, terms ORed, truncation
15                  Two words or phrases ANDed, no truncation, no OR
16                  Two words or phrases ANDed, truncation, no OR
17                  Two words or phrases ANDed, terms ORed, no truncation
18                  Two words or phrases ANDed, terms ORed, truncation
19                  Three words or phrases ANDed, no truncation, no OR
20                  Three words or phrases ANDed, truncation, no OR
21                  Three words or phrases ANDed, terms ORed, no truncation
22                  Three words or phrases ANDed, terms ORed, truncation
23                  Four words or phrases ANDed, no truncation, no OR
24                  Four words or phrases ANDed, truncation, no OR
25                  Four words or phrases ANDed, terms ORed, no truncation
26                  Four words or phrases ANDed, terms ORed, truncation
27                  Five words or phrases ANDed, no truncation, no OR
28                  Five words or phrases ANDed, truncation, no OR
29                  Five words or phrases ANDed, terms ORed, no truncation
30                  Five words or phrases ANDed, terms ORed, truncation
31    Report        System reports number of titles retrieved
32    Titles        List of titles (first screen)
33    Next screen   Next screen of titles
34    Back screen   Previous screen of titles
35    Text          First screen of article
36    Next screen   Next screen of article
37    Back screen   Previous screen of article
38    End search    -

TABLE 3. State map for the HyperCard version of “Hypertext on Hypertext.”

Code  State
0     Home
1     Find
2     Index searching
3     Map searching (another form of indexing)
4     Go to Abstracts
5     Table of Contents
6     Go to first card of each article
7     Go to Next stack (next article)
8     Next card
9     Previous card
10    Return to previous
11    Go to reference
12    Reference back to article
13    Boldface words (from text to indexes)
14    Link markers
15    Exit

Sections, in code order: querying, indexing, linear browsing, nonlinear browsing.

Electronic Search Data

Data from electronic information searches are commonly in the form of verbal protocols or keystroke logs. Typically, verbal protocols are records of a searcher’s “thinking aloud” while searching, notes of a researcher’s observations of the search, or notes of follow-up interviews that are intended to probe a searcher’s thinking and reasoning during the search (see Ericsson and Simon (1984) for a review of use of verbal protocols). Unlike keystroke logs, verbal protocols can be analyzed with or without state space descriptions because verbal descriptions themselves convey meaning to researchers.

A variety of utilities or programs have been developed to log user interactions with information systems. In our work we have used macro definition utilities, specially designed terminate-stay-resident programs, vendor-supplied logging software, and customized versions of retrieval software to capture and timestamp keystrokes and mouse moves. Raw keystroke data (keypresses or mouseclicks) are objective records of the search process, recording every signal sent to the system. Some logs also record system responses. With few exceptions (e.g., PgUp key, PgDn key, etc.), these data require further transformation to convey meaning to the researcher. Because they have no intrinsic meaning, keystroke data must be aggregated and abstracted before they can be interpreted. One approach is to code them into state maps and summarize them statistically. Another approach is to display the coded state transitions graphically to use human pattern recognition capabilities to find patterns and make comparisons.
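One simple aggregation of timestamped data is the time spent in each state, taken as the interval between successive moves. The sketch below assumes the log has already been coded into (timestamp, state code) pairs; the log format and the sample values are hypothetical.

```python
def dwell_times(log):
    """Compute time spent in each state from a coded, timestamped log.

    log: list of (timestamp_seconds, state_code) pairs in time order.
    Returns (state_code, seconds_in_state) for every move except the last,
    whose duration is unknown without a closing timestamp.
    """
    return [
        (state, next_time - time)
        for (time, state), (next_time, _) in zip(log, log[1:])
    ]


# Hypothetical coded log: the time each state was entered, and its code.
log = [(0.0, 1), (4.5, 2), (10.0, 5), (31.0, 38)]
print(dwell_times(log))  # prints [(1, 4.5), (2, 5.5), (5, 21.0)]
```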

Data gathered during electronic searches can be interpreted from a number of perspectives. Cooper (1983) analyzed the usage of various commands to provide quantitative evidence for understanding what takes place during a search. Penniman (1975) focused on the interaction aspect of online searching in order to describe similarities and differences across various users and databases. Blackshaw and Fischhoff (1988) tried to characterize decision-making processes during online searching to discern how people assess the uncertainty surrounding their decisions. Fidel (1985) described online searching in terms of searchers’ cognitive styles (operationalist and conceptualist). Marchionini (1989b) and Liebscher and Marchionini (1988) sought to differentiate searches by information-seeking strategies (browsing and analytical).

In all these cases, graphical representation could be helpful for data analysis. For example, a graph can be produced for each search in order to see more clearly what types of commands were used and how these commands were clustered. Graphs can reflect both the user’s and the system’s actions, revealing characteristics of the interaction between the user and the system. Graphs also facilitate comparisons across searches. This allows analysis of searching styles and strategies through visual displays of searches.

Graphical representations alone provide only descriptive, summary analysis of the data. Once a state map for a graphical representation has been constructed, the data can also be used for statistical analyses, both descriptive and inferential. For example, transition matrices can be developed to compare users, systems, or tasks (e.g., Borgman, 1986; Chapman, 1981; Marchionini, 1989a; Penniman, 1975; Tolle & Hah, 1985).
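A transition matrix of the kind mentioned above can be built by counting how often each state is followed by each other state. The following is a minimal sketch using only the Python standard library; the sample sequence is hypothetical, and the cited studies may have defined and normalized their matrices differently.

```python
from collections import Counter


def transition_counts(codes):
    """Count each (from_state, to_state) pair of successive moves."""
    return Counter(zip(codes, codes[1:]))


def transition_matrix(codes, states):
    """Build a matrix whose entry [i][j] is the number of moves
    from states[i] to states[j]."""
    counts = transition_counts(codes)
    return [[counts[(a, b)] for b in states] for a in states]


# Hypothetical coded search and the states that occur in it.
search = [1, 2, 5, 2, 5, 1, 2]
states = sorted(set(search))  # [1, 2, 5]
matrix = transition_matrix(search, states)
for state, row in zip(states, matrix):
    print(state, row)
# prints:
# 1 [0, 2, 0]
# 2 [0, 0, 2]
# 5 [1, 1, 0]
```

Matrices built over the same list of states can be compared cell by cell across users, systems, or tasks.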

Development of a Graphical System for Electronic Search Patterns

Graphs of search patterns can be programmed on an ad hoc basis as the need arises. However, if the research is ongoing, a general graphing tool is desirable, not only on grounds of efficiency, but also for maintaining some display standards across a number of research projects. A graphing tool, Graphic Representor of Interaction Patterns (GRIP), that allows easy redefinition of state maps and is capable of displaying a variety of graphs is under development by the authors.

GRIP is currently based on HyperCard, an information organizer that integrates the capabilities of a database, a painting tool, and a programming language. Easy data manipulation, dynamic graphing capability, powerful programmability, and a flexible interface together contributed to the choice of HyperCard as an appropriate system for producing graphical representations.

Types of Graphical Representations

Because GRIP is an evolving tool and is being used for a variety of experiments, the discussion intertwines descriptions of the different ways the tool can be used with data from selected investigations and systems. Our eventual goal is to develop a tool that allows dynamic representations to be compared and analyzed as animated “movies of the data”. The present tool produces a variety of static displays that are proving to be useful augmentations to the data analyses underway.

Line Graphs. A simple representation of an electronic search can be a two-dimensional curve. Here each conceptual move made by the searcher is plotted as a unique point in a two-dimensional space. Conceptual states are represented along the y axis and the length of time spent in each state (time between moves) is represented on the x axis. Lines linking successive points represent the transition from state to state and the complete curve represents the movement through the state space. Figure 1 depicts searches conducted by three subjects for the same simple fact retrieval question in Grolier’s Electronic Encyclopedia. Note that not all states are labeled on the vertical axis due to resolution limitations for fine-grained state maps. In Figure 1, the begin point defines a horizontal axis from which all coding begins; thus title searching proceeds up from this axis and word searching proceeds down from this axis. For these graphs, the horizontal dotted lines divide the state space into four clusters of moves representing title searches, initiation of a search, query formulations in word searches, and examinations of title sets or article texts.

FIG. 1. Line graphs of an electronic search. The three graphs represent searches conducted by three searchers for the same question.
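The geometry of such a line graph can be sketched by computing the vertices of the curve: the x coordinate is the elapsed time at which a state is entered, and the y coordinate is that state's position on the vertical axis. The sketch below is not GRIP itself. The vertical layout, the use of code 0 for the begin point, and the sample moves are hypothetical; the only feature taken from the text is that one group of states is placed above the begin axis and the other below it.

```python
def line_graph_points(moves, y_position):
    """Compute the vertices of a line graph of one search.

    moves: list of (state_code, seconds_spent) in the order visited.
    y_position: maps each state code to its vertical position.
    Returns (x, y) pairs, where x is the elapsed time at which the state
    was entered; joining successive pairs with lines gives the curve.
    """
    points = []
    elapsed = 0.0
    for state, seconds in moves:
        points.append((elapsed, y_position[state]))
        elapsed += seconds
    return points


# Hypothetical layout: begin point on the axis (y = 0), title-search
# states above it, word-search states below it.
y_position = {0: 0, 1: 1, 2: 2, 5: 3, 9: -1, 31: -2}

# Hypothetical search: (state code, seconds spent in that state).
moves = [(0, 2.0), (1, 6.0), (2, 4.0), (0, 1.0), (9, 8.0), (31, 3.0)]
for point in line_graph_points(moves, y_position):
    print(point)
```

Separating the vertical layout from the state codes makes it possible to redraw the same search with a different ordering or spacing of states.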

Advantages of graphical representation emerge even in this simple form. The continuity of the graph gives a clear image of how a search starts, what function is selected at each step, and how the search ends. Subject 102 (Fig. 1(a)) conducted a title search as follows: entered a title, viewed the list of titles alphabetically near that title, entered a second title, viewed the resulting title list, paged forward in the list, paged back, and viewed the first page of an article before initiating a word search. The subject then examined a set of article titles for articles that contained the words in the query and selected an article to read, scanned the first page, returned to the list of hits, then looked at the query he/she had formulated, and decided to quit. Subjects 103 and 110 conducted remarkably similar searches, each entering a query in a word search mode and selecting a single article to read. Subject 110 checked the set of titles again before quitting. This rather simple example illustrates how displaying the results of different searches for the same question can illustrate different or common patterns.

Other patterns may be revealed when comparing the graphs of different questions being searched by the same searcher, and search command cycles can be easily identified. This is useful if, for example, the researcher has hypotheses about the relationship between the number of command cycles in a search and “goodness” of search; it may be that many cycles indicate a poor search. None of these patterns can be as easily seen from raw data or from verbal or numerical summaries.
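Once a search has been coded, command cycles can also be counted directly. The article does not give an operational definition of a cycle, so the sketch below assumes one: a cycle begins each time the searcher moves into a query state from a non-query state. The set of query states and the sample sequence are likewise hypothetical.

```python
def count_cycles(codes, query_states):
    """Count command cycles in a coded search.

    Assumed definition: a cycle begins whenever the searcher enters a
    query state from a non-query state (or starts the search in one).
    """
    cycles = 0
    previous_in_query = False
    for code in codes:
        in_query = code in query_states
        if in_query and not previous_in_query:
            cycles += 1
        previous_in_query = in_query
    return cycles


# Hypothetical coded search, with codes 9-30 treated as query states.
search = [9, 12, 31, 32, 10, 16, 31, 32, 35]
print(count_cycles(search, query_states=set(range(9, 31))))  # prints 2
```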

There are some weaknesses in such simple line representations as those in Figure 1. First, because all states are represented equally, i.e., by a point in a plane, it is necessary to refer to axis labels and legends to interpret the state. Second, because state changes are represented by lines between points, the impression is given that the graphs represent a continuous series of moves rather than a set of discrete transitions from state to state. This common problem of representing nominal data on an interval scale has led to some misunderstandings of the graphs. Finally, interpreting the amount of time users stayed in particular states is difficult with these graphs. The slope of a line segment connecting two states is due to both the nominally scaled states on the vertical axis and the ratio-scaled time spent in the state (horizontal distance). Since the purpose of producing graphs is to allow quick and easy visual interpretation of searches, these line graphs provide only one view of the data.

Icons. To alleviate some of the problems of the line graphs and to further enhance the representations, a second type of graph was developed using icons to represent states. The use of icons is an efficient way to enrich an interface and make it more directly manipulable (Shneiderman, 1987; Veith, 1988). Likewise, icons can enrich representations of the search process by adding another channel for abstracting data about states, sequence, and time. Additionally, when used in electronic search representations, it is possible to have icons maintain their meanings across different systems, thus improving the portability of the representations.

Ideally, icons should be used that have some intrinsic mappings to the states they represent, e.g., the trash can in the Macintosh system. This is difficult to achieve in practice, especially for conceptual states. For simplicity, the icons used here are simple geometric figures. Figure 2 shows the same search represented in three forms: as a line graph (Fig. 2(a)), as an icon graph (Fig. 2(b)), and as a bar graph using icons (Fig. 2(c)).

Figure 2(b) is the type of representation we are using to analyze individual searches. In such graphs, the number of icons in each state represents the time a searcher spends in that state (in this graph, each icon represents ten seconds). The icons themselves are matched to the four clusters of states of interest in this study (title searches, initiation of a search, query formulations in word searches, and examinations of title sets or article texts). Compared to the line graph in Figure 2(a), the iconic graph contains considerably more information. The nominal scaling of the data becomes more visually apparent and time in a state is explicitly represented as horizontal distance, thus making the graph more useful in identifying global patterns. Of course, the meanings of the icons must be learned so that the graph can be interpreted in detail.
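The icon-graph idea can be approximated in plain text. The sketch below is not GRIP's implementation; it assumes a search has already been coded as (state, seconds) pairs, uses hypothetical cluster names and glyphs, and rounds each duration up so that every visited state shows at least one icon. Only the ten-seconds-per-icon scale comes from the description of Figure 2(b).

```python
import math
from typing import List, Tuple

SECONDS_PER_ICON = 10  # scale reported for Figure 2(b)

# Hypothetical glyphs for the four clusters of states, listed top to bottom.
ICONS = {"title_search": "#", "begin": "o", "query": "*", "examine": "+"}
ROW_ORDER = ["title_search", "begin", "query", "examine"]


def icon_graph(visits: List[Tuple[str, float]]) -> List[str]:
    """Return one text row per state; icons run left to right in time order."""
    # Assumption: round up so that even a brief visit is visible.
    counts = [(state, max(1, math.ceil(seconds / SECONDS_PER_ICON)))
              for state, seconds in visits]
    width = sum(n for _, n in counts)
    rows = {state: [" "] * width for state in ROW_ORDER}
    column = 0
    for state, n in counts:
        for offset in range(n):
            rows[state][column + offset] = ICONS[state]
        column += n
    return ["".join(rows[state]) for state in ROW_ORDER]


visits = [("begin", 5), ("query", 25), ("examine", 40)]
for label, row in zip(ROW_ORDER, icon_graph(visits)):
    print(f"{label:>12} |{row}|")
```

Because horizontal distance is proportional to the number of icons, time in a state can be read directly from the row, which is the property the paragraph above attributes to the iconic graph.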

GRIP is designed to lend flexibility to the graphical representations so that the same search may be represented differently depending on the preferences of the researcher. An obvious approach is to produce several graphic versions of a search. Figure 2(c) shows another version of the previous graphical representation. Time is represented vertically by the number of icons, and the horizontal axis simply represents the sequence of moves.

Multiple Representations for Searches. Another use of graphic representation is to display two or more searches together. GRIP will display multiple searches either by searcher or by question. This allows easy comparison to be made between two or more searches. Figure 3 shows the searches of three subjects working on the same question in a Hyperties database.

The state map for this hypertext was designed to separate actions like using an index from browsing the hypertext through links, and these two extremes are labeled on the vertical axis. One subject conducted the search by browsing exclusively, one by using the index exclusively, and one used both strategies.
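The three strategies just described (index only, browsing only, and both) can be recovered mechanically from coded moves. The following sketch assumes hypothetical state names for index use and link following; the actual Hyperties state map is not reproduced here.

```python
from typing import List

# Hypothetical state labels standing in for the two extremes of the state map.
INDEX_STATES = {"open_index", "select_index_entry"}
BROWSE_STATES = {"follow_link"}


def classify_strategy(moves: List[str]) -> str:
    """Label a coded search as 'index', 'browse', 'mixed', or 'neither'."""
    used_index = any(move in INDEX_STATES for move in moves)
    used_links = any(move in BROWSE_STATES for move in moves)
    if used_index and used_links:
        return "mixed"
    if used_index:
        return "index"
    if used_links:
        return "browse"
    return "neither"


print(classify_strategy(["open_index", "select_index_entry", "read_article"]))
print(classify_strategy(["read_article", "follow_link", "read_article"]))
print(classify_strategy(["open_index", "read_article", "follow_link"]))
```

A label of this kind is only a coarse summary; the graphs remain the place to see when and for how long each strategy was used.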

Another way to apply GRIP across multiple questions or subjects is illustrated in Figure 4. Figure 4(a) shows four searches done by a single subject in Grolier’s Electronic Encyclopedia. Figure 4(b) shows the four searches done by another subject for the same questions. Subject 110 (Fig. 4(b)) shows extraordinarily stable search patterns, finding a set of articles quickly and browsing the list and texts without reformulating the query. Subject 102 (Fig. 4(a)), on the other hand, did one search completely as a title search and three as word searches with multiple query reformulations. It is important to note that the graphic representation does not explain why these effects occur, but rather helps the researcher identify effects that should be pursued and corroborated with other data.

FIG. 2. Three graphs of a single search. (a) Line graph. (b) Icon graph. (c) Bar graph.

FIG. 3. A comparison of three searches for the same question.

FIG. 4. Comparison of two searchers. Four searches are graphed for each. Both searchers worked on the same four questions. (a) Searcher 102 displays flexibility in selecting various system functions. (b) Searcher 110 displays extraordinarily stable patterns across all four questions.

The GRIP Tool

GRIP allows researchers to select which graphical representation is best while only coding the data once. This permits us to observe patterns that may be obscure when displayed only in one form and to compare search patterns across subjects and search tasks. Figure 5 displays the GRIP reference screen to summarize the organization and functionality of the system.

When using the system, each search is represented by a screen and each screen is divided into three areas. The top area is for identification data such as codes for subjects, search tasks, and subject groups. The middle area is for representing the raw data (keystrokes or mouseclicks and times), an abbreviated labeling of the state map, and the graph itself. The graph and the raw data occupy the same visual space and both can be displayed or hidden independently or together. The raw data are stored in GRIP so they can be edited without leaving the tool. Automatic importation of data from ASCII logs is presently accomplished by custom programming.
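To illustrate what such custom import programming might involve, the sketch below codes a small ASCII log into (state, seconds) pairs. The log layout (a time stamp in seconds followed by a key name) and the key-to-state table are assumptions made for this example; the log formats of the systems studied are not described in this article.

```python
from typing import Dict, List, Tuple

# Hypothetical coding table mapping logged keys to states in a state map.
KEY_TO_STATE: Dict[str, str] = {
    "F1": "title_search",
    "F2": "word_search",
    "ENTER": "view_titles",
    "ESC": "quit",
}


def code_log(lines: List[str]) -> List[Tuple[str, float]]:
    """Code raw log lines into (state, seconds spent in state) pairs.

    Time in a state is the gap until the next logged event; the final
    event has no successor, so its duration is recorded as 0.0.
    """
    events: List[Tuple[float, str]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log
        stamp, key = line.split(None, 1)
        events.append((float(stamp), KEY_TO_STATE.get(key.strip(), "unknown")))

    coded = [(state, later - earlier)
             for (earlier, state), (later, _) in zip(events, events[1:])]
    if events:
        coded.append((events[-1][1], 0.0))
    return coded


log = ["0.0 F2", "12.5 ENTER", "", "40.0 ESC"]
print(code_log(log))
```

Keys missing from the table are coded as "unknown" rather than dropped, so that gaps in the coding scheme stay visible when the data are reviewed.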

The third area includes an icon menu of all functional buttons. There are five graph buttons and five operational buttons. The graph buttons represent the data as a line graph (e.g., Figs. 1, 2(a)), an icon graph (e.g., Fig. 2(b)), a bar graph (e.g., Fig. 2(c)), a pair of bar graphs, and any number of multiple graphs (e.g., Fig. 3 or 4). The data button hides or displays the raw data, the code button hides or displays the coded data, and the map button displays the state map. The sort button sorts the cards by subject, question, or group, and the “slide show” icon button will provide a rapid display of the stack of cards, one at a time. At present, this is the only dynamic representation possible. We are exploring ways to show multiple graphs as miniatures in both static and dynamic ways so that “movies” of large data sets can be examined. For example, we are developing animated graphs to represent individual searches in a variety of simulated time scales and to compare several searches concurrently. Another line of exploration is the creation of “meaningful” icons for both the static and dynamic displays.

FIG. 5. The GRIP reference card.


Applications and Limitations of GRIP

As part of our ongoing work on information seeking in electronic environments, state spaces have been developed and GRIP employed for studies that used online bibliographic systems (DIALOG online and ABI Inform on CD-ROM), the DISCLOSURE database on CD-ROM, the CD-ROM versions of Grolier’s Electronic Encyclopedia and McGraw Hill’s Encyclopedia of Science and Technology, and a variety of hypertext databases in both HyperCard and Hyperties forms. It should be noted that since systems vary widely, GRIP comparisons across systems are tenuous unless a common state space is defined for the systems. This has proven difficult except through rather coarse and abstract state maps. Across-system comparisons for similar systems are, of course, less difficult than comparisons between Boolean-oriented bibliographic systems and full-text hypertext systems. In the latter case, comparisons can only be made at gross levels of detail.
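One way to picture a coarse common state space is as a lookup from each system's own states to a small shared vocabulary. The sketch below is illustrative only: the system names, the state labels, and the two shared states ("formulate" and "examine") are assumptions, not the state maps used in these studies.

```python
from typing import Dict, List

# Hypothetical system-specific states mapped onto a coarse shared state space.
COMMON_MAP: Dict[str, Dict[str, str]] = {
    "encyclopedia": {
        "enter_title": "formulate",
        "enter_query": "formulate",
        "view_titles": "examine",
        "view_article": "examine",
    },
    "hypertext": {
        "use_index": "formulate",
        "follow_link": "examine",
        "read_node": "examine",
    },
}


def to_common(system: str, moves: List[str]) -> List[str]:
    """Recode a search into the shared states, merging consecutive repeats."""
    table = COMMON_MAP[system]
    collapsed: List[str] = []
    for move in moves:
        state = table[move]
        if not collapsed or collapsed[-1] != state:
            collapsed.append(state)
    return collapsed


print(to_common("encyclopedia",
                ["enter_query", "view_titles", "view_article",
                 "enter_query", "view_titles"]))
print(to_common("hypertext",
                ["use_index", "read_node", "follow_link", "read_node"]))
```

The recoding discards the distinctions that made each system's own state map informative, which is one way of seeing why such comparisons hold only at gross levels of detail.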

Some of the research problems for which GRIP is currently being used include: search pattern types (browsing and analytical; Liebscher & Marchionini, 1988; Marchionini & Liebscher, in preparation), system usability and learnability (disorientation, distraction, errors, learning effects; Marchionini, 1989b; Marchionini, 1990), effects of different access methods on search in hypertext (Liebscher, 1991), and the roles of task domain knowledge and information-seeking knowledge on search performance (Marchionini, Lin, & Dwiggins, 1990). In all of these investigations, GRIP is used to complement other methods and approaches.

Consumers of these graphical representations must be warned that the graphs do not have some inherent shape or meaning. Rather, through the position given to each state in a state map, a researcher highlights states thought to be more important and minimizes those that are thought less important. Depending on the researchers’ points of view, state maps for the same search system can be coded in several ways, each of which may be equally “good.” A limitation of using the graphs themselves is the complexity of graphs of five or more searches. It can be confusing to see multiple, overlapping lines all at once. Different line types (e.g., broken, iconic, etc.) do not seem to help beyond the four or five search limit. Color capability may extend this limit and will be examined when HyperCard supports color.

Our graphical representations of search data do not stand alone. We do not argue that graphical representations take the place of other forms of presenting and analyzing data, but rather that many forms of representation and analysis should be considered in any study. Graphical representations as presented here, while powerful tools, are but one of many that can be fruitfully employed. Data from searching can be described in terms of numbers, words, or graphs and different types of descriptions may illustrate the process from different viewpoints. Whenever possible, various forms of search representations should be available for people who need to analyze the search process.

Summary

If creative thinking is “seeing” things in a new or different light, for example, a system that can help assemble these new or different “pictures” or juxtaposing of information, may indeed represent a dramatic change in the capabilities of information systems. (Veith, 1988, p. 214)

A tool that enables researchers to view their data in many forms and at different juxtapositions may well allow them to “see” what might otherwise not have been seen. GRIP is used in our research to provide alternative sets of views of the data. These views are integrated with other views afforded by statistical analyses of search log data, interpretations of verbal reports and researcher observation notes, and assessments of actual products resulting from the conduct of searches. In our use of graphical representation we have been careful to ground our graphs in the research questions that our studies sought to answer. This has made it possible to see patterns that would have been difficult to spot otherwise.

While useful to researchers for analyzing data, graphical representations of search patterns are also effective for presenting the results of research efforts to readers: they serve both an analytical and a presentational purpose. For example, the differences in patterns apparent in Figures 2(a) and 2(b) are obvious and forceful when compared to a paragraph of textual description or a table of sequential moves. We argue that, in this case, the graphic is also more efficient than other forms of representation.

Our purpose for generating graphic representations can be summarized as:

(1) Representing the search in abstract forms without distorting the data;

(2) Revealing the data at unique levels of detail and abstraction, and facilitating comparisons at these levels;

(3) Facilitating visual analysis of data and search patterns.

For many disciplines, graphical representations have become standard instruments for understanding and communicating research results. Graphic representations of searches have provided us with insights into the information-seeking process and served as points of departure for analyses of verbal reports and other data. As with other forms of scientific visualization, these representations help us understand a phenomenon and build and test theories about it. Additionally, just as a well-constructed abstract presents the essence of an article, so graphical representations are able to present simplifications of highly complex systems and processes.

We have presented a fairly crude form of graphical representation for electronic search patterns. Other representations, including “movies” of search patterns, are possibilities. The use of color will add yet a further dimension, and automatic recoding of data to fit a number of alternative state maps would lend great variety and flexibility to our representations. The use of graphs to represent search patterns is still in its infancy. As part of our research effort, we are constantly seeking new graphical methods of analyzing and presenting our data. We have discussed some of our ideas here; however, much more research is needed before we can take full advantage of the possibilities.

Acknowledgments

The authors wish to acknowledge the contributions of Valerie Florance, who provided helpful comments and suggestions in the preparation of this article. Support for this research was provided by the National Science Foundation (Grant IRI-8718075) and by the Council on Library Resources (Grant 5005).

References

Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review, 85, 249-277.

Association for Computing Machinery. (1988). Hypertext on hypertext: Special section of the July, 1988 Communications of the ACM.

Bates, M. J. (1979). Information search tactics. Journal of the American Society for Information Science, 30, 203-214.

Bertin, J. (1983). Semiology of graphics. Madison: University of Wisconsin Press.

Blackshaw, L., & Fischhoff, B. (1988). Decision making in online searching. Journal of the American Society for Information Science, 39, 369-389.

Borgman, C. (1986). The user’s mental model of an information retrieval system: An experiment on a prototype online catalog. International Journal of Man-Machine Studies, 24, 47-64.

Campagnoni, F. R., & Ehrlich, K. (1989). Retrieval using a hypertext-based help system. ACM Transactions on Information Systems, 7, 271-291.

Card, S., Moran, T., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.

Chambers, J. M., Cleveland, W. S., Kleiner, B., & Tukey, P. A. (1983). Graphical methods of data analysis. Belmont, CA: Wadsworth International Group.

Chapman, J. L. (1981). A state transition analysis of online information seeking behavior. Journal of the American Society for Information Science, 32, 325-333.

Cooper, M. D. (1983). Usage patterns of an online search system. Journal of the American Society for Information Science, 34, 343-349.

Ericsson, K. A., & Simon, H. A. (1984). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.

Fidel, R. (1985). Moves in online searching. Online Review, 9, 61-74.

Kieras, D. (1988). Towards a practical GOMS model methodology for user interface design. In M. Helander (Ed.), Handbook of human-computer interaction (pp. 135-157). Amsterdam: North-Holland.

Larkin, J. L., & Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words. Cognitive Science, 11, 65-100.

Liebscher, P. (1991). Information seeking in hypertext: Multiple access methods in a full-text hypertext database. Unpublished dissertation proposal, University of Maryland.

Liebscher, P., & Marchionini, G. (1988). Browse and analytical search strategies in a full-text CD-ROM encyclopedia. School Library Media Quarterly, Summer, 223-233.

Marchionini, G. (1990). Self-directed learning through hypermedia: Assessing the process. Paper presented at the American Educational Research Association Annual Meeting, April, 1990, Boston, MA.

Marchionini, G. (1989a). Information-seeking strategies of novices using a full-text electronic encyclopedia. Journal of the American Society for Information Science, 40, 54-66.

Marchionini, G. (1989b). Making the transition from print to electronic encyclopedias: Adaptation of mental models. International Journal of Man-Machine Studies, 30, 591-618.

Marchionini, G., & Liebscher, P. (in preparation). Searching and browsing in electronic full-text databases.

Marchionini, G., Lin, X., & Dwiggins, S. (1990). Effects of search and subject expertise on information seeking in a hypertext environment. Paper presented at the American Society for Information Science Annual Meeting, November, 1990, Toronto.

McCormick, B., DeFanti, T., & Brown, M. (Eds.) (1987). Visualization in scientific computing. Computer Graphics (ACM SIGGRAPH), 21(6).

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.

Nickerson, R. S. (1969). Man-computer interaction: A challenge for human factors research. Ergonomics, 12, 501-517.

Penniman, W. D. (1975). Rhythms of dialogue in human-computer conversation. Unpublished doctoral dissertation, The Ohio State University.

Phillips, R. J. (1986). Computer graphics as a memory aid and a thinking aid. Journal of Computer Assisted Learning, 2, 37-44.

Reynolds, R. E., & Baker, D. R. (1987). The utility of graphical representations in text: Some theoretical and empirical issues. Journal of Research in Science Teaching, 24, 161-173.

Shneiderman, B. (1987). Designing the user interface: Strategies for effective human-computer interaction. Reading, MA: Addison-Wesley.

Tankard, J. W. (1987). Quantitative graphics in newspapers. Journalism Quarterly, 64, 406-415.

Tolle, J. E., & Hah, S. (1985). Searching Online search patterns: NLM CATLINE Databases. Journal of the American Society for Information Science, 36, 82-93.

Veith, R. (1988). Visual information systems: The power of graphics and video. Boston, MA: G. K. Hall.

Winn, W. (1980). The effect of block-word diagrams on the structuring of science concepts as a function of general ability. Journal of Research in Science Teaching, 17, 201-212.

Williams, F., Rice, R. E., & Rogers, E. M. (1988). Research methods and the new media. New York: Free Press.

478 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE-August 1991