


MONITORING AND EVALUATION OF ON-LINE INFORMATION SYSTEM USAGE

W. D. PENNIMAN Research Department, OCLC, Inc., 1125 Kinnear Road, Columbus, OH 43212, U.S.A.

and

W. D. DOMINICK

Computer Science Department, University of Southwestern Louisiana, Lafayette, LA 70504, U.S.A.

(Received March 1979)

Abstract-The paper presents a background survey of the existing state-of-the-art as it relates to monitoring of information systems. It addresses both historical and current approaches and both manual and automated techniques. The general concept of automated monitoring is developed into a well-defined methodology by categorizing the generic uses for monitoring, identifying specific objectives of monitoring and translating these objectives into detailed parameters. Methodologies, techniques and theoretical foundations essential for analyzing monitored data are formulated. Desirable computer-based support requirements for data analysis also are discussed. Conclusions and implications for future research and development efforts in monitoring and evaluation of on-line information systems are highlighted.

1. INTRODUCTION

In recent years, information systems have rapidly changed in response to increased data management and access demands placed upon them by business, government and academia. Often, the changes to information systems are necessitated after a system becomes operational. If information systems are to be truly responsive to users’ needs, change itself must be considered in the process of systems design. Systems design can no longer be viewed as a one-time effort resulting in a static design that is unchanging for the operational life of the system. Rather, it must be predicated on the principle that change of some sort is inevitable and, in fact, desirable.

The problem is, however, determining what must be changed, when it must be changed and what it should be changed to. In this light, facilities for measuring and evaluating system capabilities and effectiveness assume a role of critical importance.

Although it may be very apparent that a problem exists within an information system environment, it usually is anything but apparent what the solution to that problem might be. For example, if system response time is at an unacceptable level, the problem (poor response time) will be obvious to every system user. What is not obvious is how to solve the problem, i.e. how to isolate and identify the bottlenecks and then eliminate them.

Analogously, it may be quite clear from rumors, complaints, or discussions with specific users that users are dissatisfied with the information system. This dissatisfaction, however, cannot be alleviated unless the system staff can isolate the reasons for the dissatisfaction, e.g. which system capabilities are too difficult to use, which user language constructs are too error prone, which display formats are unacceptable due to content (too much or too little) or aesthetic reasons.

The intent of this paper is to survey approaches to problems of this nature and to present the underlying concepts of automated monitoring mechanisms within information systems. These concepts are presented as a viable solution to the problem area, both for present and future systems.

The two concepts of prime importance within this paper are the concepts of “monitoring” and “evaluation”. Monitoring is defined as the process of collecting data associated with the functioning and/or usage of a system.




Evaluation is defined as the process of analyzing the functioning and/or usage of a system so that decisions can be made concerning the effectiveness of the system in satisfying its design objectives.

While it is possible to monitor data without evaluating it, this activity is inconsequential and a waste of time. On the other hand, data evaluation without monitoring can be extremely valuable, as in the case of using analytic modeling techniques for certain problems in lieu of analyzing empirical data.

With the exception of parts of Section 2, the measurement-oriented focus of this paper is on automated software monitoring wherein the monitoring functions are implemented in computer software in contrast to being performed manually or implemented in hardware/firmware.

(b) Overview of the paper
Section 2 of this paper presents a background survey of the existing state-of-the-art as it relates to monitoring of information systems. It addresses both historical and current approaches and both manual and automated techniques.

Section 3 develops the general concept of automated monitoring into a well-defined methodology, categorizing the generic uses for monitoring, identifying specific objectives of monitoring and translating these objectives into detailed parameters.

Section 4 formulates the methodologies, techniques and theoretical foundations essential for analyzing monitored data. Desirable computer-based support requirements for data analysis also are discussed in this section.

Finally, Section 5 highlights conclusions and implications for future research and development efforts in monitoring and evaluation of on-line information systems.

2. BACKGROUND SURVEY

The user’s relationship with an information system has been the subject of study for many years. A landmark series of investigations by the United States Government, the “DOD User Studies” (AUERBACH, 1965), provided a comprehensive view of the then-current uses of scientific/technical information systems. Data collection was by survey and interview; data analysis was straightforward and summary in nature.

Those studies are history now. New systems have emerged and users have been sensitized to the potential of on-line access to large bibliographic and data-oriented files. New information services have entered the marketplace and the researcher and technician have found new ways to resolve their information acquisition dilemmas.

At the same time that information systems were going through such radical change, their design process remained relatively stable. A conscientious systems designer first gathered data from existing or potential users, designed the system, tested a prototype, implemented the full scale system and then waited for user reaction that might lead to system modifications (provided the design was able to accommodate change). While lip-service always was given to “user-oriented systems”, the user to whom the system was oriented existed primarily in the designer’s mind and tended to be more systems-oriented than the actual user group.

Occasionally studies appeared in which the user was stressed as an important component of the system design function, one that required special analysis techniques. PARKER (1970) in discussing the SPIRES development experience, identified eight ways of obtaining information about users. These were:

- literature review
- in-depth interviews
- secondary analysis of other survey data
- questionnaires
- informal observation and consultation
- formal interpersonal contact (committees, memoranda, etc.)
- message switching facility
- unobtrusive observation

Unfortunately, these data collection techniques have not been used often and are seldom formalized. An extensive literature search for hard data regarding the user, his characteristics and his information use patterns resulted in a meager find compared to information systems design/implementation activity in recent years.



Yet more meager are the number of studies that result in actual measurements or data regarding user knowledge, attitude, or behavior. This is most distressing in a field where performance measurement and evaluation should be crucial. Lord Kelvin stated our concern most clearly when he said, “. . . when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of Science, whatever the matter may be” (KELVIN, 1891-1894).

Prior to the advent of on-line interactive retrieval systems, the application of unobtrusive measures of user behavior was most difficult. Now it is perhaps the easiest of techniques, using monitor programs built into the information systems.

In order to discuss monitoring as it can be applied today, a review of the alternative techniques is useful. Material presented in the following paragraphs and previous monitor research also reported in this section are summarized in Table 1. The following brief review is limited to those techniques that have actually been applied, as indicated by an extensive literature search. The review excludes some techniques from behavioral science which could be applied,† such as critical incident analysis (FLANAGAN, 1954).

An early study by MEISTER and SULLIVAN (1967) evaluated user reactions to the prototype NASA/RECON system by frequency of use data and user opinion. The data and opinion were collected by observation, interviews, questionnaires and controlled experimental tests. Unfortunately, Meister and Sullivan ran into operational difficulties with the prototype system and only conducted four or five controlled performance tests. Note that the controlled experimental test technique is missing from the previously presented list by PARKER (1970). There have been a few such experiments reported in the literature, however. SEVEN et al. (1971) reported on an experiment to evaluate controlled pacing of user interaction (via lockout) in a problem-solving environment. MELNYK (1972) tested the influence of interface features on user frustration in an information retrieval task. CARLISLE (1974) evaluated interface complexity impact on effectiveness of problem solving for users with varying psychological characteristics in a gaming situation. BACK (1976) developed and applied a procedure for analyzing problem-solving behavior of users in an interactive reference retrieval system. RINEWALT (1977) reported on a series of feature evaluation experiments in which various user groups were denied selected system features to determine the impact on user performance and system load.

The previously mentioned work by Meister and Sullivan is important because it involved a variety of techniques, was concerned with user attitude as well as action and compared off-line to on-line system performance. Most important, the study was conducted as part of a design evaluation procedure prior to full implementation of the system. With respect to on-line system design, they concluded: “It appears mandatory that before such an advanced system is produced, its prototype should be exposed to test with a sample of users to determine and eliminate those features of the system which may be objectionable to the user population” (p. 16). Their study incorporated a series of 7-point evaluation scales with which users could rate system factors such as ease of use, speed, worth and acceptability. Several items using 5-point rating scales also were incorporated in the questionnaire. In addition, a 31-item questionnaire was completed by 37 subjects. The initial interviews were informal, in-depth, small group discussions climaxed by completion of the evaluation form. The questionnaire was completed after several weeks of system usage, as was the controlled performance measure mentioned previously.

Data relative to system usage also was collected in the study. The percentage of successful searches was considered the most meaningful measure of performance. A successful search was defined as a search that resulted in the viewing of one or more titles. The data collected automatically included number of searches conducted, number of citations viewed per search and number and type of search terms entered.

†A current project at OCLC is evaluating a wide array of measurement techniques from the behavioral sciences as well as other disciplines in order to develop a “measurement handbook” for research on user behavior.


Table 1. Summary of data collection studies for information system user attitude and behavior

Data Collection Technique | Related Work | Comment
Interview | Barrett et al. (1968); Meister and Sullivan (1967); Peace and Easterby (1973) | In-depth, focused interviews could provide useful data, particularly if critical incident techniques are applied.
Questionnaire | Kennedy (1975); Lucas (1978); Barrett et al. (1968); Katzer (1972); Levine (1978); Meister and Sullivan (1967); Peace and Easterby (1973); Wanger et al. (1976); Borman and Dominick (1978) | More work is needed in design and validation of such instruments. Scales are needed which are oriented toward information system evaluation.
Expert Panel | Martin et al. (1973) | Expert panel may merely support current biases regarding user characteristics.
Monitor of System Usage, by human | Kennedy (1975); Meister and Sullivan (1967) | Time-consuming, but useful in analyzing the conceptual processes of the users if they discuss strategies as they search.
Monitor of System Usage, by computer | Kennedy (1975); Lucas (1978); Mittman and Dominick (1973); Penniman (1974); Meister and Sullivan (1967); Urban and Dominick (1977) | Despite great potential for this method of system evaluation, it is not yet widely applied.
Monitor of System Usage, by other device | Powers et al. (1973) | Measured physiological variables while subjects used on-line system. Video recordings of users could also be used, but none have been reported.
Secondary Data | Lucas (1978) | Used personnel records to gain information on users' background (note privacy issue).
Controlled Experiments | Meister and Sullivan (1967); Seven et al. (1971); Melnyk (1972); Carlisle (1974); Back (1976); Rinewalt (1977); Dominick and Urban (1978b) | This area needs a great deal more attention with respect to information retrieval system-oriented experimental studies.



The method of “automatically” collecting this data was not described, but it could have been by a system interaction monitor or by retaining hardcopy from the search process.

Another system evaluation study conducted at about the same time involved the CIRC system serving the intelligence community (BARRETT et al., 1968). This study used questionnaire and interview techniques to evaluate user satisfaction with the system. The total number of respondents is not reported, but only 10% of the questionnaire respondents were included in the interview process. This 10% subsample included users who indicated both satisfaction and dissatisfaction with the system.

The questionnaire contained questions involving magnitude estimator scales, graphic rating scales, quantifiable answers and open-ended opinion responses. Eight systems (including CIRC) were compared for importance to the subject’s job, ease of use, satisfaction and familiarity. In addition, expected lag time between publication and availability of documents was probed as well as user desire for evaluated output. Open-ended questions covered search areas, current awareness, profiles, training, personal files, reasons for not using system nomenclature, liked and disliked features of the system and suggested changes.

The interviews were used to investigate job duties, favorable and unfavorable experiences with the system (close to a critical incident technique), influence of system on task performance, system deficiencies, adequacy of indexing, non-system material required and knowledge/training regarding the system.

The results of the interview and questionnaire techniques were analyzed for key factors in determining user satisfaction with the system. Key factors included training and level of proficiency, amount of relevant information in the system and the user’s tolerance of irrelevant information. A model of satisfaction was prepared from the analysis results.

The questionnaire technique again was used by KATZER (1972) to assess users’ attitudes toward the SUPARS system at Syracuse University. In this study, 71 subjects were asked to complete 20 semantic differential questionnaires. Factor analysis resulted in three independent dimensions being identified as those which users apply in reacting to an on-line system. The factors were: evaluative-specific (i.e. evaluation of the system), desirability and enormity. A semantic differential scale then was developed from these factors. The scale still requires validation and, as the author points out, may be specific to the SUPARS system.

PEACE and EASTERBY (1973) applied construct theory to investigate the conceptual framework of users of information systems. This technique, derived from the field of psychotherapy, uses a device called a repertory grid to elicit and relate conceptual categories or labels applied by individual users to various system characteristics. The technique is useful in exploring individual differences but is quite difficult to apply across large samples of users unless significant constraints are placed upon the technique. (See BANNISTER and MAIN, 1968 for more details.)

WANGER et al. (1976) conducted one of the most comprehensive user surveys to determine the impact of on-line retrieval services, with a heavy emphasis on the service supplier’s perspective. Their study involved over 1200 respondents, about two-thirds of whom were classified as “searchers”. The sample involved users of a variety of on-line services, so individual system details were not evaluated. It is significant that this study represented an effort in which competitive commercial vendors cooperated to collect and disseminate information on the generic class of systems providing on-line information retrieval services.

LEVINE (1978) has recently proposed a user-interview/data base evaluation technique that compares user opinion to data base content. In interviews, users are asked to rank order searchable fields according to perceived importance. The data base then is scanned for content by field and fields again are rank ordered, this time by the data base content. The two rank-ordered lists are compared by non-parametric statistical tests to determine correlation and, therefore, appropriateness of the data base for user requirements. This attitudinal approach to data-base evaluation offers new insight into system requirements based on user perceptions.

Prior to Levine’s proposed technique, other techniques such as that used by MARTIN et al. (1973) involved expert judgment of information scientists regarding desirable system characteristics. In Martin’s study, 19 expert subjects responded to a 147-item questionnaire. Consensus was measured using a probability of agreement by chance of 0.025 for more than 15 or fewer than 5 respondents. Consensus was reached on 70 of the questions involving system interface design.



Several of the questions in which experts reached agreement involved aspects of on-line monitoring, including the concept that the user should be able to monitor the system for resource usage and that the system should be able to monitor the user for usage patterns/problems.

A final user attitude study, by POWERS et al. (1973), actually involved monitoring of a special type to infer user anxiety. In this case, 18 subjects were asked to use the GIPSY information processing system at the University of Oklahoma. These subjects were categorized into three groups: 1. no previous computer exposure, 2. minimal computer exposure (5-10 applications) and 3. advanced computer exposure (more than 10 applications). During system usage the subjects were monitored for blood pressure, heart-rate and electro-dermal response. These physiological measures were compared across time for each group and across groups. Results indicated that elapsed time at the terminal, rather than experience, was the major factor influencing the measured physiological variables.

This study provides an excellent point to introduce work in which unobtrusive monitoring of user-system interaction in an information system environment has a major emphasis. These studies are few in number, but offer considerable promise for providing concrete data on user behavior. Such data, when correlated with other variables (e.g. attitude, experience, training), can help the system designer or data base administrator to make necessary decisions regarding system design/modification.

(b) Studies of unobtrusive monitoring of user-system interaction
ATHERTON et al. (1972) reported on data collected during use of the SUPARS system by a monitor which kept track of search terms used within each session. Very little data was presented on monitor results although the system included statistical analysis routines to provide basic use data.

MITTMAN and DOMINICK (1973) provided a detailed analysis of 130 on-line sessions involving over 50 hours of interaction. Several variables were recorded including real time, CPU time, number of search terms, number and type of user errors, number of records scanned, number of print reports, number of display reports and user comments. The RIQS-ONLINE system at Northwestern University in which this monitor was applied offered a rich source of data for user evaluation. Much of the monitoring philosophy and procedures presented within this paper are derived from work initiated by Dominick on the RIQS system.

PENNIMAN (1974) reported on a similar study involving the BASIS system at Battelle Columbus Laboratories. User actions and timing were recorded for over 900 sessions involving over 200 hours of interaction. In this study the pattern or order of user actions was evaluated in conjunction with the real-time pacing of the actions. Again, many of the concepts and procedures presented in this paper resulted from this earlier study of the BASIS system.

KENNEDY (1975) reported a study involving 50 hours of interaction over 35 users for a medical data entry system in a hospital environment. This study consisted of an attitudinal pretest and visual observation of the users as well as monitoring via computer. Several variables were evaluated including commands used, elapsed time and number and type of text entered. Kennedy points out that monitoring can be a means of adapting the system to user ability and a changing environment-a point that will be elaborated in this paper.

One of the most promising pieces of work reported recently is by LUCAS (1978) and involved use of a retrieval system oriented toward medical research. It is promising because it not only involved an on-line monitor collecting data on session variables for 27 subjects, but also it involved a questionnaire for collecting background, personal data, research experience and productivity information, attitudes and system use for 180 subjects. The research began with a descriptive model which provided a number of hypotheses regarding user characteristics, attitudes and behavior. These hypotheses were then tested using the data collected by the questionnaire, personnel records and monitoring.

The material presented in this section and summarized in Table 1 indicates the variety of methods used in the past to evaluate user characteristics and behavior. It also illustrates the need for broader application of available techniques. Too few existing information systems are applying these methods of user evaluation; too few researchers are refining these methods or developing new methods for user evaluation. The following sections elaborate on the most recently available technique for evaluating user behavior, i.e. monitoring. Monitoring promises to provide abundant data if properly incorporated into emerging information systems.



(c) Current monitoring techniques


If there is a major barrier to the useful application of monitoring techniques to system design/evaluation, it is the user’s fear of invasion of privacy. None of the major commercial on-line service vendors (BRS, SDC, Lockheed) reports the capability and/or use of an on-line monitor to determine user patterns or problems. The reason they give for not using a monitor is fear of loss of patrons if system use is monitored. Minimal monitoring (e.g. user ID, session length, print requests, system resources used) is used, however, for proper charging in systems where cost is to be recovered. With proper guidelines, it should be possible to expand this monitoring to include other useful variables for system design and evaluation. User permission would be required for some levels of monitoring, but in general there should be little problem with content-free monitoring (i.e. no collection of actual search terms entered or specific documents displayed/printed). A brief evaluation of the privacy issue is presented here because it is so important to the future development of a most promising research/development tool.

If we view monitoring as a multi-level process, we can collect data at three general levels. These levels are:

- Complete protocol: includes verbatim records of user/system interaction for an entire session or selected portions of the session. An indication of system resources in use (i.e. status reports) and clock time would also be included.
- Function or state traces: maps the protocol onto a predetermined set of categories or states at the time of data capture. This technique can be used to mask specific user actions which might indicate subjects or topics searched and specific documents retrieved.
- General session variables: records such variables as sign-on and sign-off times, data bases accessed, resources used and number of documents retrieved/printed. This is the minimal information needed for cost recovery in most cases.
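As an illustrative sketch only (not from the original paper), the three collection levels could be represented as distinct monitor record types; all record layouts and field names below are hypothetical assumptions:

```python
# Hypothetical sketch of the three monitoring levels described above.
# Field names are illustrative assumptions, not taken from any actual system.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class GeneralSessionVariables:
    """Minimal data usually needed for cost recovery."""
    user_id: str
    sign_on: datetime
    sign_off: Optional[datetime]
    data_bases_accessed: List[str] = field(default_factory=list)
    cpu_seconds_used: float = 0.0
    documents_retrieved: int = 0
    documents_printed: int = 0


@dataclass
class StateTraceEvent:
    """Function or state trace: a user action mapped onto a predefined
    category at capture time, masking the actual search content."""
    session_id: str
    clock_time: datetime
    state: str  # e.g. "SEARCH", "DISPLAY", "PRINT", "ERROR"


@dataclass
class ProtocolEvent:
    """Complete protocol: verbatim record of one user/system exchange."""
    session_id: str
    clock_time: datetime
    user_input: str        # full text as typed, including search terms
    system_response: str
    resources_in_use: str  # e.g. a status report snapshot
```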

Monitor data can provide: 1. direct and immediate diagnostic aid to the user, 2. grouped or individual user evaluation, e.g. analysis of user performance and user success/satisfaction, 3. system protection, e.g. diagnosis of attempts at unauthorized system access and 4. system evaluation. Table 2 illustrates the areas most sensitive to user privacy and indicates why there is so little reported evaluation of monitor data in the available literature. This table does not indicate users’ perceived privacy threats, which could well be greater than the real threats.

3. MONITORING METHODOLOGY, OBJECTIVES AND PARAMETERS

As characterized in the previous section, automated monitoring represents an extremely powerful and potentially flexible technique for collecting data necessary to perform evaluations of information system performance and analyses of user interactions with the information system.

To provide a perspective of the potential for utilizing automated monitoring mechanisms within information system environments, this section identifies some of the basic uses for monitoring. It also presents the general methodology applicable to information system monitoring and identifies specific monitoring objectives and parameters.

Table 2. Privacy impacts on monitor application to information systems

Monitoring Level | Real-Time User Diagnostics | Individual User Evaluation | Grouped User Evaluation | Computer System Protection | Computer System Evaluation
General Session Variables | Low | Low | Low | Low | Low
Function or State Traces | Low | Medium* | Low | Low | Low
Complete Protocol | Low | High | Medium** | High*** | High

* Trace can be highly volatile.
** Can be high for proprietary groups/data bases.
*** Currently a common practice with daily transaction files.





(a) Potential uses for monitoring

The potential uses for monitoring can be broadly categorized as follows:
1. The use of monitoring to support evaluations of the execution efficiency of the information system,
2. The use of monitoring to support evaluations of the interfaces between the information system and the other components of the computing environment,
3. The use of monitoring to support evaluations of the interfaces between the user and the information system.

The execution efficiency of an information system depends upon numerous factors (URBAN and DOMINICK, 1977). These factors range from the implementation environment (machine and storage device characteristics, sophistication of supportive operating system features, capabilities of the system’s implementation language and the efficiency of generated object code) to characterization of individual data bases supported (data base size, structuring and content).

During execution, an information system interfaces with several levels of supportive technology which may include one or more hardware-implemented machines, firmware- or software-implemented virtual machines, operating systems and storage device hierarchies. The entire process of execution is, of course, initiated by the information system interfacing with the user.

Monitoring can be utilized effectively in each of these areas to collect data necessary to evaluate and improve internal system execution efficiency, to optimize information system use of supportive computing resources and to analyze and improve the user interface to the information system.

(b) General methodology

The general methodology applicable to the monitoring and evaluation of any computer-based system can be subdivided into the following phases:

1. Determine the monitoring/evaluation objectives,
2. Determine the specific parameters to be monitored initially, based upon the overall objectives,
3. Design and implement the monitoring facility into the system,
4. Design and implement the data analysis tools to be used in analyzing the monitored data, if such analysis tools are not already available,
5. Design and conduct the monitoring experiment to collect the data to be analyzed,
6. After the experiment has been completed, perform the data analyses, making evaluations and drawing conclusions, as appropriate,
7. Identify system improvements and enhancements as implied by the results of the analyses,
8. Identify monitor improvements and enhancements as implied by the results of the analyses. This may involve adding new parameters that were found necessary, deleting existing parameters that were found not necessary, or modifying existing parameters to collect more detailed or more aggregated data,
9. Identify experimental design improvements and enhancements,
10. Apply the results of phases 7 through 9 to implement the identified improvements and enhancements to the system, to the monitor and to the design of the data collection experiment,
11. After a period of time which depends upon the initial objectives, cycle back through phases 5 through 10.

The general phases identified above are relevant to the monitoring and evaluation of any type of computer-based system, whether it be an information system, an operating system, a management information system, a decision support system, or any similar type of system (see DOMINICK and URBAN, 1978b).

A schematic diagram illustrating these phases of the general methodology is presented in Fig. 1.



Fig. 1. Information system monitoring and evaluation schematic (see General methodology section for discussion of phase numbers identified in parentheses).

(c) Potential objectives of monitoring
While the previously identified general methodology is applicable to the monitoring and evaluation of any computer-based system, this paper focuses on information systems. Within the context of information systems, the above methodology may be applied to a wide range of potential monitoring and evaluation objectives (BORMAN and DOMINICK, 1978). Some of the types of objectives which could be addressed for information systems include:

1. Comparative analysis of two or more versions of the information system to identify the relative improvements made in one version over a previous version.

2. Comparative analysis of two or more structurings of a data base to identify the relative improvements made in one structuring over a previous structuring.

3. Analyses to determine the efficiency of interactions between the information system and its supported data bases or between the information system and its supportive operating system and computer hardware.

4. Analyses to depict the usage of the information system or of any specific data base maintained by that information system. This involves identifying profiles of user behavior and analyzing user interaction patterns with the information system and with specific data bases.

5. Analyses to determine user success and satisfaction with the information system.



The first three objectives relate primarily to the area of computer system performance measurement and evaluation. Objectives 4 and 5 relate primarily to the area of user interaction analysis. Since the performance of the information system certainly can affect the behavior pattern of users interacting with that information system, it can be argued that user interaction analysis cannot be entirely divorced from system performance measurement and evaluation.

Acknowledging these interdependencies, this paper addresses monitoring both within the context of system performance measurement and evaluation and within the context of user interaction analysis.

Monitoring is performed to assist various categories of individuals in performing their tasks associated with information system usage. The three generic levels or categories of users toward which monitoring can be directed are the system administrator level, the data base administrator level and the user level (DOMINICK, 1977; DOMINICK and URBAN, 1978a). The objectives of monitoring to assist individuals within each of these categories are addressed herein.

(1) Monitoring to assist the system administrator. The System Administrator level includes the manager/administrator of the information system and the system’s design, implementation and maintenance staff. The primary responsibilities associated with this level include overall system management, system testing, system enhancement and system performance measurement and evaluation.

The objectives of monitoring intended to assist the information system administrator level would include the following:

1. The collection of system usage profiles to ascertain how the system is being used: (a) The type and complexity of the operations performed. (b) The type and context of errors made. (c) The time spent within various components, i.e. subsystems, of the system. (d) The cost of system and subsystem usage.

2. The collection of data for supporting data base performance evaluation: (a) CPU time statistics. (b) I/O time or paging activity statistics. (c) Real time statistics. (d) Data base storage requirement statistics and accessing time statistics.

3. The collection of data for supporting monitor overhead evaluation: (a) Monitor code storage requirements. (b) Monitor data file storage requirements. (c) Monitor execution time and cost statistics.

4. The collection of data for supporting prediction and projection analysis: (a) Predictive equations for operation execution timings and costs. (b) Predictive equations for data base storage requirements. (c) Predictive equations for monitor overhead. (d) Projection of system performance over a new or modified expected usage pattern, system load, data base structure, operating system, etc. (A simple illustration of such a predictive equation follows this list.)
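As a sketch of how the predictive equations mentioned in item 4 might be derived from monitored data, the following assumes a simple linear relationship between the number of records searched and search CPU time; the variables, the linear form and the sample values are all invented for illustration:

```python
# Hypothetical sketch: fit a least-squares predictive equation for search CPU
# time as a function of records searched, using values a monitor might have
# collected. The numbers below are invented.
import numpy as np

records_searched = np.array([1_000, 5_000, 10_000, 20_000, 40_000])
cpu_seconds      = np.array([0.4,   1.7,   3.1,    6.4,    12.9])

# cpu_seconds is approximately slope * records_searched + intercept
slope, intercept = np.polyfit(records_searched, cpu_seconds, deg=1)

def predicted_cpu_seconds(n_records: int) -> float:
    """Projected CPU cost of a search over n_records records."""
    return slope * n_records + intercept

print(f"Projected CPU time for 100,000 records: {predicted_cpu_seconds(100_000):.1f} s")
```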

(2) Monitoring to assist the data base administrator. The Data Base Administrator level includes the administrator of a particular data base and the analysts and application programmers supporting that data base. The primary responsibilities associated with this level include data base definition, security control, maintenance, application programming and data base redefinition/restructuring when necessary.

The objectives of monitoring intended to assist the data base administrator level would include the following:

1. The collection of data base usage profiles to ascertain how the data base is being used: (a) The records and items searched most frequently. (b) The records and items output most frequently. (c) The keywords and index terms used most frequently. (d) The type and context of errors made. (e) The cost of data base usage. (A tallying sketch follows this list.)

2. The collection of data for supporting data base performance evaluation: (a) CPU time statistics. (b) I/O time or paging activity statistics. (c) Real time statistics. (d) Data base storage requirement statistics and accessing time statistics.
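A minimal sketch of how the usage profiles in item 1 above might be tallied from monitor records; the event layout and field names are assumptions, not drawn from any particular system:

```python
# Hypothetical sketch: tally data base usage profiles from monitored search
# events. Each event is assumed to be a (field_searched, index_term) pair;
# real monitor records would carry more detail.
from collections import Counter

monitored_searches = [
    ("TITLE", "energy"), ("AUTHOR", "smith"), ("TITLE", "solar"),
    ("TITLE", "energy"), ("SUBJECT", "energy"), ("AUTHOR", "jones"),
]

fields_searched = Counter(field for field, _term in monitored_searches)
terms_used = Counter(term for _field, term in monitored_searches)

print("Fields searched most frequently:", fields_searched.most_common(3))
print("Index terms used most frequently:", terms_used.most_common(3))
```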

(3) Monitoring to assist the user. The objectives of monitoring intended to assist the information system user would include the following:

1. The collection of complete system usage profiles and data base usage profiles so that the system administrator and data base administrators could respond to user needs and user requirements as exhibited by usage patterns. Actions could include redesigning and/or enhancing system features or data base content.



2. The collection of complete system performance data and data base performance data so that the system administrator and data base administrators could improve user productivity by optimizing system performance or data base structuring.

Improving the service to the user may take the form of adding new features or optimizing the performance of existing functions. It may involve redesigning the user language, improving diagnostic messages, developing tutorial sequences, or enhancing documentation, e.g. system users manuals, data base guides, sample searches. Real-time user aids based on monitor data can take the form of tailored diagnostics, proactive prompting and individualized dialogue content.

Interaction monitoring can be used to collect the data necessary to make accurate evaluations of current effectiveness and to make intelligent decisions concerning how to improve effectiveness at each of these levels.

(d) Identification of potentially relevant parameters
This section first identifies the minimal required set of parameters that are necessary to support the types of monitoring and evaluation objectives identified previously. Then certain considerations are discussed relevant to environment-dependent, system-dependent, or application-dependent extensions to this minimal required set.

(1) Minimal required set. A minimal required set of parameters is defined for three basic categories of evaluation considerations: system usage profile and data base usage profile data measures; user error and error recovery data measures; and user success and user satisfaction data measures.

a. System usage profile and data base usage profile potential data measures. Potential data measures for determining system usage profiles and data base usage profiles would include the following types of data items:

1. User’s name and user’s affiliation. This information can be used to determine who are the users of the system and who are the users of specific data bases being maintained by the system. Affiliation is suggested in addition to name since the same user may, at different times, be accessing the system as a member of a different user community. For example, at one time a user may be a member of a class (faculty member or student), at another time a member of an administrative staff (a project manager or team member) and at another time an independent researcher. While information such as user’s name and user’s affiliation is desirable to have, this information must be collected optionally at the discretion of the user to ensure that a user’s right to privacy is not violated.

2. Date of interactive session. This information can be used to determine how often users use the system or a particular data base and how this use is distributed.

3. Real time the session started and finished. This information can be used to determine distributions of session length in terms of real (clock) time.

4. CPU time the session started and finished. This information can be used to determine distributions of session length in terms of computer time resources.

5. Real time and CPU time durations for major phases of system processing. This information can be used to determine how much time is spent performing the various types of processes supported by the system, e.g. search and retrieval, output processing, data base maintenance operations, tutorial invocations.

6. Operation execution counts. This information can be used to determine how many operations of various types are performed.

7. The full text of the operations requested. This information can be used to determine how complex the operations requested are or which specific types of available options within operations are used, e.g. which types of output reports are generated based upon the types of report formatting options used.

8. Detailed, context-dependent statistics for those operations of primary importance. For specific types of operations of primary importance, detailed statistics associated with that type of operation can be used to determine a full profile of the use of such operations. For example, within information systems the operation of data base searching is of primary importance; detailed statistics associated with each search performed should be collected. Such statistics could include the following: (a) Real time for search start. (b) Real time for search finish. (c) CPU time for search. (d) Number of data base records searched. (e) Number of data base records retrieved. (f) Number of search terms employed. (g) Number of search errors made. (h) Context of each search error made. (i) Full text of the search criteria specification.

9. User ratings of the interactive session. This information can be used to determine user reactions to information system features and capabilities.

10. User comments on the interactive session. This information can be used to determine user reactions to information system features and capabilities as well as to obtain user suggestions for system improvement and enhancement.

11. Session cost. This information can be used to determine distributions of session cost.

It should be noted that a great variety of potential data measures can be generated from the above types of raw data measures. For example, given the number of records retrieved from a search and the total amount of real time for that search, a measure of unit retrieval cost in real time can be generated. Similarly, given the number of search commands executed and the number of search terms used per search command, a measure of search complexity adjusted by the number of search terms employed can be generated.
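The two derived measures just mentioned might be computed along the following lines; the exact formulas are assumptions for illustration, since the paper does not define them:

```python
# Hypothetical sketch of derived measures built from raw monitor data.
def unit_retrieval_cost(real_time_seconds: float, records_retrieved: int) -> float:
    """Real time spent per record retrieved (undefined if nothing retrieved)."""
    if records_retrieved == 0:
        return float("inf")
    return real_time_seconds / records_retrieved


def adjusted_search_complexity(search_commands: int, total_search_terms: int) -> float:
    """Average number of search terms per search command, one plausible reading
    of 'search complexity adjusted by the number of search terms employed'."""
    if search_commands == 0:
        return 0.0
    return total_search_terms / search_commands


print(unit_retrieval_cost(real_time_seconds=45.0, records_retrieved=15))      # 3.0 s per record
print(adjusted_search_complexity(search_commands=4, total_search_terms=10))   # 2.5 terms per command
```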

b. User error and error recovery potential data measures. Potential data measures for determining information about user errors and error recovery techniques would include the following types of data items:

1. Error frequency counts. This information can be used to determine what types of errors are being made, how frequently these errors are being made and to what extent one type of error is being made in conjunction with another type of error.

2. Error context. This information can be used to determine in which contexts specific types of errors are being made. For example, is the error due to a simple misspelling of a system keyword or due to an attempted use of an invalid construct in the user language? Error context can also be used to determine the complexity of the attempted operation and the types of operations which are most error prone. This would provide implications for user language redesign, for documentation improvement, for tutorial emphasis, or for highlighting the need for illustrative sample searches using such error prone constructs.

If the above information is collected in a time-ordered sequence, error recovery techniques could be examined. This would entail analyzing the number of user retries necessary to correct an error, analyzing the extent to which non-repetition of specific types of errors is evident (possibly implying that a learning process has taken place) and analyzing what specific approaches users take to recover from errors. With the latter technique, it could be determined to what extent these approaches involved a major or minor reformulation of the search as it was originally expressed by the user.
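A sketch of how a time-ordered error log might be examined for recovery behavior; the event encoding and the definition of a "retry" are assumptions made for illustration:

```python
# Hypothetical sketch: count how many attempts a user needed to recover from
# each error, given a time-ordered trace of outcomes for one session.
# 'E' marks an operation that ended in error, 'S' one that succeeded.
from typing import List


def retries_to_recover(outcomes: str) -> List[int]:
    retry_counts = []
    pending_retries = None
    for outcome in outcomes:
        if outcome == "E":
            if pending_retries is None:
                pending_retries = 0      # a new error episode begins
            else:
                pending_retries += 1     # another failed attempt
        elif outcome == "S" and pending_retries is not None:
            retry_counts.append(pending_retries + 1)  # attempts needed to correct
            pending_retries = None
    return retry_counts


print(retries_to_recover("SESEEESSE"))  # [1, 3]: one error fixed in 1 retry, one in 3
```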

c. User success and user satisfaction potential data measures. Potential data measures for ascertaining information concerning user success and satisfaction would include both direct or explicit measures and surrogate or derived measures.

Direct measures represent explicit statements by the user rating or commenting on success or satisfaction. Direct measures would include:

1. User ratings. This information, obtained either from rating questions directly asked by the information system or from rating questions on system usage questionnaires, can provide direct statements of user success and satisfaction.

2. User comments. This information, obtained either from comments entered by the user via a system comment command or by comments entered on system usage questionnaires, can provide direct statements of user success and satisfaction.

Such direct measures are, however, rarely available. Rating questions cannot be asked after each user operation since such frequent questioning would interfere with the user’s primary objective of obtaining useful information from the system. Similarly, user comments are often not available; users rarely bother to comment on their success or satisfaction. Thus, while direct measures are, in fact, the perfect measures of user success and user satisfaction, they are not available often enough. Additional measures are needed.




Surrogate measures represent measures that are derived from various user interaction data, but which do not include any explicit statements of user success or satisfaction. Due to the level of detail required for specifying such measures, a discussion of surrogate measures is beyond the scope of this paper (see BORMAN and DOMINICK, 1978). A detailed identification of potentially relevant surrogate measures of user success/satisfaction, together with experimental results obtained from applying such surrogate measures to a large body of monitored data, will be the subject of a future paper.

(2) Extensions to the minimal required set. The parameters just identified are not totally comprehensive, nor are they necessarily applicable in their entirety to every information system environment.

While those parameters are the minimum needed to support the types of analyses and evaluations overviewed under “Potential objectives of monitoring”, there are certainly parameters which cannot be generalized due to their inherent environment-dependent, system-dependent, or application-dependent nature. In such cases, extensions to the minimal required set of parameters are appropriate.

a. Environment and system dependencies. Parameters that are environment-dependent are dictated by the particular characteristics of, or constraints on, the environment in which the information system is functioning. Similarly, parameters which are system-dependent are dictated by the specific processing capabilities of the system itself.

An information system may be operating in a computing environment in which disc storage is an extremely critical resource due to numerous contentions for limited disc space. In such an environment, it may be necessary to define very detailed parameters for monitoring the efficiency of the data base storage space allocation algorithms to support evaluations of the effectiveness of disc storage utilization.

In another environment, there may exist absolute constraints on the level of acceptable average response time. In this situation, it may be appropriate to extend the minimal parameter set to include very detailed timing statistics that characterize each stage of transaction processing to provide the framework for overall system response time optimization.

In yet another environment, the supportive computer hardware may be configured into a system network architecture comprising various front-end processors for data communication and network supervisor functions, various applications hosts for implementing the required applications sub-systems and various back-end processors for data base management. The set of parameters to be monitored in such a network environment may differ considerably from those relevant to an information system implemented on a single processor. Monitoring research on information systems which function within multiple-processor, network environments is currently being conducted by DOMINICK and PENNIMAN (1979) and by DOMINICK (1978).

b. Applications dependencies. Parameters which are application-dependent are dictated by the specific characteristics of the application(s) being supported.

Within an information system designed to support library functions, a cataloging applications subsystem may allow users to retrieve bibliographic records, edit the retrieved records and format the data for catalog card generation. Application-dependent monitoring may be desired to provide information concerning the number of records retrieved via that subsystem, the types of editing performed and the number of catalog cards generated.

Numerous other examples obviously exist. The important point to realize is that while a considerable number of monitoring parameters can be defined in a generalizable manner, each environment, each system and each application may dictate the need for specific monitoring parameters tailored to the characteristics of that environment, that system and that application.

4. ANALYZING MONITOR DATA

On-line monitoring provides a means of collecting more data than we have ever had before on information system user behavior. The problem quickly becomes one of data reduction and analysis, not data availability. It is important that the data reduction and analysis follow a systematic set of procedures or little will be gained from this relatively new data source.

There are two basic mechanisms for systematizing the analysis. First, a traditional research approach could be used in which a model or theoretical framework is developed and hypotheses are derived that can be tested with the data.



Data analysis in this case derives from careful experimental design and hypothesis testing using traditional statistical techniques (LUCAS, 1978). Second, an exploratory approach can be taken in which data are analyzed for their descriptive value, and theory flows from the data analysis (GLASER and STRAUSS, 1967).

It can be argued that user on-line interaction has historically suffered from a paucity of theoretical development as well as empirical data. Yet there are several fields which can provide the theoretical frameworks necessary for a unified data analysis process. In terms of previous theoretical foundations for analysis of monitor data, the three most useful areas are communication research, systems and cybernetics research and computer science research. For generation of new theoretical foundations, the method used in the social sciences of “comparative analysis” (GLASER and STRAUSS, 1967) is most promising. The following section describes briefly what each of these areas has to offer as a theoretical framework for monitor data analysis.

(a) Sources of theoretical foundations
Communication research has provided several promising frameworks for human/computer interaction analysis. The techniques for categorizing and tracking the communicative actions of humans as they pursue a common goal have been used in a variety of studies (BALES, 1955). This same approach can be applied to human/computer interaction. Human communication studies focusing on dyadic (two-person) interaction are most interesting because they resemble in form the human/computer dialogue (JAFFE and FELDSTEIN, 1970). Evaluation of human/human dyadic communication also can provide insight into appropriate command languages for human/computer interaction (MANN, 1977).

Many behavioral researchers have tried to use the communication theory of SHANNON and WEAVER (1972) in evaluating the human communication process. While this stretches the theory beyond its original intent, the concept of uncertainty reduction applied to purposeful entities (humans) was expanded by ACKOFF (1958). In general, the approach seemed to preclude human/computer interaction because of the mechanistic and deterministic nature of such interaction. At best, the interaction would seem to be a two-way monologue, but a two-way monologue is a simulation of "real" conversation, as pointed out by AMBROZY (1971). Not only do communication researchers in the behavioral sciences offer new approaches to theory development in this area, but they also have valuable data analysis techniques which can be, and have been, applied in this area (PENNIMAN, 1975). Such tools as analysis of variance, factor analysis and Markov process analysis are equally useful in human/human and human/computer interaction studies.

Systems and cybernetics research provides a second major source of theory development and data analysis techniques. System analysis has evolved from earlier operations research activity into a well-organized and described discipline (CHURCHMAN, 1968). The concepts in system analysis include identification of the components of the "system" to be studied, evaluation of these component interrelationships and analysis of their interactions. The systems sciences rely heavily on mathematical analyses for system identification and evaluation (GRAUPE, 1976). General systems theory expands the approach to complex or large systems which may not be completely decomposable and offers insight into theory building and data analysis where no existing discipline provides adequate tools (WEINBERG, 1975). The application of these techniques to problems of human communication has given researchers a more rigorous and quantitative approach to "soft" systems (RUBEN and KIM, 1975). A systems approach to the evaluation of information services and products has been developed by KING and BRYANT (1971).

Cybernetics, or the study of control and communication processes in machine and animal (WIENER, 1975), is so closely allied to systems analysis that it is included here as a corollary tool for theory development and data analysis with respect to human/computer interaction. ASHBY (1963) provides a clear and useful description of the basics of cybernetics and suggests several useful tools for interaction analysis, including process-oriented state analysis. Here again, the analytical techniques are quantitative in nature and require matrix manipulation techniques, statistical probability functions and other mathematical techniques.

The third area from which information system researchers can benefit is computer performance analysis, a discipline within computer science (SVOBODOVA, 1976; DRUMMOND, 1973). The models, theories and analytical tools used in computer system performance measurement and evaluation ought to be well-suited to the task at hand. After all, a portion of the "system" monitored in human/computer interaction is the computer system itself. Approaches to monitoring, modeling and data reduction have been well-defined by computer scientists. Unfortunately, their evaluations seldom attempt to model the human component of the system except indirectly as it impacts on computer utilization. A recent effort at the National Bureau of Standards (ROSENTHAL et al., 1976) shows promise of combining user and computer evaluation via multiple probes for a network measurement machine. Data evaluation methods employed in computer-oriented evaluation include factor analysis, regression analysis, stochastic analysis and other quantitative techniques. Graphical analysis and presentation techniques assist the computer specialist in evaluating large quantities of diverse data.

Regardless of the discipline from which techniques may be borrowed, there are two basic approaches to theory development and, consequently, ways of gathering and analyzing data, as indicated earlier. The first, the logico-deductive approach, represents the classic technique of doing research. This approach relies on a priori assumptions, theory development, hypothesis testing and verification or rejection, as well as theory acceptance, rejection and modification. Most experimental research relies upon this classical approach. Yet, interactive monitors in controlled experimental settings are not common (see the earlier discussion in Section 2). This is partially due to the limited experimental research in human/computer interaction and partially due to the limited application of monitors currently in information system design and evaluation.

The second approach to theory development (and data collection/analysis) is empirical in nature and heavily grounded in systematically collected data. The "grounded theory" approach is most promising for researchers interested in information system monitoring because the approach forces one to be sensitive to all possible theoretical relevances of the data (GLASER and STRAUSS, 1967). The analytical techniques are no less rigorous than those used in hypothesis testing. However, "fishing expeditions" into the data are not only accepted, but encouraged. Given the present stage of development in information system monitor-oriented research, such flexibility is highly desirable. With this approach, the widest possible array of data analysis/presentation tools is required.

(b) Techniques and methodologies for data analysis and presentation

The techniques for analysis of data collected via an on-line monitor can range from "eye-balling" to sophisticated multivariate analysis. Given the large amounts of data made available via monitoring and the difficulties in extracting representative samples at this early stage of research, the former technique is not very useful as an exclusive approach. "Eye-balling" should never be ruled out, however, since it often provides a direct feel for the interaction process; any reduction technique, of necessity, loses some of the flavor of the interaction. One of the closest methods to "eye-balling" which still takes advantage of computerized analysis methods is a direct content analysis of the user input and system response text. If data are reduced to graphic form, the "eye-ball" method is useful in exploratory research. Since content analysis is the closest formal method to this approach, that is where the discussion of analysis techniques begins.

(1) Data analysis. Content analysis of user input/system output can include a variety of counting procedures based on predefined words/phrases/commands of interest or on concordances developed from the recorded transactions ex post facto. Often, content analysis can be performed on the basis of system flags or parameters rather than message content. This is because the interactive system being evaluated has to perform some type of content analysis on the user input at the time of processing in order to respond appropriately. If such flags are recorded by the monitor, then content analysis may be nothing more than sorting/counting the occurrence of specific flags or state codes of interest.
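
A minimal sketch of this sort of flag counting follows (not from the paper); the log format of one "session_id state_code" pair per line is an assumption made only for illustration.

```python
# Minimal sketch: content analysis reduced to counting state codes or command
# flags recorded by an interaction monitor. The log format is hypothetical.
from collections import Counter

def count_state_codes(log_lines):
    """Tally occurrences of each state code across monitored transactions."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:
            state_code = parts[1]
            counts[state_code] += 1
    return counts

sample_log = [
    "s01 SEARCH", "s01 DISPLAY", "s01 SEARCH",
    "s02 SEARCH", "s02 HELP", "s02 DISPLAY",
]
print(count_state_codes(sample_log))   # Counter({'SEARCH': 3, 'DISPLAY': 2, 'HELP': 1})
```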

A variety of numerical data is typically recorded by an interaction monitor. The relationships between variables represented by such data are of prime interest to the researcher or designer. Using these data, an entire range of parametric and non-parametric statistical analysis techniques can be performed (e.g. PENNIMAN and PERRY, 1976). In addition to tests of significance, other techniques such as structural analysis, regression analysis, correlation analysis and factor analysis are useful (e.g. MITTMAN and DOMINICK, 1973 and DOMINICK, 1975). Multivariate analysis techniques, in general, are most appropriate for initial exploratory research regarding user/system interaction (e.g. BORMAN and DOMINICK, 1978).

Besides the more traditional statistical analysis methods, there are more complex methods of evaluating the data, including stochastic process analysis and other "pattern seeking" approaches. One technique that has already been tried and proven useful involves Markov analysis of sequences of user/system interactions. By treating the interaction sequences as Markov chains, a variety of statistical methods become available for detailed data analysis (see PENNIMAN, 1975). In addition to first-order (Markov) analysis involving a one-step history, the data can be analyzed for a simple distribution (zero-order) or more complex history patterns (n-order).
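
To make the zero-order and first-order analyses concrete, a minimal sketch is given below; it is an assumption of this rewrite, not the paper's software, and the action names are invented.

```python
# Minimal sketch: treating a monitored sequence of user actions as a
# first-order Markov chain and estimating its transition probabilities,
# alongside the zero-order (simple frequency) distribution.
from collections import Counter, defaultdict

def zero_order(actions):
    """Simple frequency distribution of actions (zero-order analysis)."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def first_order(actions):
    """Transition probabilities P(next action | previous action)."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(actions, actions[1:]):
        transitions[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in transitions.items()
    }

session = ["SEARCH", "DISPLAY", "SEARCH", "DISPLAY", "PRINT", "SEARCH"]
print(zero_order(session))
print(first_order(session))   # e.g. P(SEARCH | DISPLAY) = 0.5
```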

(2) Data presentation. Several presentation methods are useful in evaluating interaction monitor data. Table 3 presents a list of methods as a summary of the variety of tools available. Indications of applications of each presentation method in a monitor environment are also noted. Kiviat diagrams (ELLIOTT, 1976) represent a useful method of presenting several variables simultaneously in a single graph and offer a simple means of comparing several interaction "profiles".

Table 3. Data presentation techniques with citations to examples of each in an interaction monitor environment

METHOD | CITATION | EXAMPLES OF VARIABLES PRESENTED
Tabular Display | Rinewalt, 1977; Penniman and Perry, 1976 | Think time; test scores; figure of merit; interactions per session; interactions per minute (mean, SD, median, mode, etc.)
Bar Charts or Histograms | Mittman and Dominick, 1973; Penniman, 1974 | Query complexity; number of sessions by session length; number of sessions by number of user interactions
Combination Plots (multiple variables on the same graph) | Penniman and Perry, 1976 | Probability of session length in minutes for different data base types
Frequency Distributions or Ogives | Penniman and Perry, 1976 | Cumulative probability of session length
Point Plots | Kennedy, 1975 | Data entry time per session versus days of experience with the system
Point to Point Curves | Kennedy, 1975 | Learning curve for terminal users based on mean data entry time over several days
Scattergrams | Dominick, 1975; Mittman and Dominick, 1973; Penniman and Perry, 1976 | Output words vs CP time for search execution; query entry time vs query execution time; CP time vs query complexity (no. of search terms); interactions vs session length
Smoothed Curves or Linear Fits | Kennedy, 1975 | Level of attainment (user skill) vs dispersion of performance for tasks of different complexity
Kiviat Diagrams (several variables presented simultaneously) | Elliott, 1976 | Number of users, response time, number of terminals, % inactive records, total records, average number of output lines, total number of inquiries, average CPU time per query
Transition Matrices | Penniman, 1975a; Penniman, 1975b | Probability of next user action given previous action; elapsed time prior to user taking next action given previous action
Transition Graphs or Diagrams | Penniman, 1974 | Most likely paths of next actions in a search session


Another technique, transition matrices (PENNIMAN, 1975), offers a useful means of presenting user behavior as a process and allows different processes to be compared via statistical tests.
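
As a sketch only (not reproduced from ELLIOTT, 1976), a Kiviat-style diagram comparing two interaction "profiles" on several normalized monitor variables might be drawn as follows; the variable names and values are hypothetical.

```python
# Hypothetical sketch: a Kiviat-style (radar) diagram comparing two
# interaction profiles on several normalized monitor variables at once.
import numpy as np
import matplotlib.pyplot as plt

variables = ["Response time", "Queries/session", "Error rate",
             "CPU time/query", "Output lines"]
profile_a = [0.6, 0.8, 0.3, 0.5, 0.7]   # values normalized to [0, 1]
profile_b = [0.4, 0.5, 0.6, 0.7, 0.4]

angles = np.linspace(0, 2 * np.pi, len(variables), endpoint=False).tolist()
angles += angles[:1]                     # close the polygon

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for label, values in [("System A", profile_a), ("System B", profile_b)]:
    data = values + values[:1]
    ax.plot(angles, data, label=label)
    ax.fill(angles, data, alpha=0.1)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(variables)
ax.legend()
plt.show()
```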

(c) Resources required for data analysis and presentation

A number of capabilities are required in support of the monitor data collection software so that the analyses and graphic materials described in the previous sections can be prepared.

A powerful data base management system with associated search, analysis and display capabilities should be available to store and access the monitor data itself.

To perform the text scanning for content analysis, a powerful string-manipulation, character-oriented language is required. SNOBOL or LISP are well-suited for such tasks and would also be useful in scanning state strings for the stochastic analysis.

Statistical analysis routines such as those available from SPSS (Statistical Package for the Social Sciences) are also necessary and should include likelihood tests for array comparisons (PENNIMAN, 1975). A comprehensive interactive graphics plotting package should be available which includes dynamic axis scaling, text insertion and deletion, display of information associated with the plot and saving and recalling plots for generation of hard copy (DOMINICK and MITTMAN, 1973).
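
The paper names likelihood tests for array comparisons without giving a procedure; as one possible modern stand-in (an assumption of this rewrite, not the original method), a log-likelihood-ratio (G) test could compare two observed transition-count arrays. The counts below are invented.

```python
# Hypothetical sketch: comparing two groups' transition-count arrays with a
# log-likelihood-ratio (G) test via scipy. The counts are invented examples.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: user group A and user group B; columns: counts of the transitions
# SEARCH->DISPLAY, SEARCH->HELP, SEARCH->PRINT observed by the monitor.
observed = np.array([
    [40, 10, 5],
    [25, 20, 10],
])

g_stat, p_value, dof, expected = chi2_contingency(observed, lambda_="log-likelihood")
print(f"G = {g_stat:.2f}, p = {p_value:.3f}, dof = {dof}")
```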

Array manipulation methods including sparse matrix methods should be available. Arrays should not be restricted to two or three dimensions, but should be expandable to accommodate stochastic analysis of higher orders (up to 10 or 12 steps). Hash coding techniques for storage of string data can also be useful where higher order analysis is desired.
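
A minimal sketch of how hash-keyed storage of state strings might support higher-order analysis follows; this is an illustrative assumption, not the paper's software, and the action names are invented.

```python
# Minimal sketch: counting n-step action histories by keying a dictionary
# (hash table) on the joined state string, as a basis for higher-order
# stochastic analysis of monitored sessions.
from collections import Counter

def history_counts(actions, order=3):
    """Tally each consecutive `order`-step history of user actions."""
    counts = Counter()
    for i in range(len(actions) - order + 1):
        key = "|".join(actions[i:i + order])   # state string used as hash key
        counts[key] += 1
    return counts

session = ["SEARCH", "DISPLAY", "SEARCH", "DISPLAY", "PRINT", "SEARCH", "DISPLAY"]
print(history_counts(session, order=3))
```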

While not all of these requirements are available in a single software package, they are available individually. This fragmentation merely re-emphasizes the necessity for storing the monitor data in a DBMS capable of interfacing with external software.

5. CONCLUSIONS AND IMPLICATIONS FOR FUTURE RESEARCH AND DEVELOPMENT

The review at the beginning of this paper indicates that a variety of techniques for studying the information system user are available. While some studies have been conducted using each of the methods identified, there is a need for much more quantitative analysis of the user, the system and user interaction with the system.

On-line monitoring is proposed as a highly useful technique for studying, evaluating and improving systems and user/system interfaces. Objectives for monitoring can be multiple and include comparison of systems and/or data base structures, efficiency evaluation of the system/data base interface, analysis of usage of the system and/or data base and analysis of user success/satisfaction.

Monitoring can be of use to the system administrator, the data base administrator and the user. For each type of monitoring purpose there is a minimal required set of parameters which should be measured. It is important to keep the end goal of the monitoring effort in mind when selecting the set of parameters to be measured.

Once the data are collected, efficient and convenient data reduction, analysis and presentation techniques are required. The theoretical approach to data collection and evaluation can be traditional (i.e. a priori hypothesis formulation) or what is termed "grounded" theory development (i.e. data-driven hypothesis formulation). Regardless of the method chosen, the end goal is to improve the system and the user's comprehension of and interaction with that system.

While monitoring may require additional overhead (and therefore expense), it can be well justified provided a clear set of goals and objectives is established for use of the data collected. Exclusion of such components from an information system for the sake of "efficiency" is a serious design error. It is no different from designing a car with no windows: the driver can set the vehicle in motion, but has no idea where it is going once underway. Feedback, as in all dynamic systems, is essential. Research is needed which uses the feedback available via a monitored on-line system to tailor intelligent, human-like assistance to the user in a real-time fashion (see PENNIMAN and PERRY, 1976). Research is also needed to improve the procedures for system evaluation and enhancement (see DOMINICK, 1975). In addition to the research benefits of on-line monitoring, the operational benefits to data base operators are significant. An information system design is not complete without such a data collection/evaluation component.

Acknowledgements. Portions of the work performed by Dr. PENNIMAN and described in this paper were supported by the Battelle Columbus Laboratories while he was a member of their Information Systems Section. He is currently continuing this research at OCLC, Inc.

Portions of the work performed by Dr. DOMINICK and described in this paper were supported by the National Science Foundation and Northwestern University, Evanston, Illinois under NSF Grant No. DS176-19481 and by the National Science Foundation under NSF Grant No. SER77-06835.

REFERENCES

[1] R. L. ACKOFF, Towards a behavioral theory of communication. Management Sci. 1958, 4(3), 218-234.
[2] D. AMBROZY, On man-computer dialogue. Int. J. Man-Machine Studies 1971, 3, 357-383.
[3] R. W. ASHBY, An Introduction to Cybernetics. Wiley, New York (1963).
[4] P. ATHERTON, K. H. COOK and J. KATZER, Free text retrieval evaluation. Syracuse University School of Library Science, Syracuse, New York; July 1972; Rep. RAK-TR-72-159.
[5] Auerbach Corporation, DOD user needs study, phase 1; 14 May 1965; Final Tech. Rep. 1151-TR-3.
[6] H. B. BACK, The design and evaluation of an interactive reference retrieval system for the management sciences. Unpublished Ph.D. dissertation, Carnegie Mellon University, May 1976.
[7] R. F. BALES, How people interact in conferences. Scientific Am. 1955, 192, 31-55.
[8] D. BANNISTER and J. MAIR, The Evaluation of Personal Constructs. Academic Press, New York (1968).
[9] G. V. BARRETT, C. L. THORNTON and P. A. CABE, Human factors evaluation of a computer based information storage and retrieval system. Human Factors 1968, 10(4), 431-436.
[10] L. BORMAN and W. D. DOMINICK, Profile evaluation, research and modeling for science information systems: A report on the development of a generalized evaluation methodology to study user interaction. Final Rep., NSF Grant No. DS176-19481; June 1978.
[11] J. H. CARLISLE, Man-computer interactive problem solving: relationships between user characteristics and interface complexity. Yale University, New Haven, CT; June 1974; Report to the Office of Naval Research under Contract N~f~67-A-~7-~10. Doctoral dissertation.
[12] C. W. CHURCHMAN, The Systems Approach. Dell, New York (1968).
[13] W. D. DOMINICK, System performance evaluation of interactive retrieval. In Personalized Data Base Systems (Ed. by B. MITTMAN and L. BORMAN). Wiley, New York (1975).
[14] W. D. DOMINICK, User interaction monitoring as a multi-level process. Proc. Am. Soc. Inform. Sci. 1977, 14, 63.
[15] W. D. DOMINICK [Ed.], The University of Southwestern Louisiana ~S~R~AS~~E~S working paper series; Collection of reports to the National Science Foundation by the University of Southwestern Louisiana under NSF Grant No. SER77-06835; 1978.
[16] W. D. DOMINICK and B. MITTMAN, Information retrieval system cost/performance analysis via interactive graphics. Proc. Am. Soc. Inform. Sci. 1973, 10, 63.
[17] W. D. DOMINICK and W. D. PENNIMAN, Interaction monitoring considerations within network-based information systems. Accepted for presentation at the Eighth Mid-Year Meeting of the American Society for Information Science, to be held in Banff, Alberta, Canada; 16-20 May 1979.
[18] W. D. DOMINICK and J. E. URBAN, Techniques for evaluating computer-based systems for numerical data management. Accepted for publication in Proc. of the 6th Int. CODATA Conf., Santa Flavia, Sicily; 22-25 May 1978.
[19] W. D. DOMINICK and J. E. URBAN, Application of a generalized evaluation methodology for analyzing user interactions with the MADAM System at the University of Southwestern Louisiana. University of Southwestern Louisiana, Computer Science Department, Lafayette, Louisiana; Sept. 1978; Tech. Rep. CMPS-78-6-3.
[20] M. E. DRUMMOND, Jr., Evaluation and Measurement Techniques for Digital Computer Systems. Prentice-Hall, Englewood Cliffs, New Jersey (1973).
[21] R. W. ELLIOTT, Kiviat-graphs as a means for displaying performance data for on-line retrieval systems. J. ASIS 1976, 178-182.
[22] J. C. FLANAGAN, The critical incident technique. Psychological Bull. 1954, 51(4), 327-358.
[23] B. G. GLASER and A. L. STRAUSS, The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine, Chicago (1967).
[24] D. GRAUPE, Identification of Systems, p. 276. Krieger, Huntington, New York (1976).
[25] J. JAFFE and S. FELDSTEIN, Rhythms of Dialogue. Academic Press, New York (1970).
[26] J. KATZER, The development of a semantic differential to assess users' attitudes towards an on-line interactive reference retrieval system. J. ASIS 1972, 122-127.
[27] W. T. KELVIN, Popular Lectures and Addresses, 3 Vols. MacMillan, New York (1891-94).
[28] T. C. S. KENNEDY, Some behavioral factors affecting the training of naive users of interactive computer systems. Int. J. Man-Machine Studies 1975, 7, 817-834.
[29] D. W. KING and E. C. BRYANT, The Evaluation of Information Services and Products. Information Resources Press, Washington, D.C. (1971).
[30] E. H. LEVINE, A generalized methodology for determining the correlation of user requirements to the information system. Proc. of the 7th Mid-Year Meeting of the American Society for Information Science, May 1978.
[31] H. C. LUCAS, The use of an interactive information storage and retrieval system in medical research. Commun. ACM 1978, 21(3), 197-205.
[32] W. C. MANN, Man-machine communication research: Final report. Feb. 1977; Rep. ISI/RR-77-57 to the Advanced Research Projects Agency by the Information Sciences Institute under ARPA Order No. 2930.
[33] T. H. MARTIN, J. CARLISLE and S. TREU, The user interface for interactive bibliographic searching: An analysis of the attitudes of nineteen information scientists. J. ASIS 1973, 24(2), 142-147.
[34] D. MEISTER and D. J. SULLIVAN, Evaluation of user reactions to a prototype on-line information retrieval system. 1967; Report to NASA by the Bunker-Ramo Corporation under Contract No. NASA-1369, Rep. No. NASA CR-918.
[35] V. MELNYK, Man-machine interface: frustration. J. ASIS 1972, 392-401.
[36] B. MITTMAN and W. D. DOMINICK, Developing monitoring techniques for an on-line information retrieval system. Inform. Stor. Retr. 1973, 9(6), 297-307.
[37] E. B. PARKER, Behavioral research in the development of a computer-based information system. In Communication Among Scientists and Engineers (Ed. by C. E. NELSON and D. K. POLLACK). Heath Lexington Books, Lexington, Mass. (1970).
[38] D. M. S. PEACE and R. S. EASTERBY, The evaluation of user interaction with computer-based management information systems. Human Factors 1973, 15(2), 163-177.
[39] W. D. PENNIMAN, Rhythms of dialogue in human-computer interaction. Paper presented at the 37th Ann. Conf. of the Am. Soc. Inform. Sci.; October 1974. Available from author.
[40] W. D. PENNIMAN, Rhythms of dialogue in human-computer conversation. Unpublished Ph.D. dissertation, The Ohio State University, Columbus, Ohio, 1975.
[41] W. D. PENNIMAN, A stochastic process analysis of on-line user behavior. Proc. Am. Soc. Inform. Sci. 1975, 12, 147-148.
[42] W. D. PENNIMAN, Suggestions for systematic evaluation of on-line monitoring issues. Proc. Am. Soc. Inform. Sci. 1977, 14, 65.
[43] W. D. PENNIMAN and J. C. PERRY, Tempo of on-line user interaction. Compendium of papers, 5th Mid-Year Meeting of the Am. Soc. Inform. Sci., pp. 10-23; May 1976.
[44] W. G. POWERS, H. W. CUMMINGS and R. TALBOTT, The effects of prior computer exposure on man-machine computer anxiety. Presented at the Int. Commun. Assoc. Ann. Meeting; 25-28 April 1973, Montreal, Canada.
[45] J. R. RINEWALT, Feature evaluation of full-text information-retrieval system. On-Line Rev. 1977, 1(1), 43-52.
[46] R. ROSENTHAL et al., The network measurement machine: a data collection device for measuring the performance and utilization of computer networks. NBS Tech. Note 912; April 1976.
[47] B. D. RUBEN and J. Y. KIM [Eds.], General Systems Theory and Human Communication. Hayden Books, Rochelle Park, New Jersey (1975).
[48] M. J. SEVEN et al., A study of user behavior in problem solving with an interactive computer. Report by RAND to NASA under Contract NAS 12-2144; April 1971; Rep. No. R-513-NASA.
[49] C. SHANNON and W. WEAVER, The Mathematical Theory of Communication. University of Illinois Press, Urbana, Illinois (1972).
[50] L. SVOBODOVA, Computer Performance Measurement and Evaluation Methods: Analysis and Applications. Elsevier, New York (1976).
[51] J. E. URBAN and W. D. DOMINICK, Design and implementation considerations for monitoring and evaluating information systems. Proc. of the 15th Ann. Southeastern Regional ACM Conf., pp. 356-370; 18-20 April 1977.
[52] J. WANGER et al., Impact of on-line retrieval services: a survey of users 1974-75. System Development Corp., Santa Monica, California.
[53] G. M. WEINBERG, An Introduction to General Systems Thinking. Wiley, New York (1975).
[54] N. WIENER, Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press, Cambridge, Mass. (1975).