Re‐appraising information seeking behaviour in a digital environment

Bouncers, checkers, returnees and the like

David Nicholas, Paul Huntington, Peter Williams and Tom Dobrowolski

Ciber (Centre for Information Behaviour and the Evaluation of Research), Department of Information Science, City University, London, UK

Keywords Information management, Worldwide web, Digital communications, National Health Service

Abstract Collating data from a number of log and questionnaire studies conducted largely into the use of a range of consumer health digital information platforms, Centre for Information Behaviour and the Evaluation of Research (Ciber) researchers describe some new thoughts on characterising (and naming) information seeking behaviour in the digital environment and, in so doing, suggest a new typology of digital users. The characteristic behaviour found is one of bouncing, in which users seldom penetrate a site to any depth, tend to visit a number of sites for any given information need and seldom return to sites they once visited. They tend to “feed” for information horizontally, and whether they search a site or not depends heavily on “digital visibility”, which in turn creates all the conditions for “bouncing”. The question of whether this type of information seeking represents a form of “dumbing down or up”, and what it all means for publishers, librarians and information providers, who might be working on other, possibly outdated, usage paradigms, is discussed.

Journal of Documentation, Vol. 60 No. 1, 2004, pp. 24-43. © Emerald Group Publishing Limited, 0022-0418. DOI 10.1108/00220410410516635

Received 02 April 2003. Revised 01 August 2003. Accepted 08 August 2003.

The Emerald Research Register for this journal is available at www.emeraldinsight.com/researchregister. The current issue and full text archive of this journal is available at www.emeraldinsight.com/0022-0418.htm

Introduction

For the last three years Ciber has been at the forefront of digital information user research. During this period we have accrued millions of digital fingerprints from information platforms as diverse as the Web, touch screen information kiosks and digital interactive television, and from fields as diverse as health, newspapers, charities and scholarly journals. These fingerprints are those of what we once called end-users, but now call, more aptly, digital information consumers in recognition of their large numbers, diverse backgrounds and economic power. We believe that the data we have collected is unparalleled in terms of its size, breadth, currency and, especially, robustness. Thus, with log analysis it has proved possible to monitor the use of a system by hundreds of thousands of people. This is a far cry from some of the assessment and monitoring methods of the recent past, which were based on small and unrepresentative samples (a few dozen “tame and pliant” OPAC users – students, academics and library users typically). Logs record use by everyone who happened to engage with the system; there is no need to take a sample. The great advantages of the digital logs are not simply their size and reach, although the dividend here is indeed a rich and unparalleled one. Just as important is the fact that they are a direct and immediately available record of what people have done: not what they say they might, or would, do; not what they were prompted to say; not what they thought they did (the traditional domain of questionnaires and focus groups). This is especially important in an area, like digital information use, where issues are complex and people are all too easily shoehorned into answers manufactured by researchers.

We “place” and explain the data through the use of questionnaires, interviews and observation, an approach we call “triangulation”.

All this means that we can provide big and rich pictures of information seeking behaviour that have probably not been seen before. Some of these pictures might make us reconsider what we have discovered from previous studies of information seeking behaviour undertaken in not so sophisticated a digital information environment and employing less robust, purely qualitative methods. Certainly, some of the data we have found challenges conventional wisdom. Indeed, we believe we are witnessing a paradigm shift, possibly shifts.

We must remember that what we are evaluating is not the use of a limited choice/option bibliographic system by intermediaries but the use of full-text consumer systems, which offer massive choice and high levels of interactivity, by end-users of every possible ilk. The really pressing issue is surely to determine what users do when they are given so much freedom and choice and how this manifests itself in information seeking terms. This is, in part, what we set out to do here.

Aims

The aim of the paper is to characterise and categorise the information seeking behaviour of the digital information consumer – the general public, who use a range of digital platforms and services to meet a whole range of work, home and leisure information needs. By collating data from a number of log and questionnaire studies conducted largely into the use of consumer health information platforms (web and digital interactive television) on behalf of the Department of Health, we present some new ways of characterising information seeking behaviour of large populations of digital information users, and in so doing create a new typology of the digital user. A secondary aim of this paper is to show how this characterisation can be done using the set of techniques developed for this purpose by Ciber researchers.

Previous research

The retrieval characteristics of web users proved a focus of interest for the early researchers. Thus, Catledge and Pitkow (1995) attempted to characterise browsing strategies of World-Wide Web users by means of log file analysis. They classified browsers into three categories:

(1) Search browsing; directed search; where the goal is known.

(2) General purpose browsing; consulting sources that have a highlikelihood of items of interest.

(3) Serendipitous browsing; purely random.

They also found that hyperlinks were the most popular way for users to navigate and search for documents on the WWW.

Navarro-Prieto et al. (1999) interested themselves in cognitive strategies in Web searching. They developed four tasks to study how search strategies might be affected by search experience and search task. They identified three different general patterns of searching:

(1) Top-down strategy. A top-down strategy is when users search in a general area and then narrow down their search from the links provided until they find what they are looking for.

(2) Bottom-up strategy. The bottom-up strategy is when users look for the specific keyword that they were provided with in the instructions. This strategy was most often used by experienced participants, for the specific fact-finding searches.

(3) Mixed strategy. Many of the participants used both of the above strategies in parallel, searching for required information at the same time in multiple windows. Some of them alternated strategies, having “both in mind” during their search. This strategy was only used by the experienced participants.

Hsieh-Yee (2001), in a broad ranging article on the topic of web search behaviour, points out that researchers have identified user behaviour, search tasks, system capabilities, and search outcomes as the most important factors in determining information seeking behaviour. She presents a simple model to show that information retrieval can be distilled into three major components: system content, system capabilities, and users’ personal and environmental characteristics.

There are a whole host of surveys that point to the personal and environmental characteristics of the web user. For instance, a recent report (Gallup for the European Union, 2002) conducted by Gallup Group on behalf of the European Union to discover how the Internet is used by the public tells us that the “influence of personal profiles is striking with particular socio-demographic characteristics clearly identifying the more avid Internet users”:

. gender – men are more likely to use the Internet than women;

. age – a higher proportion of the young (15-24) group;

. education – those who have studied longer use the Internet more than those who have completed their education at an earlier age;

. locality type – people living in metropolitan areas tend to use the Internet more frequently;

. occupation – current employees most frequent users; and

. number of persons in the household – “there is an increasing relationship between the size of the household and the personal use of the Internet”.

Little research, however, appears to have been undertaken on developing a typology of digital information users or use behaviour. What work has been done has employed log data, although even here classifications have tended to be based on what surely must be thought of now as the rather dated characteristic of level of expertise (i.e. end-user v information professional), with other aspects of system usage being incidental to search behaviour characteristics and their degree of expertise. Holscher and Strube (2000), for example, investigated web searching with a view to gaining a better understanding of the types of knowledge structures that would be necessary for searchers to possess to enable effective web-based information seeking. Effects of web experience and domain-specific background knowledge were investigated by a series of search tasks. Data were gathered by Protocol Analysis, whereby each user gave instructions to a proxy searcher who carried out the operation. Successful search performance was found to require the combination of both web navigating and subject specific expertise. Although the authors did not choose to do so, it would have been possible and interesting to develop a typology of users employing these two factors. Even dividing users by just two levels – novice and expert – would have given four categories:

(1) subject and Web expert;

(2) subject and Web novice;

(3) subject expert and Web novice; and

(4) subject novice and Web expert.
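The four categories above are simply every pairing of the two expertise levels across the two factors. As a minimal sketch (the function name and label wording are illustrative assumptions, not something taken from Holscher and Strube), the enumeration could be written as:

```python
from itertools import product

# The two expertise levels named in the text.
LEVELS = ("novice", "expert")

def two_factor_typology(levels=LEVELS):
    """Return every (subject level, Web level) pairing as a labelled category."""
    return [f"subject {s} and Web {w}" for s, w in product(levels, repeat=2)]

categories = two_factor_typology()
for label in categories:
    print(label)
```

Adding a third level, such as an intermediate user, would extend the same scheme to nine categories, which hints at how quickly a typology of this kind grows.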

Kurzke (1998) undertook a study that looked at creating user profiles by monitoring the user and group activities when browsing WWW pages. The aim of this project was to improve the retrieval process using an intelligent agent “WebAssist”. Again, no attempt was made to use the profiles to categorise users. This is a pity as user profiles could have been established on the important basis of user need.

Bruce (1998) focused on how satisfied Australian academics were when they used the Internet to search for information. Data for user satisfaction was obtained in terms of numerical magnitude estimates on a one to six category rating. These in themselves could have been used to determine user profiles. However, Bruce combined the findings from this part of the study with information about such end user characteristics as training, frequency of use and expectation of success. No relationship was found between Internet training and satisfaction, or between satisfaction and frequency of use.

Coming to studies that have attempted to categorise users, Brengman et al. (2003) have conducted several surveys in the USA and Belgium to cross-culturally validate the Internet shopper lifestyle scale. They have identified four online shopping segments (tentative shoppers, suspicious learners, shopping lovers, and business users) and four online non-shopping segments (fearful browsers, positive technology muddlers, negative technology muddlers, and adventurous browsers). They have profiled Internet shoppers with regard to “their Web-usage-related lifestyle, themes of Internet usage, Internet attitude, psychographic, and demographic characteristics”.

Perhaps the only previous study to look in depth at use of an information system in terms of different user characteristics or types has been that conducted by Pomfrett (1999a, b) who, with colleagues, looked at academic journal access by lecturing staff and students from an online database, “SuperJournal”. The researchers were interested in what “features and functionality of electronic journals had real value to the academic community” (original emphasis). During the two years that SuperJournal was available, over 2,500 users registered to use it, with their usage being recorded by computer logs. Data were collected on 1,801 users who had registered by the end of August 1998 as a sample, and a cluster analysis was undertaken using the following variables:

. frequency of use;

. breadth of use, e.g. how many journals were used;

. depth of use, e.g. tables of contents, abstracts, full articles; and

. use of features.

The cluster analysis and further research identified nine different types of users. Clearly, non-users could not be considered, by nature of the data collection, but users were divided into “repeat” and “non-repeat” groups. From the resulting data users were characterised into various categories or types. Each “type” was then profiled using interviews and questionnaires to explain the factual usage data.

Repeat users were found to cluster into five groups:

(1) Enthusiastic users. This group, as the name implies, used the service frequently, accessed a large number of journals, and generally viewed full text articles. Most users in this category were social scientists.

(2) Vanilla users. This was a name coined, presumably, to describe a somewhat nondescript use pattern. The group “spread evenly across the journal clusters (and) used a moderate number of journals with moderate frequency, and viewed full text articles in just under two thirds of their sessions”.

(3) Unfulfilled users. This group accessed only one or two journals, used the system infrequently and retrieved full text in only 15 per cent of search sessions. The research team consider that this group “appeared to be using the service to check on one or two journals and were happy that, on most occasions, they did not to find anything they had to read”.

(4) Gapfillers. This group only accessed a few journals, but did so frequently, and viewed full text in about half of their search sessions. Science staff and students predominated in this group.

(5) Demand specific users. This group used a small number of journals infrequently, but requested full text in “a high percentage” of sessions – behaviour which suggests that they already had bibliographic information. Typically, these users were in the Sciences, but there were some in the Social Sciences.

Non-repeat users were found to cluster into three groups:

(1) Tourists. These users used the service, but never viewed a particular journal. “They may have looked at SuperJournal from curiosity, or because the librarian suggested it”.

(2) Lost users. This group behaved like “enthusiastic users” in the first month the service was available, but they did not return subsequently. The researchers interpret this activity as possibly having “a short-term interest, such as a student project, and then had no further interest in the journals”.

(3) Exploratory users. This group were infrequent users who seldom accessed full text. The system was said to be (probably) of insufficient relevance for their purposes.

Eason et al. (2000) also analysed the SuperJournal data, arriving at somewhat different user groups. They retained the “enthusiastic user”, “lost users”, “exploratory users” and “tourists”, but their other classifications were “focused regular”, “specialised occasional”, “restricted”, and “searchers”. The “focused regular” users are similar to Pomfrett’s “gapfillers”, accessing a small number of journals frequently. The “specialised occasional” users used the service infrequently, but when they did so, accessed a high number of journals. “Restricted users” are similar to “demand specific”, looking at a small number of journals infrequently, although their use of full text was not as extensive as that of the “demand specific” group. “Searchers” “applied search as the only or dominant mode of using Superjournal. . .most of them were social scientists” (Eason et al., 2000).

The SuperJournal analysis was said to be a “useful exercise to classify users by type” (Pomfrett, 1999a), which allowed researchers “to focus on what users really do, and explain it in terms of their individual situations”. The authors caution, however, that “a variety of factors . . . influence . . . use, and the patterns exhibit[ed] at different times and for different purposes”. They conclude, therefore, that “in order to understand user behaviour, it’s important to think multi-dimensionally about them as individuals” (Pomfrett, 1999a).

Methods

This paper draws on a number of log and questionnaire studies that have been conducted in the health field for the Department of Health that shed light on the information seeking behaviour of the digital information consumer. None of these studies were conducted with the sole purpose of characterising information seeking in the digital environment, although all attempted to do this in one way or another as part of a broader brief to examine the impact of digital health information services on the consumer. In addition, we draw on a study we are currently undertaking into digital journal use, which appears to echo some of our findings in the health field.

Log studies

Data has been taken for this paper from five log studies which together represent the usage patterns of hundreds of thousands of users.

Transaction log files were analysed for two consumer health Web sites: NHS Direct Online and SurgeryDoor. Two studies of the SurgeryDoor Web site were undertaken. The first was an analysis of log files for the 12-month period October 2001 to September 2002 (see Nicholas, 2003a, for more details on this study). A further analysis of the transaction logs for the site was undertaken for November 2000. The NHS Direct Online (NHS DO) Web logs were analysed for November 2000. (More details of this study can be found in Nicholas et al., 2002a.)

Despite the seemingly wide range of Web metrics available to plot the information seeking behaviour of the individual (see Nicholas et al., 1999 for details), a myriad of difficulties arise when generating reliable web metrics: more difficulties, in fact, than were faced by OPAC researchers, who never had to deal with robots, floating identifiers and the lack of a log-off signal, for instance. Three difficulties in particular are worth mentioning: user identification, number of pages viewed and caching. Perhaps the main problem with log files is that of user identification. Web logs provide an IP identification number to identify the user. However, the IP number cannot be traced back to an individual, only to a machine. Moreover, the extensive use of proxy servers and PPP connections means that the IP address cannot be assumed to relate to use on a specific machine; use might also relate to a group of users, rather than an individual. This makes the tracking of return users, a particularly powerful metric, difficult to compute, as a user may return with a different identifying IP number or two users may share the same IP number. See Nicholas et al. (2003a) for more on this problem and attempts to minimise it.

Another major problem arises in connection with another key information seeking metric – the number of screens/pages viewed by an individual. We call this site penetration. Thus, Web sites may contain pages with a variety of content on the same page and access is by internal links on that page. This makes site penetration difficult to calculate as the user downloads several information pages with a single Web page download. Caching further complicates the situation. This occurs when a page is saved on the user’s computer and then further accesses are made to the cached page. In such cases no record of page use is made on the transaction log files and estimates of page penetration will be incorrect (underestimating it, in fact).
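To make the two metrics discussed above concrete, the following is a minimal sketch of how sessions and site penetration might be derived from raw log records. It is not the procedure used in the studies reported here: the record layout `(ip, timestamp_seconds, page)`, the 30-minute inactivity cut-off and the function names are all assumptions made for illustration. The IP number stands in for the user, with all the caveats noted above, and cached page views never reach the log, so the counts are lower bounds.

```python
from collections import defaultdict

# Assumed inactivity cut-off; Web logs carry no log-off signal, so some
# threshold has to be chosen. This value is not taken from the paper.
SESSION_GAP_SECONDS = 30 * 60

def sessionise(records, gap=SESSION_GAP_SECONDS):
    """Group (ip, timestamp_seconds, page) records into per-IP sessions.

    A new session starts whenever the same IP has been silent for more
    than `gap` seconds. Returns a list of (ip, [pages]) tuples.
    """
    by_ip = defaultdict(list)
    for ip, ts, page in records:
        by_ip[ip].append((ts, page))

    sessions = []
    for ip, hits in by_ip.items():
        hits.sort()  # order each IP's requests by timestamp
        current = [hits[0][1]]
        last_ts = hits[0][0]
        for ts, page in hits[1:]:
            if ts - last_ts > gap:
                sessions.append((ip, current))
                current = []
            current.append(page)
            last_ts = ts
        sessions.append((ip, current))
    return sessions

def site_penetration(sessions):
    """Number of pages viewed in each session."""
    return [len(pages) for _, pages in sessions]

# Hypothetical records: one IP returns after more than 30 minutes.
sample = [
    ("10.0.0.1", 0, "/home"),
    ("10.0.0.1", 60, "/asthma"),
    ("10.0.0.1", 4000, "/home"),
    ("10.0.0.2", 10, "/home"),
]
sample_sessions = sessionise(sample)
print(site_penetration(sample_sessions))
```

In this sketch the first IP yields two sessions (two pages, then one page) and the second IP yields one. Whether the returning visit is the same person is exactly what the log cannot say.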

Transaction log files were analysed for the Communicopia consumer text and video health interactive digital television (DiTV) channel available via Kingston Interactive Television (KIT) to about 10,000 households in the Hull area. The logs covered the first six months of 2002. DiTV logs are similar to Internet logs but additional data can be collected, which make them more precise and accurate. In the case of KIT, each user is allocated a unique number and this is recorded in the log files. Further, pages are not cached and the logs provide an accurate record of pages viewed.

Finally, the logs of a digital journal library (Emerald) were evaluated for a period of a month (June 2002). These logs, like those of DiTV, are generally more precise than web logs because of password access and absence of caching.

Questionnaire studies

Three questionnaire studies were mined for this paper: one of SurgeryDoor users and two of digital interactive television users of the Living Health and Bush Babies channels.

A questionnaire was hosted on the SurgeryDoor site (www.surgerydoor.co.uk)[1] for the month of November 2000, which the 2,700 subscribers to the site’s newsletter were invited to complete. However, only one-third of respondents said they were subscribers to the newsletter, implying that two-thirds were casual users and responded to the questionnaire as a result of their use of the site. People were made aware of the survey as a result of a pop-up box on a number of the site’s pages. In total, 1,068 users answered the questionnaire, or 5 per cent of the unique IP addresses that were recorded as visiting the site in November 2000 (Nicholas et al., 2001).

A questionnaire study of users of the Communicopia health channel on the Kingston Interactive Television Service was undertaken. The questionnaire asking people for information on their use patterns was sent to all potential KIT subscribers in the Hull area in February 2002. In all this reached about 10,000 households. Nearly 1,200 were returned.

A telephone survey of a sample of Sky television users who had used the Bush Babies programme, available on Channel Health, was commissioned. A random sample of 279 “Channel Health” users were contacted from this database; 28 refused to take part in the survey, leaving 251 completed interviews. The interviews were carried out between 19 and 22 April 2002.

Results

On the whole the logs and questionnaires point to a seemingly shallow, promiscuous and dynamic form of information seeking behaviour, where people never really delve deeply into any one site, move from site to site with some alacrity, and only occasionally return to a site. We have termed this form of information seeking behaviour “bouncing” or “flicking”, because that is what it looks like in the logs. Despite the volatility of information seeking it is also highly patterned. The constantly changing browsing frames created by search engines (partly caused by the constant stream of new sites entering the system) and the dynamic nature of sites themselves (disporting their wares in ever-changing ways) plainly provide the conditions which give birth to this form of information seeking.

We will now consider what we believe to be the two key attributes of bouncing – shallow searching (site penetration) and promiscuity (visiting many sites and not returning to them). Clearly, these attributes are related to one another.

Site penetration

A typical website, like that of a newspaper or health service, might contain hundreds, if not thousands, of pages; however, all our research shows that during a visit people are unlikely to view more than a very few of these pages. To illustrate this, Figure 1 shows the percentage distribution of page requests made in a session for the SurgeryDoor health web site covering the 12-month period October 2001 to September 2002. During this period the site was visited by 381,704 separate IP addresses and 3,680,453 pages were viewed (excluding declared robots). The figure shows how much use has been made of the site and how deeply (or shallowly) users penetrated it. It also shows how active or busy (or, possibly, confused in the case of sessions where little is viewed) they were when online. Approximately three quarters (74 per cent) of all visits featured three or fewer page views (and 43 per cent viewed only one page), 20 per cent saw between four and ten pages, and 6 per cent saw more than ten pages. Of course, site architecture and the caching of pages to the user’s machine create problems when it comes to estimating the number of pages viewed by a user on the Internet.

Figure 1. Percentage distribution of page views (grouped) in a session for the SurgeryDoor Web site
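A grouped percentage distribution of the kind quoted above can be computed from a list of per-session page counts. The sketch below is illustrative only: the band edges follow the groupings mentioned in the text (one page, two to three, four to ten, more than ten), while the function names and the sample data are assumptions and do not reproduce the SurgeryDoor figures.

```python
from collections import Counter

def group_page_views(n):
    """Map a session's page-view count to one of four assumed bands."""
    if n <= 0:
        raise ValueError("a session must contain at least one page view")
    if n == 1:
        return "1"
    if n <= 3:
        return "2-3"
    if n <= 10:
        return "4-10"
    return "over 10"

def percentage_distribution(page_views_per_session):
    """Return {band: percentage of sessions falling in that band}."""
    counts = Counter(group_page_views(n) for n in page_views_per_session)
    total = sum(counts.values())
    return {band: 100.0 * c / total for band, c in counts.items()}

# Hypothetical per-session page counts, not data from the study.
example = percentage_distribution([1, 1, 2, 3, 5, 12, 1, 1])
print(example)
```

Summing the "1" and "2-3" bands gives the share of sessions with three or fewer page views, the headline figure reported in the text.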

Figure 2 provides more detail on the brevity and shallowness of the user’s visit for two health information Web sites, by showing how many of the pages viewed were actually information pages. It shows the distribution of pages viewed by single session one-page users only. What we are interested in is whether users arrived at an information page, and hence pages have been classified into three groups: an information page, a menu page and the home directory. If the user loaded either a menu or home directory page then the user has not accessed an information page and has left without accessing any significant information from the site, other than negative information (i.e. it was not relevant or useful to them). These users can be thought of as the stereotypical “bouncers” – they have bounced in and out of the site without having accessed an information page.

Turning our attention more closely to these “bouncers”, 61 per cent of NHS Direct single session one-page users (accounting for 8.5 per cent of all session accesses) and 33 per cent of SurgeryDoor users (13.9 per cent of all session accesses) viewed the opening menu/home directory screen and left without accessing any further pages (Figure 2). These users did not access an information page; they have, as it were, voted with their feet and left without further checking the site.

Figure 2. The distribution of pages viewed by single session one-page users only
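The classification described above lends itself to a short sketch. It assumes each session has already been reduced to a list of page types; the three labels mirror the groups named in the text, but the function names, the simplification from "single session users" to individual sessions, and the sample data are assumptions for illustration only.

```python
# Page-type labels mirroring the three groups named in the text.
INFORMATION, MENU, HOME = "information", "menu", "home"

def is_bouncer(page_types):
    """True for a one-page session whose only page was a menu or home page."""
    return len(page_types) == 1 and page_types[0] in (MENU, HOME)

def bouncer_shares(sessions):
    """Return (per cent of one-page sessions, per cent of all sessions) that bounced."""
    one_page = [s for s in sessions if len(s) == 1]
    bouncers = [s for s in one_page if is_bouncer(s)]
    share_of_one_page = 100.0 * len(bouncers) / len(one_page) if one_page else 0.0
    share_of_all = 100.0 * len(bouncers) / len(sessions) if sessions else 0.0
    return share_of_one_page, share_of_all

# Hypothetical sessions, each a list of page types in viewing order.
demo_sessions = [[HOME], [INFORMATION], [MENU], [HOME, INFORMATION]]
demo_shares = bouncer_shares(demo_sessions)
print(demo_shares)
```

Reporting both shares matters: a high bounce rate among one-page sessions can still amount to a small share of all sessions, as the NHS Direct figures above illustrate.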

Clearly, there is a big difference between the two sites and the likely explanation is the “digital visibility” of each service. Digital visibility says something about the positioning of the service within the electronic environment (Nicholas et al., 2002b). For the Web this visibility is partly defined by the sites’ “visibility” on search engine directory sites like Yahoo and the sites’ positioning on the list of sites returned by search engines in response to a user-entered search query. The NHS positioning on the returns was poor at the time of our research. This was discovered by looking at referrer log information. It was found that nine of the top 20 search terms[2] used incorporated NHS in the search expression. This seems to indicate that the most popular way of linking to the NHS Direct Online website via a search engine was by typing NHS somewhere in the search expression. In the main, users did not find the NHS by typing in their medical condition/problem but found it by first realising that there was an NHS site. Hence, many users only found the site by including “NHS” rather than a medical term within their search expression (cite kiosk report). Searchers who did not include NHS in their search expression were unlikely to be offered the NHS site within the first two or three pages of “hits” returned by a search engine. Thus fewer users arrived at the NHS site via a search engine since this was further down the returned search engine list. People use this list to bounce from site to site; hence NHS DO attracted fewer bouncer hits compared to its commercial competitors (SurgeryDoor). Furthermore, when users arrived (those that included NHS in their search term) they were “more” likely to arrive at a home directory page rather than a content page.
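The referrer analysis described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the query parameter names in `QUERY_KEYS`, the example URLs and the function names are hypothetical, and real search engines encode queries in varying ways. The keyword test is a plain substring match, which is crude but enough to show the idea.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed query-string parameter names; search engines differ.
QUERY_KEYS = ("q", "p", "query")

def search_expression(referrer):
    """Extract the lower-cased search expression from a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for key in QUERY_KEYS:
        if key in params:
            return params[key][0].strip().lower()
    return None

def top_terms_with_keyword(referrers, keyword, n=20):
    """Return the n most common search expressions and how many contain `keyword`."""
    counts = Counter(e for e in map(search_expression, referrers) if e)
    top = [expr for expr, _ in counts.most_common(n)]
    return top, sum(keyword.lower() in expr for expr in top)

# Hypothetical referrer strings, not taken from the NHS Direct Online logs.
demo_referrers = [
    "http://search.example/search?q=NHS+Direct",
    "http://search.example/search?q=nhs+direct",
    "http://search.example/search?p=asthma",
    "http://other.example/page",
]
demo_top, demo_hits = top_terms_with_keyword(demo_referrers, "NHS")
print(demo_top, demo_hits)
```

A result such as nine of the top 20 expressions containing the keyword would correspond to the finding reported above.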

This was additionally confirmed by the finding that SurgeryDoor had proportionally more home users. These users are thought to be more likely to use a search engine to find (health) sites. SurgeryDoor had more home users because the NHS site had poor digital visibility on search engine returns and people use the list returned by search engines to bounce between sites. The NHS site was just too far down that list for people to reach it when bouncing between sites.

One implication of the bouncing/flicking kind of information seeking behaviour is, then, that the home page and the individual page landed at play a very important role and are crucial to whether or not someone decides to go on and view pages within the site. We have some research to show how important this is and, again, it all relates to digital visibility. While evaluating the logs of the NHS Direct health channel on Kingston Interactive Television (Nicholas et al., 2002b) it became clear that the channel was losing viewers over the four-month period in which it was shown. Furthermore, the decline was not a gradual one but was characterised by a number of big and abrupt falls and these falls coincided with a number of changes to its positioning on the KIT service. At each change the service became more remote from the home page and, consequently, less visible. It transpired that the major impact was on new customers. New users were not coming through because of the increasing difficulty of finding the service. However, those people who had found the service showed their tenacity by making more extensive use of the channel when they arrived.

Considering the evidence presented in this paper, it appears inescapable that the positioning of services within an electronic environment, be they pages on the web, digital TV channels or on stand-alone computer terminals, is a vital component of usage. Content may still be king – but if that content cannot be accessed its quality, relevance and presentation are as good as wasted.

Promiscuity

Promiscuity results from consumers being provided with choice. In information seeking terms it manifests itself in two ways. First, people visit a number of sites to find what they want. Second, and this is related, they do not return to sites they once visited.

Number of sites visited. Figure 3 provides data on the number of sites visited, in this particular case, health sites. The logs do not furnish this type of data; instead, we use data from an online questionnaire hosted on the SurgeryDoor Web site[1] for the month of November 2000. In total 1,068 users answered the questionnaire, which represented 5 per cent of the 21,118 visitors (as denoted by unique IP addresses) to the site in November 2000. It shows that the vast majority of people (71 per cent) said they visited two or more sites, 29 per cent visited three to five sites and 11 per cent visited five or more. Clearly, those who used just one site were heavily in the minority. And, of course, this is likely to be an underestimation of the number of sites visited, as users were unlikely to remember sites which they did not find useful, or which they visited long ago and have not returned to since.

Figure 3. Number of health sites visited

A questionnaire study of health channel (Communicopia) viewers on theKingston Interactive Television Service pointed to the general informationseeking behaviour that results in people searching a number of sites in pursuitof information. Viewers of the service were asked their reasons for using it.Browsing for health information proved by far to be the most popular reasonover two-thirds (68 per cent) of users reported browsing as a reason for use(Figure 4).

This result was further backed-up by a study of users of a digital interactivetelevision programme on pregnancy (Bush Babies) on the Channel Healthservice on Sky TV. A total of 68 per cent of Bush babies respondents had justfound it by browsing while 14 per cent saw an on screen promotion (Figure 5).

Browsing, time and time again in our studies, has proved to be the main method of obtaining information.

There is evidence from the SurgeryDoor survey to suggest that the younger the users, the more likely they are to exhibit promiscuous behaviour. Thus, Figure 6 shows the distribution of sites visited by age. Of those aged under 34, 40 per cent visited three or more sites, compared to 41 per cent of those aged 35 to 54 and 22 per cent of those aged 55 and over. Those aged 55 and over were more likely to visit just one site.

The same survey provides another explanation of why so many sites are being visited. Respondents were asked to rate SurgeryDoor in regard to breadth and depth of content and trust in the information. The number of health sites visited was found to be correlated with a score over the three attributes derived by factor analysis. Importantly, a relationship was found between the respondent’s score in regard to content and the number of health sites visited. As the number of health Web sites that the user visited increased, so the user’s rating of content depth, breadth and trust declined (Figure 7). This suggests that users who visit a number of sites are not so worried about the content attributes of an individual site, as these attributes are maximised by visiting a number of sites. Alternatively, they realised that all sites lack the content attributes they require and that content attributes can only be maximised by visiting many sites. What we might be seeing is the kind of channel-flicking behaviour with the remote that children exhibit while watching television.

Figure 4. For what purpose did you use the service

Figure 5. What made you first use Bush Babies

Figure 6. Age of respondent (grouped) and number of health web sites visited (grouped)

By comparing information seeking behaviour on different digital platforms we can get some further insights into promiscuity. Take the case of health information channels on digital interactive television (DiTV), where there is not so much choice and for which there are no search engines to stimulate source bounce. Instead, users are forced to move around pre-selected menus and individual pages. In such circumstances the question is whether a frenetic form of behaviour still manifests itself. In other words, would DiTV users flick between pages in the way that Internet users flick between Internet sites? Our research has shown that, indeed, this is the case. A total of 33 per cent of DiTV users viewed 21 pages or more in a search session and 50 per cent viewed 11 pages or more. This is high volume use, far in excess of that expected (compared to other platforms), and users must be viewing more pages than they would necessarily need in order to discover what they need (Huntington et al., 2002).
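As an illustration only, the share of sessions reaching a given page-view threshold (such as the 11 and 21 page cut-offs quoted above) might be computed from session logs along the following lines. This is a minimal sketch, not the procedure used in the study: the function name `share_at_least` and the sample page counts are hypothetical, and it assumes the logs have already been reduced to one page-view count per session.

```python
from typing import Iterable


def share_at_least(pages_per_session: Iterable[int], threshold: int) -> float:
    """Return the percentage of sessions with at least `threshold` page views."""
    counts = list(pages_per_session)
    if not counts:
        raise ValueError("no sessions supplied")
    hits = sum(1 for n in counts if n >= threshold)
    return 100.0 * hits / len(counts)


# Hypothetical page-view counts for ten sessions (not data from the study).
sessions = [3, 25, 12, 40, 7, 11, 22, 5, 15, 9]

for threshold in (11, 21):
    share = share_at_least(sessions, threshold)
    print(f"{threshold} pages or more: {share:.0f} per cent of sessions")
```

How a "session" is delimited in the raw log (for example, by a period of inactivity) is a separate decision that this sketch leaves open.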

Returnees. Coming back to a site constitutes conscious and directed use, or at least as good an approximation of this as you are likely to get from web logs. A service with a high percentage of returnees can be regarded as having a “brand” following, which is what all service providers aspire to. All this makes return visits a powerful performance (and, possibly, quality) indicator. The industry calls this “site stickiness”. Loyalty, or repeat behaviour, however, appears not to be a trait of the digital information consumer. A study of the SurgeryDoor web site (Nicholas et al., 2003b), which allowed for the workings of proxy servers and floating IP addresses, found that over a relatively long period of 12 months two-thirds of visitors never returned, with 33 per cent visiting the site two to five times. Plainly, it is difficult to develop repeat behaviour in these circumstances.

Figure 7. User scores on content attributes (breadth, depth and trust) over how many sites visited for health information

Even in the case of a relatively stable information environment, a digital library (the Emerald journal database), albeit on a dynamic platform (the Web), we found evidence of the same phenomenon (Nicholas et al., 2003a). In a study, albeit only covering a month, it was found that nearly nine in ten users visited the site just once in the survey period. Approximately 10 per cent of users visited between two and five times and half a per cent visited six or more times. And these were academics, who you might expect to have ingrained repeat behaviour because of the current awareness function that typically comes with the job, certainly for researchers. We are working on a larger data set to see if this is true over a longer time period.
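The grouping of visitors into those who came once, two to five times, or six or more times can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the method used in the studies cited: the function name `returnee_profile` and the sample identifiers are hypothetical, each log entry is assumed to represent one visit, and the visitor identifier is assumed to be reliable (real logs need adjustment for proxy servers and floating IP addresses, as noted above).

```python
from collections import Counter
from typing import Dict, Iterable

BANDS = ("once", "two to five", "six or more")


def returnee_profile(visits: Iterable[str]) -> Dict[str, float]:
    """Percentage of visitors falling in each visit-frequency band.

    `visits` holds one visitor identifier per visit in the survey period.
    """
    visits_per_visitor = Counter(visits)
    if not visits_per_visitor:
        raise ValueError("no visits supplied")

    band_counts = Counter()
    for n in visits_per_visitor.values():
        if n == 1:
            band_counts["once"] += 1
        elif n <= 5:
            band_counts["two to five"] += 1
        else:
            band_counts["six or more"] += 1

    total = len(visits_per_visitor)
    return {band: 100.0 * band_counts[band] / total for band in BANDS}


# Hypothetical visit log: one identifier per visit (not data from the study).
visits = ["a", "b", "b", "c", "d", "d", "d", "d", "d", "d", "e"]
print(returnee_profile(visits))
```

The length of the survey period matters a great deal here: a visitor who appears once in a month may well be a returnee over a year, which is why the one-month and 12-month figures above are not directly comparable.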

Clearly, it seems consumers are casting aside the old in pursuit of the new. The massive churn rate of information sources, combined with the scattergun (and serendipitous) characteristic of retrieval by search engines (forever providing different glimpses of the information environment), creates the very conditions for them to do this. In fact, even in the case of very practised web users there is anecdotal evidence to suggest that they set off on different information tracks every time they search, jettison resources as they go and soon lose track of where they have come from. There appears to be little of the build that was associated with traditional commercial online searching. People start from scratch every time they search. The new is exciting; discovery is the key information seeking driver.

There is possibly another explanation, a media one: the Web, after all, owes far more to the media world than to the retrieval or systems world. There are only a few films you would like to see again, and this goes for websites as well. This leads to an obvious difficulty in the e-commerce field, where people are always looking for the next shop.

Discussion

Let us now try to explain the kind of information seeking behaviour that we are beginning to sketch. As we have already suggested, what we are probably seeing is what happens when people are presented with massive and increasing choice, which they have to make themselves, and quickly. The traditional library-driven user of the not so distant past relied on the library for (limited) choice, and for that stamp of quality or authority. The assumption was that if it was in the library it was good and, anyway, the choice was largely made for you because the intermediary conducted the search. Today, most people search for themselves and search from non-library or unevaluated information environments, most obviously seen in the health field. In consequence, they are forced to make the evaluations once made by librarians and, with so much choice and new products coming on stream, they have to make many, many evaluations. They largely do this with the help of a search engine, on the basis of long experience of searching the Web, practice in making constant comparisons and a process of trial and error. The phrase “we are all librarians now” is a particularly apt one.

We all know that this has happened, but few of us have the data to show what has actually occurred as a result, and the logs and associated questionnaires show us this all too clearly. People’s information seeking behaviour in these circumstances can best be described from the logs as one of flicking, bouncing or surfing. These people can also be viewed as consumer “checkers” or “evaluators”. Evaluation is largely undertaken by making comparisons. Evaluation is a key element of digital literacy. To stay afloat in the ever-expanding digital environment you need to evaluate, and evaluate well. The Web provides a huge opportunity to suck it and see, and this of course is a form of evaluation. Evaluation is not only made on the basis of content, but also on the basis of authority, access, design, currency and interactivity, to name only the most important.

Information professionals viewing such behaviour should not be misled into believing that this is a dumbed down form of information searching and retrieval, that people cannot make up their minds or that they are obtaining just a thin veneer of information. One is reminded of the father watching his young daughter, who is using the remote to flick from one television channel to another. A slightly irritated father asks his daughter why she cannot make up her mind and she answers that she is not attempting to make up her mind but is watching all the channels. She, like our bouncers, is gathering information horizontally, not vertically. The single authoritative source, which is always returned to and deeply mined, seems to be a thing of the past. Loyalty might be a thing of the past. Sood (2002) points out that locking in customers is becoming more difficult, and believes it likely that, before too long, the adage “the customer is king” will be transformed into “the promiscuous customer is king”.

Choice, combined with digital visibility, is causing huge volatility in a market that not so long ago was known for its unchanging or slowly changing nature. The information world appears to have been turned upside down. This has been brought home to us in no uncertain terms in our very latest work connected to digital libraries and scholarly journals (unpublished). When comparing the number of downloads made in 2001 and 2002 by 91 universities to a digital journal library, what strikes you most is the sheer volatility and unpredictability of the data: rises and falls in the order of 50 per cent or more are commonplace (Table I)[3]. There appears to be no such thing as slow growth or, indeed, slow decline. In such circumstances there is no choice, if you want to understand information seeking behaviour, but to monitor the logs and triangulate this data.
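The year-on-year comparison behind Table I amounts to computing a percentage change in downloads for each institution and placing it in a movement band. The following is a minimal sketch of that arithmetic, not the procedure actually used: the function names, institution labels and download counts are hypothetical, and the treatment of boundary cases (changes of exactly 100 per cent, fractional changes between bands, and changes smaller than 1 per cent) is an assumption, since Table I does not specify it.

```python
def percent_change(before: float, after: float) -> float:
    """Percentage change in downloads from the first period to the second."""
    if before <= 0:
        raise ValueError("the first-period count must be positive")
    return 100.0 * (after - before) / before


def movement_band(change: float) -> str:
    """Place a percentage change in a band modelled on those of Table I.

    Assumptions: a change is banded by the largest lower bound it reaches,
    so 74.5 falls in the 50-74 band; changes of 100 or more share one band;
    non-zero changes below 1 per cent fall in the 1-24 band.
    """
    if change == 0:
        return "no change"
    sign = "+" if change > 0 else "-"
    size = abs(change)
    if size >= 100:
        return f"{sign}100 or more"
    for lower, upper in ((75, 99), (50, 74), (25, 49)):
        if size >= lower:
            return f"{sign}{lower}-{upper}"
    return f"{sign}1-24"


# Hypothetical download counts for two periods (not data from the study).
downloads = {
    "University A": (1200, 1900),
    "University B": (800, 560),
    "University C": (300, 640),
}

for name, (before, after) in downloads.items():
    change = percent_change(before, after)
    print(f"{name}: {change:+.0f} per cent, band {movement_band(change)}")
```

Tabulating the share of institutions in each band then gives a distribution of the kind shown in Table I.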


Conclusions

What we have done in this article is to present some new ways of determining and characterising information seeking behaviour in the digital environment. As a result, we have highlighted a form of information seeking behaviour born out of the massive choice that is on offer to the information seeker in the modern digital environment. This type of behaviour should make information providers of all types question their assumptions as to how they think the end-user behaves, because there is a sense that some providers are working on models born of a different age. This is, then, not the time to build dedicated, “accredited” systems, even if they boast perfect content; nor, probably, gateways. How can digital information consumers recognise that a source is perfect if they do not have access to other ones to compare and benchmark it against? There is a need for lesser or irrelevant ones. Far better to buy sponsored links in Google, and use other forms of Internet advertising.

We have used the consumer health field and two digital platforms (web and digital interactive television) to illustrate how this can be best done for large populations of users. However, the techniques are valid for all fields. We believe our work is unique in that it deals with more than one digital platform and that we have been dealing with the general public. However, our work can be seen to build on the categorisation of digital users that the SuperJournal study attempted in an academic and very much test environment. We have built our models on the back of day-to-day use on open and public systems. Our work also updates some of the ideas promulgated in that study; the research conducted for SuperJournal took place around five years ago.

Not surprisingly then, information seeking behaviour follows the architecture of distributed information sources on the Internet. The Internet is not only a major piece of technology, it has given birth to a cognitive model of obtaining knowledge (largely one of trial and error). Traditional information services are generally recognised as being static, and electronic services seen as dynamic. Now we see what the migration from traditional to electronic sources has meant in information seeking terms. We are all bouncers and flickers, and the success of Google is a testament to that, with its marvellous ability to enhance and amplify this flicking and bouncing (like a really good remote). The analysis of the searching behaviour of digital consumers tells us much more than that: it also shows us how people develop knowledge. Digital information consumers look for opinions before they make judgements; they want to make judgements themselves, not only follow those of so-called authorities (and this has tremendous implications for the concept of information quality). In the past, information seeking was seen to be the first step to creating knowledge. Now, it is no longer just a first step; it is a continuous process. Today, we probably ask the same question continuously, but of course against a constantly changing and evolving source list.

Table I. Volatility in digital journal use: per cent change in use by universities over a one year period (2001/2002 to 2002/2003)

Movement (%)      % change
Increase
  More than 100   11
  +75-99           2
  +50-74           3
  +25-49          16
  +1-24           29
Decrease
  -1-24           24
  -25-49          10
  -50-74           3
  -75-99           1
Average           17

Notes

1. 3W Marketing Ltd was responsible for gathering the information, but not its analysis.

2. The top 20 accounted for 60 per cent of all terms used to find the main server site.

3. Based on downloads in the periods August 2001/January 2002 and August 2002/January 2003 by 91 universities.

References

Brengman, M., Geuens, M., Weijters, B., Smith, S.M. and Swinyard, W.R. (2003), “Segmenting Internet shoppers based on their Web-usage-related lifestyle: a cross-cultural validation”, Journal of Business Research (in press).

Bruce, H. (1998), “User satisfaction with information seeking on the Internet”, Journal of the American Society for Information Science, Vol. 49 No. 6, pp. 541-56.

Catledge, L.D. and Pitkow, J.E. (1995), “Characterizing browsing strategies in the World-wide Web”, Computer Networks and ISDN Systems, Vol. 27, pp. 1065-73, available at: www.igd.fhg.de/www/www95/papers/80/userpatterns/UserPatterns.Paper4.formatted.html.

Eason, K., Richardson, S. and Yu, L. (2000), “Patterns of use of electronic journals”, Journal of Documentation, Vol. 56 No. 5, pp. 477-504.

Gallup for the European Union (2003), FLASH Eurobarometer No. 135: Internet and the Public at Large, Vol. 31, available at: http://europa.eu.int/comm/public_opinion/flash/fl135_en.pdf (accessed 31 July 2003).

Holscher, C. and Strube, G. (2000), “Web search behavior of Internet experts and newbies”, Computer Networks, Vol. 33 No. 1-6, pp. 337-46.

Hsieh-Yee, I. (2001), “Research on Web search behavior”, Library & Information Science Research, Vol. 23, pp. 167-85.

Huntington, P., Nicholas, D., Williams, P. and Gunter, B. (2002), “Comparing two digital consumer health television services using transactional log analysis”, Informatics in Primary Care, Vol. 10, pp. 147-59.

Kurzke, C., Galle, M. and Bathelt, M. (1998), “WebAssist: a user profile specific information retrieval assistant”, WWW7/Computer Networks, Vol. 30 No. 1-7, pp. 654-5.

Navarro-Prieto, R., Scaife, M. and Rogers, Y. (1999), Cognitive Strategies in Web Searching, available at: http://zing.ncsl.nist.gov/hfweb/proceedings/navarro-prieto/index.html


Nicholas, D., Huntington, P. and Williams, P. (2002a), “Evaluating metrics for comparing the use of web sites: case study two consumer health web sites”, Journal of Information Science, Vol. 28 No. 1, pp. 63-75.

Nicholas, D., Huntington, P. and Watkinson, A. (2003a), “Digital journals, big deals and online searching behaviour: a pilot study”, Aslib Proceedings, January/February (in press).

Nicholas, D., Huntington, P. and Williams, P. (2003b), “Micro-mining log files: a method for enriching the data yield from Internet log files”, Journal of Information Science, No. Summer (in press).

Nicholas, D., Huntington, P., Williams, P. and Gunter, B. (2001), “Delivering consumer health information digitally: platform comparisons”, paper presented at the International Online Conference, Olympia December 2001. Learned Information Limited, Oxford.

Nicholas, D., Huntington, P., Williams, P. and Gunter, B. (2002b), “Digital visibility: menu prominence and its impact on use of the NHS Direct information channel on Kingston Interactive Television”, Aslib Proceedings, Vol. 54 No. 4, pp. 213-21.

Nicholas, D., Huntington, P., Williams, P., Lievesley, N. and Withey, R. (1999), “Developing and testing methods to determine the use of web sites: case study newspapers”, Aslib Proceedings, Vol. 51 No. 5, pp. 144-54.

Pomfrett, S. (1999a), “Types of electronic journal users”, SuperJournal, available at: http://irwell.mimas.ac.uk/sj/confpomfret.htm

Pomfrett, S. (1999b), “Home page”, paper presented at the SuperJournal Conference, Birkbeck College, 21 April, available at: http://irwell.mimas.ac.uk/sj/confpomfret.htm

Sood, R. (2002), “Making promiscuity pay”, available at: www.thestreet.com/pf/comment/connectingdots/1014698.html (accessed 11 February 2002).

Further reading

The Digital Divide (n.d.), available at: www.brookes.ac.uk/schools/apm/publishing/culture/etext/divide.htm#1 (accessed 31 July 2003).
