Crunching Numbers: OPAC Log Analysis of WebVoyage
Bennett Claire Ponsford
Digital Services Librarian
Texas A&M University Libraries
EndUser 2007 Session 34
Overview
Why analyze your log files?
How to do it
What we found
The changes we made
What do the latest logs say?
Improvements needed
Why Analyze?
To see how your users search when you’re not watching
To resolve internal disagreements over default searches, limits, etc.
To see whether changes to WebVoyage really improved search results
As a counterpoint to task-based user testing
C.S. Lewis
Lewis, C. S. (Clive Staples)
LION WITCH?
LION, WITCH?
LION, WITCH, AND WARDROBE?
Lewis, C. S. (Clive Staples)
Issues to Think About
Does Voyager capture the data you need?
  Privacy concerns
Does your network organize data the way you need?
  Staff vs. public IP addresses
Do you want all searches or a sample?
How To
Read the documentation
  Technical Manual, Chapter 15, Popacjob
Begin logging your data
Extract data into Access database
Clean up data as needed
Run queries
Scratch head and contact Tech Support
5-MAR-07 WebOpac 20061016102711 Title keyword (TKEY New) AND (TKEY York) AND (TKEY Times) Y West Campus Library K Y N 7 1 W 999.999.99.999 AMDB20020820112825 N
Data Fields
Search_date    5-MAR-07
Stat_string    WebOpac
Session_id     20070305123912
Search_type    Title keyword
Search_string  (TKEY New) AND (TKEY York) AND (TKEY Times)
Data Fields (cont.)
Limit_flag     Y
Limit_string   West Campus Library
Index_type     K
Relevance      Y
Hyperlink      N
Hits           7
Data Fields (cont.)
Search_tab     1
Client_type    W
Client_ip      XXX.XXX.XX.XXX
Dbkey          AMDB20020820112825
Redirect Flag  N
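The field layout above can be turned into a small parser before the data is loaded anywhere. This is a minimal sketch only: it assumes the log is exported as tab-delimited text (the slides do not state the delimiter), and `FIELDS` and `parse_line` are hypothetical names introduced for illustration.

```python
# Field order follows the "Data Fields" slides.
# ASSUMPTION: one record per line, fields separated by tabs.
FIELDS = [
    "search_date", "stat_string", "session_id", "search_type",
    "search_string", "limit_flag", "limit_string", "index_type",
    "relevance", "hyperlink", "hits", "search_tab", "client_type",
    "client_ip", "dbkey", "redirect_flag",
]


def parse_line(line, delimiter="\t"):
    """Split one log line into a dict keyed by field name."""
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    record["hits"] = int(record["hits"])  # convert the hit count to a number
    return record


# Sample record built from the values shown on the slides.
sample = "\t".join([
    "5-MAR-07", "WebOpac", "20070305123912", "Title keyword",
    "(TKEY New) AND (TKEY York) AND (TKEY Times)", "Y",
    "West Campus Library", "K", "Y", "N", "7", "1", "W",
    "XXX.XXX.XX.XXX", "AMDB20020820112825", "N",
])
record = parse_line(sample)
```

Raising an error on a wrong field count is a deliberate choice here: a search string that itself contains the delimiter would otherwise shift every later field silently.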
SQL for Count of Search Type
SELECT [Spring 2007 OPAC log].Search_type,
       Count([Spring 2007 OPAC log].Search_type) AS CountOfSearch_type
FROM [Spring 2007 OPAC log]
WHERE [Spring 2007 OPAC log].Client_type = "W"
  AND [Spring 2007 OPAC log].Search_tab = "1"
  AND [Spring 2007 OPAC log].Hyperlink = "N"
  AND [Spring 2007 OPAC log].Client_ip Not Like "128.194.8[4-7].*"
  AND [Spring 2007 OPAC log].Client_ip Not Like "165.91.39.*"
GROUP BY [Spring 2007 OPAC log].Search_type
ORDER BY [Spring 2007 OPAC log].Search_type;
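For readers working outside Access, the same filter-and-count logic can be sketched in Python. This is an illustration rather than the method used in the talk: the records are assumed to be dicts keyed by lower-cased field names, `is_staff_ip` and `count_search_types` are hypothetical helpers, and the sample records are made up. The two IP patterns are copied from the query; `fnmatchcase` happens to read `*` and `[4-7]` the same way the `Like` patterns are used here.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Staff IP patterns copied from the Access query.
STAFF_IP_PATTERNS = ["128.194.8[4-7].*", "165.91.39.*"]


def is_staff_ip(ip):
    """True if the address matches any staff pattern."""
    return any(fnmatchcase(ip, pattern) for pattern in STAFF_IP_PATTERNS)


def count_search_types(records):
    """Count typed-in web searches on tab 1 from non-staff addresses."""
    counts = Counter()
    for r in records:
        if (r["client_type"] == "W" and r["search_tab"] == "1"
                and r["hyperlink"] == "N"
                and not is_staff_ip(r["client_ip"])):
            counts[r["search_type"]] += 1
    return dict(sorted(counts.items()))  # alphabetical, like ORDER BY


def make(search_type, ip, tab="1", hyperlink="N", client="W"):
    """Build a made-up record for the example below."""
    return {"search_type": search_type, "client_ip": ip, "search_tab": tab,
            "hyperlink": hyperlink, "client_type": client}


# Illustrative records only, not data from the talk.
records = [
    make("Title keyword", "10.0.0.5"),
    make("Title keyword", "128.194.85.20"),   # staff range, excluded
    make("Author keyword", "10.0.0.6"),
    make("Title keyword", "10.0.0.7"),
    make("Keyword", "10.0.0.8", tab="2"),     # wrong tab, excluded
    make("Subject Browse", "10.0.0.9", hyperlink="Y"),  # hyperlink, excluded
]
counts = count_search_types(records)
```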
Results
Author Browse: 334
Author headings: 943
Author keyword: 1587
Builder: 6
Call Number Browse: 777
Command: 16
Documents Call Number: 15
Expert keyword (rel): 5
Expert keyword (rev): 2378
Journal title keyword: 793
Journal title: 1375
Keyword: 82
Keyword (Relevance): 6179
Keyword Search: 1574
LC Call Number browse: 14
Locally Assigned Call#: 2
Simple Search: 58
Subject Browse: 851
Subject headings: 128
Subject Hds keyword: 251
Title keyword: 2045
Title Redirect: 615
Title starts with: 1677
September 2006 (Voyager 5)
Changed interface
Defaults
  Kept tab at Simple Search
  Changed search to Keyword (CMD* with JavaScript)
  Changed result sort to relevance
Fall 2006
Preparing to upgrade to Voyager 6.1
  New keyword searches with ^ to automatically AND words together
Some people unhappy with recent changes
  Default search
  Search results sort order
Decided to look at the data
Decisions upgrading to V6
Basic data
  Where are our searchers?
  What search tab are they using?
  How are they searching?
Default search
Order of title searches
Simple limits
What Search Tab Used?
[Bar chart: percentage of searches (0% to 100%) by search tab: Simple (1), Keyword (aka Builder - 2), Course Reserves (3). Series: Inside Libraries, Outside Libraries.]
Default Search: Discussion
Title search (TALL)
  What we traditionally had used
  Reference’s preference
General keyword search (new GKEY^*)
  What users are used to in a Google world
  More forgiving search
First Title Search: Discussion
Left-anchored title (TALL)
  Preferred by Reference
Title keyword (new TKEY^*)
  More forgiving
Analysis of Voyager 6 Logs
Search frequencies
No hit frequencies
Title search problems
Journal title search hits
Search limits
Keyword and Subject Searches
[Bar chart: percentage of searches (0% to 30%) by search type: Expert keyword, Keyword, Subject browse, Subject keyword. Series: Inside Libraries, Outside Libraries.]
Author Searches
[Bar chart: percentage of searches (0.0% to 8.0%) by search type: Author Browse, Author headings browse, Author keyword. Series: Inside Libraries, Outside Libraries.]
Title Searches
[Bar chart: percentage of searches (0% to 30%) by search type: Journal keyword, Journal title, Title keyword, Title Redirect Keyword, Title. Series: Inside Libraries, Outside Libraries.]
No Hit Title Searches: Do We Own Them?
[Bar chart: percentage of no-hit title searches (0% to 50%) by outcome: Yes, No, Unable to verify, Other. Series: Keyword, Left-Anchored (preliminary).]
Location Limits Used
[Bar chart: number of searches (0 to 600) using each location limit: Media Services, Qatar, Cushing, Bestsellers, West Campus, Reference Coll., Curriculum Coll., Web Resources.]
No Hit Percentages
Some improvement but no major change
[Bar chart: no-hit percentage (0% to 30%) for Summer 2006, Fall 2006, and Spring 2007.]
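The measure behind this comparison is the share of logged searches whose Hits value is 0. A minimal sketch of that arithmetic follows; `no_hit_percentage` is a hypothetical helper, and the hit counts are made up for illustration, not taken from the logs in the talk.

```python
def no_hit_percentage(hit_counts):
    """Percentage of searches that returned zero hits."""
    if not hit_counts:
        return 0.0  # avoid dividing by zero for an empty log
    zero_hit = sum(1 for hits in hit_counts if hits == 0)
    return 100.0 * zero_hit / len(hit_counts)


# Illustrative Hits values per term (made up, not the presenter's data).
terms = {
    "Term A": [0, 3, 12, 0],
    "Term B": [5, 0, 1, 7, 2],
}
rates = {term: round(no_hit_percentage(hits), 1)
         for term, hits in terms.items()}
```

Computing the rate per term from the same filtered set of searches (same tab, same client type, staff addresses excluded) keeps the terms comparable.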
Improvements: Spelling
Spellchecking
Automatic searching of variant spellings
  “&” or “and”
  British vs. American spellings
  Numbers
  Abbreviations
Did you mean?
  Suggestions based on field
Improvements: Help
More granular no hits help
  Specific search types
  Any search with “conference” or “proceedings” in it
  Journal title searches including “vol.”, “no.”, or a number
  Searches with more than 4 or 5 words
More granular help for too many hits
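The no-hit conditions listed above amount to a few simple rules on the search type and search string. The sketch below shows one way such rules might be expressed. It is not a WebVoyage feature or API: `no_hit_help_hints` and the hint labels are hypothetical, and the word limit of 4 is one reading of "more than 4 or 5 words".

```python
import re


def no_hit_help_hints(search_type, search_string, max_words=4):
    """Return labels for the targeted help messages that might apply."""
    hints = []
    text = search_string.lower()

    # Any search mentioning a conference or proceedings.
    if re.search(r"\b(conference|proceedings)\b", text):
        hints.append("conference")

    # Journal title searches that include "vol.", "no.", or a number.
    is_journal = search_type.lower().startswith("journal title")
    if is_journal and (re.search(r"\b(vol|no)\.", text)
                       or re.search(r"\d", text)):
        hints.append("journal_citation")

    # Long searches (threshold is an assumption, see the lead-in).
    if len(text.split()) > max_words:
        hints.append("too_many_words")

    return hints


example = no_hit_help_hints(
    "Title keyword", "proceedings of the ieee conference on robotics")
```

Returning a list rather than a single label lets one search trigger more than one message, as in the example, which matches both the conference rule and the word-count rule.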
Improvements: Specific Searches
Keyword searches
  Automatic stemming
  Ignore punctuation and spacing
  Ignore stop words
Title searches
  Ignore initial article
Journal search results layout
Whether to include the index field in the journal title search results
  Same search results, but the order of the results is changed by the inclusion of the index field
  Primarily a problem for single-word titles that retrieve more than one screen of results (Science, Nature, etc.)
JALL Search Results
Results      Inside Library       Outside Library
             Count    Percent     Count    Percent
No hits      252      32.5%       449      32.7%
1 hit        144      18.6%       203      14.8%
2-5 hits     264      34.0%       557      40.6%
5-50 hits    76       9.8%        126      9.2%
51+ hits     40       5.2%        37       2.7%
Total        776                  1372
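A breakdown like this can be produced by grouping each search's Hits value into ranges and dividing by the total. The sketch below is an assumption-laden illustration: `hit_bucket` and `bucket_summary` are hypothetical names, the input values are made up, and because the printed ranges overlap at 5, a search with exactly 5 hits is placed in "2-5 hits" here.

```python
from collections import Counter

# Labels copied from the table, in display order.
BUCKETS = ["No hits", "1 hit", "2-5 hits", "5-50 hits", "51+ hits"]


def hit_bucket(hits):
    """Map a hit count to one of the table's range labels."""
    if hits == 0:
        return "No hits"
    if hits == 1:
        return "1 hit"
    if hits <= 5:
        return "2-5 hits"   # ASSUMPTION: exactly 5 hits lands here
    if hits <= 50:
        return "5-50 hits"  # so this bucket effectively covers 6 to 50
    return "51+ hits"


def bucket_summary(hit_counts):
    """Return {label: (count, percent)} for every bucket."""
    counts = Counter(hit_bucket(h) for h in hit_counts)
    total = len(hit_counts)
    return {
        label: (counts[label],
                round(100.0 * counts[label] / total, 1) if total else 0.0)
        for label in BUCKETS
    }


# Illustrative Hits values only, not data from the talk.
summary = bucket_summary([0, 0, 1, 3, 5, 20, 75, 0])
```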
What Next?
Continued analysis of searches with no hits
Analysis of search repair strategies
Word counts
More Information
Jansen, Bernard J. “Search log analysis: What it is, what’s been done, how to do it,” Library & Information Science Research, 28 (2006) 407-432.
Yu, Holly and Margo Young. “The impact of Web search engines on subject searching in OPAC,” Information Technology and Libraries, 23 (2004) 168-180.