David Hawking, Funnelback's Chief Scientist, presented "The Search Master's Toolbox" at Online Information 2010 in London. The talk provided considerations and advice for website and marketing managers to apply to search solutions employed in their organisations. It highlighted the reasons why search is so vitally important to the overall success of a website and provided information on the tools required to deliver and optimise an effective search solution.
Online Information, London, 30 Nov 2010
The Search Master's Toolbox
David Hawking, Funnelback / Squiz
Funnelback’s UK Customers
From 2004/5: Staffordshire University, Scottish Care Commission
From 2009: The Electoral Commission, Digital UK, Hargreaves Lansdown
From 2010: LSE, The Electoral Commission, Incisive Media, British Medical Journal, East Ayrshire Council, Skype international, UCL, ...
“Search is life”
Costs of poor search
Butler Group: Up to 10% of salary costs wasted through ineffective search
IDC: A company with 1000 information workers wastes more than $5M p.a. due to poor search
Accenture: Survey of 1000 middle managers shows they spend up to 2 hrs/day searching
Econsultancy: Only 41% of companies satisfied that their site search is delivering on business objectives
ABC Shop: 24% increase in online sales after upgrade in search effectiveness
Search is a critical part of the web experience.
Who's the SearchMaster in your organisation?
Stakeholders expect every SearchMaster to do her duty!
To make external website search work
◦ Sales conversions
◦ Information dissemination
◦ Reduced inquiry handling load
To provide effective search of corporate information
◦ Happy, productive employees (plus students and other stakeholders)
Give them the tools and they will do the job!
[Diagram labels: Searchmaster, End-user; Simple, Powerful]
1. The basic search tool
Should:
◦ Have good performance out of the box, without weeks of implementation.
◦ Be simple to configure.
◦ Avoid features which are too complex to use or set up.
◦ Be able to cover your content and scale to the necessary level.
2. FineTuner
Every search deployment is different
◦ Web, database, fileshare, Lotus
The weighting of ranking features must accommodate the differences
Manual tweaking is fraught with danger
◦ Fix one query, break a dozen
Make a test file and use a tuning tool to learn feature weightings
Testfile Desiderata
Representative of real workload
◦ Need an unbiased sample
Many queries (typically >> 100)
Identify the best answer(s)
Equivalent answers
See es.csiro.au/C-TEST/
Spreadsheet testfile
Query                   Best answer
employment              health.gov.au/health-career-vacant.htm
jobs                    health.gov.au/health-career-vacant.htm
vacancies               health.gov.au/health-career-vacant.htm
recruitment             health.gov.au/health-career-vacant.htm
tenders                 health.gov.au/list-of-tenders-and-grants.htm
grants                  health.gov.au/list-of-tenders-and-grants.htm
tenders                 health.gov.au/list-of-tenders-and-grants.htm
mental health           health.gov.au/mental-health-and-wellbeing
mental health strategy  health.gov.au/mental-strategy
aged care               health.gov.au/aged-care.htm
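A testfile like the one above can be scored automatically. The following is a minimal sketch, not the Funnelback tooling: it assumes the testfile has been read into (query, URL) pairs, treats several URLs for one query as equivalent answers, and measures the fraction of queries with an acceptable answer in the top k results. `toy_search` is a hypothetical stand-in for a call to a real search engine.

```python
from collections import defaultdict


def load_testfile(rows):
    """Group acceptable answer URLs by query; rows are (query, url) pairs."""
    answers = defaultdict(set)
    for query, url in rows:
        answers[query.strip().lower()].add(url.strip())
    return dict(answers)


def success_at_k(answers, search, k=1):
    """Fraction of test queries with an acceptable answer in the top k results."""
    if not answers:
        return 0.0
    hits = sum(
        1
        for query, good in answers.items()
        if any(url in good for url in search(query)[:k])
    )
    return hits / len(answers)


# A few rows taken from the spreadsheet testfile above.
ROWS = [
    ("employment", "health.gov.au/health-career-vacant.htm"),
    ("jobs", "health.gov.au/health-career-vacant.htm"),
    ("tenders", "health.gov.au/list-of-tenders-and-grants.htm"),
    ("aged care", "health.gov.au/aged-care.htm"),
]


def toy_search(query):
    # Hypothetical engine: returns a ranked list of URLs for a query.
    index = {
        "employment": ["health.gov.au/health-career-vacant.htm"],
        "jobs": [
            "health.gov.au/mental-strategy",
            "health.gov.au/health-career-vacant.htm",
        ],
        "tenders": ["health.gov.au/list-of-tenders-and-grants.htm"],
    }
    return index.get(query, [])


ANSWERS = load_testfile(ROWS)
SCORE_AT_1 = success_at_k(ANSWERS, toy_search, k=1)
SCORE_AT_2 = success_at_k(ANSWERS, toy_search, k=2)
```

A single number of this kind is what lets a tuning tool compare one configuration against another over the whole testfile instead of one query at a time.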
Sources of testfiles at LSE
A-Z Sitemap (>500 entries)
◦ Biased toward anchortext
Keymatches file (>500 entries)
◦ Pessimistic
Click data (>250 queries with > t clicks)
◦ Biased toward clicks – 100% success!
Popular/critical queries (134 manually judged)
All biased – Use a sampling tool!
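The slides do not say how the sampling tool works. One plausible approach, sketched here as an assumption, is to sample from the raw query log so that each distinct query is picked with probability proportional to how often it was submitted. The function name and the log format (one string per submission) are hypothetical.

```python
import random


def sample_queries(query_log, n, seed=0):
    """Pick up to n distinct queries, favouring frequently submitted ones.

    Shuffling the individual submissions and keeping the first n distinct
    queries samples without replacement, weighted by submission frequency.
    """
    rng = random.Random(seed)
    entries = [q.strip().lower() for q in query_log if q.strip()]
    rng.shuffle(entries)
    sample = []
    seen = set()
    for query in entries:
        if query not in seen:
            seen.add(query)
            sample.append(query)
        if len(sample) == n:
            break
    return sample


LOG = ["jobs"] * 5 + ["tenders"] * 2 + ["aged care"]
SAMPLE = sample_queries(LOG, 2)
```

The sampled queries still need their best answers judged by a person; the sampling only removes the bias in which queries get chosen.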
Dimension-at-a-time tuning
[Diagram labels: 1, 2, 3; axes dim1, dim2]
Popular/Critical Set
[Chart series: Out of box; As configured; -daat (tuned); -daat20000 (tuned); -daat0/TAAT (tuned). Axis ticks 0 to 30; plotted values not recoverable.]
Fine Tuning Summary
Tuning a large number of dimensions (Funnelback FineTune covers 38)
Millions of query executions
Achieves substantial gains
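The idea of dimension-at-a-time tuning can be illustrated with a simple coordinate-wise search: vary one weight over a set of candidate values while the others stay fixed, keep any improvement, and repeat until a full pass changes nothing. This is a generic sketch and not a description of how Funnelback FineTune is implemented. The feature names `title` and `anchor`, the candidate grid, and `toy_score` are all hypothetical; in practice the score would be a testfile success measure, and every evaluation means running the whole testfile, which is where the millions of query executions come from.

```python
def tune_dimension_at_a_time(score, start, candidates, max_passes=5):
    """Coordinate-wise search over weight settings.

    score:      function mapping a dict of weights to a number (higher is better)
    start:      initial dict of weights
    candidates: dict mapping each dimension name to the values to try
    """
    best = dict(start)
    best_score = score(best)
    for _ in range(max_passes):
        improved = False
        for dim, values in candidates.items():
            for value in values:
                trial = {**best, dim: value}
                trial_score = score(trial)
                if trial_score > best_score:
                    best, best_score = trial, trial_score
                    improved = True
        if not improved:
            break  # a full pass found nothing better
    return best, best_score


def toy_score(weights):
    # Hypothetical objective that peaks at title=2.0, anchor=0.5.
    return -((weights["title"] - 2.0) ** 2) - ((weights["anchor"] - 0.5) ** 2)


GRID = {"title": [0.0, 1.0, 2.0, 3.0], "anchor": [0.0, 0.5, 1.0]}
BEST, BEST_SCORE = tune_dimension_at_a_time(
    toy_score, {"title": 1.0, "anchor": 1.0}, GRID
)
```

Because the score is computed over the whole testfile, a change that fixes one query but breaks a dozen others is rejected automatically.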
But why do queries still fail?
Misspelled
◦ Onlion Imformation
Query words don't match document
◦ “door” or “MOPEM” v. “manually operated personnel egress mechanism”
There is no answer to that question.
◦ Maybe there should be
◦ Scope issues.
Need more tools!
3. Spelling suggestion tools
Suggestions may be useful even if words are correctly spelled:
◦ Manchester Untied → Chelsea
Suggestions based on whole query, not word-by-word
Don't suggest queries which make no sense in the collection being searched
Autocompletion: Guide users to the best query
Context is king
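Two of the points above, suggesting on the whole query and only suggesting what makes sense in the collection, can be sketched by matching the complete query string against queries already known to work for this collection. This is an illustrative assumption, not the method used by any particular product. It uses Python's standard `difflib`, and the list of known queries and the cutoff value are hypothetical.

```python
import difflib


def suggest_query(query, known_queries, cutoff=0.8):
    """Return the closest known query to the whole input query, or None.

    known_queries is a list of queries known to produce good results in
    this collection, so a suggestion always makes sense in context.
    """
    normalised = " ".join(query.lower().split())
    if normalised in known_queries:
        return None  # already a known query, nothing to suggest
    matches = difflib.get_close_matches(
        normalised, known_queries, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


KNOWN = ["online information", "manchester united", "mental health strategy"]
FIXED = suggest_query("Onlion Imformation", KNOWN)
UNTIED = suggest_query("Manchester Untied", KNOWN)
```

Note that "Manchester Untied" contains only correctly spelled words, so a word-by-word dictionary check would not flag it; comparing the whole query against the collection's own queries does.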
4. Query expansion tools
Manual rules:
◦ Rego → [registration rego]
◦ MOPEM → [“manually operated personnel egress mechanism door”]
Related queries (automatic)
◦ Based on co-clicking
Contextual navigation (on-the-fly)
◦ Finding superphrases in a deep result set
Faceting (semi-automatic)
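The manual rules above amount to a lookup table applied to each query term. A minimal sketch follows, assuming rules are keyed on single lower-cased terms and that the engine treats the expanded list as alternatives; both are assumptions, and the rule table simply mirrors the two examples on the slide.

```python
def expand_query(query, rules):
    """Replace each term that has a manual rule with its expansion terms."""
    expanded = []
    for term in query.split():
        expanded.extend(rules.get(term.lower(), [term]))
    return expanded


# Rules copied from the slide; the MOPEM expansion is kept as one phrase,
# exactly as written there.
RULES = {
    "rego": ["registration", "rego"],
    "mopem": ["manually operated personnel egress mechanism door"],
}

EXPANDED = expand_query("car rego renewal", RULES)
```

Terms without a rule pass through unchanged, so the rule file only needs entries for the vocabulary mismatches that actually show up in the query reports.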
5. Reporting and alerting tools
Reporting on queries which:
◦ Produced no results
◦ Logged behaviour suggestive of unfulfilment
Alerting when:
◦ Submissions of a query (or group of related queries) sharply increase in frequency
For:
◦ Business intelligence
◦ Triggering creation or changes to content
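A report on queries that produced no results can be as simple as a frequency count over the query log. The sketch below assumes a log of (query, result count) pairs, which is a hypothetical format; real search logs will differ.

```python
from collections import Counter


def zero_result_report(log, top=10):
    """Most frequent queries that returned no results.

    log holds (query, result_count) pairs, one per submitted query.
    """
    counts = Counter(
        query.strip().lower() for query, n_results in log if n_results == 0
    )
    return counts.most_common(top)


LOG = [("MOPEM", 0), ("mopem", 0), ("jobs", 12), ("rego", 0)]
REPORT = zero_result_report(LOG)
```

The top entries of such a report are natural candidates for new content or for a manual query expansion rule.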
Query spike alerting
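One simple way to detect a sharp increase, offered here as an assumption since the slides do not give a method, is to compare today's submission count for a query with the mean and spread of its recent daily counts. The threshold, the minimum count, and the floor on the spread are hypothetical tuning values.

```python
from statistics import mean, pstdev


def is_spike(history, today, min_count=10, threshold=3.0):
    """Flag today's count if it is far above the trailing daily counts.

    history: list of daily submission counts for one query (or query group)
    today:   today's submission count
    """
    if not history or today < min_count:
        return False
    baseline = mean(history)
    # Floor the spread so a flat history does not turn tiny changes into alerts.
    spread = max(pstdev(history), 1.0)
    return (today - baseline) / spread >= threshold


QUIET_WEEK = [4, 5, 6, 5, 4, 6, 5]
ALERT = is_spike(QUIET_WEEK, 40)
NO_ALERT = is_spike(QUIET_WEEK, 6)
```

Grouping related queries before counting, as the slide suggests, would mean summing the daily counts of the group and applying the same check to the total.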
Conclusions
Search is important
Organisations benefit when someone takes responsibility for effective search – the SearchMaster.
The core search tool must be effective, and able to be adapted to your organisation's publishing and searching characteristics.
Further tools are needed to overcome poor queries and missing content.
Thanks to Mike Swanson of Oxfam Australia for the Ned Kelly line.