The Searchmaster's Toolbox - David Hawking, Funnelback Search


DESCRIPTION

David Hawking, pre-eminent information retrieval researcher and Funnelback's Chief Scientist, gave this talk on the need for a Search Master within all but the smallest organisations at a Funnelback Seminar in London on March 31st, 2010. Even if there isn't an individual with that specific job title, the responsibility for maintaining, improving and monitoring search needs to be prioritised and clearly assigned. David's presentation covers the reasons why search is so vitally important and the tools which can improve search results.


The SearchMaster's Toolbox

Westminster Abbey 31 March 2010

David Hawking

Funnelback Overview

Hosted (SaaS) search
• Web sites
• Portals

Enterprise search
• Intranets/Web
• Databases
• File shares
• Corporate repositories (e.g. Lotus, TRIM, SharePoint and others)

Search Applications
• Geospatial
• Plagiarism detection
• E-commerce

Contracted Development & Professional Services

UK Customers
• From pre-2009: Staffordshire University, Scottish Care Commission
• From 2009: The Electoral Commission, Digital UK, Hargreaves Lansdown
• From 2010: London School of Economics and Political Science, Incisive Media, British Medical Journal, East Ayrshire Council, ...

First UK Customer - 2004

 

Funnelback Cloud

Instant scalability and redundancy

A fully managed search service that’s instantly scalable and redundant through our specialised search cloud environment. Funnelback Cloud can be deployed in days, not months. For web sites and online catalogues, complete control over the look and feel allows a seamless experience for your users.

Dedicated and fully managed

A fully managed search service hosted on dedicated hardware that’s specially architected to meet the most challenging search environments. For web sites and online catalogues, complete control over the look and feel allows a seamless experience for your users.

Self managed

Complete control over the Funnelback search software on a self-managed Windows or Linux platform. Our search experts can help your organisation search across web sites, intranets, databases, file shares, SharePoint sites, TRIM, staff directories and other repositories in a single query.

How we deliver search

Funnelback Enterprise

Funnelback Hosted

ABC

“We have been extremely happy with the product and the overall service. We have significant search volume and Funnelback has stood up to the test”

Mary Ramsay National Manager Interactive Services 

Funnelback Enterprise

[Diagram of content sources: XML; relational databases (e.g. Oracle, MS SQL); web sites; shared drives (G:); intranet; images; corporate repositories (e.g. Lotus, TRIM, SharePoint)]

“Funnelback has taken our complex internal information environment and made it accessible”

Mike Swanson, Oxfam, Knowledge and Information Services Team Leader

“Search is life”

Costs of poor search
• Butler Group: Up to 10% of salary costs wasted through ineffective search
• IDC: A company with 1000 information workers can expect to waste more than $5M p.a. due to poor search
• Accenture: A survey of 1000 middle managers found that they spend as long as 2 hrs/day searching for information.

Who's the SearchMaster in your organisation?

Stakeholders expect every SearchMaster to do her duty!

• To make external website search work
  – Sales conversions
  – Information dissemination
  – Reduced inquiry handling load
• To provide effective search of corporate information
  – Happy, productive employees (plus students and other stakeholders)

Give them the tools and they will do the job!

• Searchmaster
• End-user

• Simple
• Powerful

1. The basic search tool
• Coverage and scale
• Quick and easy to install
• Good out-of-the-box performance
  – It can find the answers people want
• Simple to configure
  – Avoid features which are too complex to use or set up.

Time to set up Funnelback
• As little as a couple of minutes to install the product.
• As little as a couple of minutes to create a collection and start updating
• A couple of hours to customise look and feel
• A few minutes to activate contextual navigation, faceting and featured pages.

Perceived Funnelback advantages

• Tried and tested algorithm (20 yrs development)
• Control from admin interface
• Customisability of business logic / Open APIs
• Flexible result presentation
• Professional services model
• Open published price list

2. FineTuning
• Every search deployment is different
  – Web, database, fileshare, Lotus
• The weighting of ranking features must accommodate the differences
• Manual tweaking is fraught with danger
  – Fix one query, break a dozen
• Make a test file and use Funnelback FineTune
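The idea of tuning against a test file, rather than tweaking weights by hand for one query at a time, can be sketched as follows. This is a minimal illustration and not Funnelback FineTune itself: the two ranking features, their scores, the grid search and all function names are assumptions made for the example.

```python
from itertools import product

# Hypothetical per-query feature scores: (content match, anchor-text match).
# A real engine has many more ranking features than these two.
FEATURES = {
    "jobs": {
        "health.gov.au/aged-care.htm": (0.6, 0.1),
        "health.gov.au/health-career-vacant.htm": (0.4, 0.9),
    },
    "aged care": {
        "health.gov.au/health-career-vacant.htm": (0.3, 0.2),
        "health.gov.au/aged-care.htm": (0.9, 0.8),
    },
}

# Test file: query -> the answer expected at rank 1.
TESTFILE = {
    "jobs": "health.gov.au/health-career-vacant.htm",
    "aged care": "health.gov.au/aged-care.htm",
}


def top_result(query, weights):
    """Return the document with the highest weighted feature sum."""
    docs = FEATURES[query]
    return max(docs, key=lambda d: sum(w * f for w, f in zip(weights, docs[d])))


def failure_rate(weights):
    """Fraction of test queries whose expected answer is not ranked first."""
    failures = sum(top_result(q, weights) != a for q, a in TESTFILE.items())
    return failures / len(TESTFILE)


def tune(steps=5):
    """Try every weight setting on a small grid and keep the best one."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(product(grid, repeat=2), key=failure_rate)


best = tune()
print(best, failure_rate(best))
```

Because every candidate setting is scored against the whole test file, a change that fixes one query but breaks others shows up immediately as a worse failure rate.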

Spreadsheet testfile

Query                     Right answer
employment                health.gov.au/health-career-vacant.htm
jobs                      health.gov.au/health-career-vacant.htm
vacancies                 health.gov.au/health-career-vacant.htm
recruitment               health.gov.au/health-career-vacant.htm
tenders                   health.gov.au/list-of-tenders-and-grants.htm
grants                    health.gov.au/list-of-tenders-and-grants.htm
tenders                   health.gov.au/list-of-tenders-and-grants.htm
mental health             health.gov.au/mental-health-and-wellbeing
mental health strategy    health.gov.au/mental-strategy
aged care                 health.gov.au/aged-care.htm
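A test file like this could be read into a mapping from each query to its right answers. The sketch below assumes a tab-separated export of the spreadsheet; the actual file format expected by FineTune is not given in the slides, and `load_testfile` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# A few rows from the slide, written as tab-separated text (assumed format).
SAMPLE = (
    "employment\thealth.gov.au/health-career-vacant.htm\n"
    "jobs\thealth.gov.au/health-career-vacant.htm\n"
    "tenders\thealth.gov.au/list-of-tenders-and-grants.htm\n"
    "mental health\thealth.gov.au/mental-health-and-wellbeing\n"
)


def load_testfile(handle):
    """Map each query to the set of URLs accepted as right answers."""
    answers = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        query, url = row[0].strip().lower(), row[1].strip()
        answers[query].add(url)
    return dict(answers)


testfile = load_testfile(io.StringIO(SAMPLE))
print(len(testfile), "queries loaded")
```

Using a set per query means a repeated row is harmless and a query can carry more than one right answer.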

Testfile Desiderata
• Representative of real workload
  – Unbiased sample
  – Good estimate of actual performance
  – Best tuning
• Many queries (e.g. > 100)
• Multiple weighted answers (where applicable)
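One way to score results against multiple weighted answers is sketched below. The weights (1.0 for the best answer, 0.5 for an acceptable one), the top-k cut-off, the `example.org` URL and the function names are all assumptions for illustration; the slides do not say how FineTune scores a query.

```python
def query_score(ranked_urls, weighted_answers, k=10):
    """Best answer weight found in the top k, as a fraction of the maximum."""
    best_possible = max(weighted_answers.values())
    found = [weighted_answers[u] for u in ranked_urls[:k] if u in weighted_answers]
    return max(found, default=0.0) / best_possible


def failure_rate(results, testfile, k=10):
    """Fraction of test queries with no right answer in the top k results."""
    failed = sum(
        query_score(results.get(q, []), answers, k) == 0.0
        for q, answers in testfile.items()
    )
    return failed / len(testfile)


# Hypothetical weights: 1.0 = best answer, 0.5 = acceptable answer.
testfile = {
    "mental health": {
        "health.gov.au/mental-health-and-wellbeing": 1.0,
        "health.gov.au/mental-strategy": 0.5,
    },
    "aged care": {"health.gov.au/aged-care.htm": 1.0},
}

# Hypothetical ranked results returned by some search engine.
results = {
    "mental health": ["health.gov.au/mental-strategy", "example.org/other"],
    "aged care": ["example.org/other"],
}

print(query_score(results["mental health"], testfile["mental health"]))
print(failure_rate(results, testfile))
```

Here the first query earns partial credit for returning the acceptable answer, while the second counts as a failure.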

Break to Demo of FineTuning (based on popular/important queries and data from LSE)

[Chart comparing configurations: Out of box; As configured; -daat (tuned); -daat20000 (tuned); -daat0/TAAT (tuned). Axis scale 0 to 30.]

Other LSE Testfiles
• Click logs – Right answers are those that users clicked
  – Possible to tune to 0% failure rate (unrealistic)
• From keymatches file created for previous search technology (it had 709 rules)
  – Failure rate is higher (unrealistic)
• Random sample is the preferred approach
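Drawing a random sample from the query log might look like the sketch below. The log contents and the `sample_queries` helper are hypothetical; the point is only that sampling log entries (not distinct query strings) keeps frequent queries in proportion to the real workload.

```python
import random


def sample_queries(query_log, n, seed=0):
    """Draw n log entries uniformly at random without replacement."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(query_log, min(n, len(query_log)))


# Hypothetical log: one entry per query submitted by a user.
query_log = ["jobs"] * 6 + ["tenders"] * 3 + ["aged care"]

sample = sample_queries(query_log, n=5)
print(sample)
```

Each sampled query then needs its right answers assigned by a person, which is what keeps the sample unbiased compared with click logs or keymatch rules.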

FineTuning Summary
• Tuning 38 dimensions
• Millions of query executions
• Achieves substantial gains
• Contact support@funnelback.com

But why do some queries still fail?

• Misspelled
  – Westminister Abbie
• Query words don't match language
  – “door” or “MOPEM” v. “manually operated personnel egress mechanism”
• There is no answer to that question.
  – Maybe there should be
  – Scope issue?
• How information is published.

Need more tools!

3. Spelling suggestion tools
• Suggestions may be useful even if words are correctly spelled:
  – Carlton furball club → Carlton football club
• Suggestions based on whole query, not word-by-word
• Don't suggest queries which make no sense in the collection being searched
• Autocompletion: Guide users to the best query
• Context is king
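A whole-query suggester restricted to queries that make sense in the collection can be sketched with Python's standard `difflib`. This is an illustration only, not Funnelback's method: the list of known queries, the similarity cut-off and the `suggest` function are assumptions.

```python
import difflib

# Hypothetical list of queries known to make sense in the collection,
# e.g. harvested from page titles or from past queries that led to clicks.
KNOWN_QUERIES = [
    "carlton football club",
    "carlton cricket club",
    "westminster abbey",
]


def suggest(query, known=KNOWN_QUERIES, cutoff=0.8):
    """Suggest the closest known query, comparing whole query strings."""
    normalised = " ".join(query.lower().split())
    if normalised in known:
        return None  # already a sensible query; nothing to suggest
    matches = difflib.get_close_matches(normalised, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest("Carlton furball club"))
print(suggest("Westminister Abbie"))
```

Because the comparison is on the whole query, "furball" is corrected even though it is a correctly spelled word, and only queries drawn from the collection can ever be suggested.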

4. Query expansion tools
• Manual rules:
  – Rego → [registration rego]
  – MOPEM → [“manually operated personnel egress mechanism door”]
• Related queries (automatic)
  – Based on co-clicking
• Contextual navigation (on-the-fly)
  – Finding superphrases in a deep result set
• Faceting (semi-automatic)
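The manual rules above can be sketched as a simple lookup applied to each query word. The rule table and the `expand` function are illustrative and do not reflect Funnelback's configuration format; reading the MOPEM rule as a quoted phrase plus the word "door" is also an assumption.

```python
# Manual expansion rules based on the slide. Square brackets in the output
# stand for "any of these alternatives", following the slide's notation.
RULES = {
    "rego": ["registration", "rego"],
    # Assumed reading of the slide: a phrase plus the everyday word "door".
    "mopem": ['"manually operated personnel egress mechanism"', "door"],
}


def expand(query, rules=RULES):
    """Replace each query word that has a rule with a bracketed group."""
    expanded = []
    for word in query.split():
        alternatives = rules.get(word.lower())
        if alternatives:
            expanded.append("[" + " ".join(alternatives) + "]")
        else:
            expanded.append(word)
    return " ".join(expanded)


print(expand("car rego renewal"))
```

Words without a rule pass through unchanged, so the expansion only affects the vocabulary mismatches the SearchMaster has chosen to bridge.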

Related queries
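Related queries based on co-clicking can be sketched as grouping queries by the URLs users clicked. The click log below is hypothetical (it reuses queries and URLs from the test file slide), and `related_queries` is an illustrative helper, not Funnelback's implementation.

```python
from collections import defaultdict

# Hypothetical click log: (query, clicked URL) pairs.
CLICKS = [
    ("jobs", "health.gov.au/health-career-vacant.htm"),
    ("vacancies", "health.gov.au/health-career-vacant.htm"),
    ("recruitment", "health.gov.au/health-career-vacant.htm"),
    ("tenders", "health.gov.au/list-of-tenders-and-grants.htm"),
    ("grants", "health.gov.au/list-of-tenders-and-grants.htm"),
]


def related_queries(clicks):
    """Treat two queries as related when users clicked the same URL for both."""
    queries_by_url = defaultdict(set)
    for query, url in clicks:
        queries_by_url[url].add(query)
    related = defaultdict(set)
    for queries in queries_by_url.values():
        for query in queries:
            related[query] |= queries - {query}
    return {q: sorted(r) for q, r in related.items()}


print(related_queries(CLICKS)["jobs"])
```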

Tools to tell you when you need to add, change or reorganize content.

5. Reporting Tools
• Queries that produced no results
• Click patterns that suggest user wasn't happy
• Pattern analysis – query spike
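Two of these reports can be sketched from a plain query log: a count of queries that returned nothing, and a crude spike detector. The log rows, dates, threshold and function names are all hypothetical, and the spike rule (volume at least three times the average of preceding days) is an assumption, not the Pattern Analyser's method.

```python
from collections import Counter

# Hypothetical query log rows: (day, query, number of results returned).
LOG = [
    ("2010-03-01", "lpg", 12), ("2010-03-02", "lpg", 12),
    ("2010-03-03", "lpg", 12), ("2010-03-04", "lpg", 12),
    ("2010-03-04", "lpg", 12), ("2010-03-04", "lpg", 12),
    ("2010-03-04", "lpg", 12), ("2010-03-04", "mopem", 0),
    ("2010-03-04", "mopem", 0),
]


def zero_result_queries(log):
    """Count how often each query returned no results."""
    return Counter(q for _, q, hits in log if hits == 0)


def spike_days(log, query, factor=3.0):
    """Flag days where a query's volume is at least `factor` times its
    average volume over the preceding days on which it appeared."""
    daily = Counter(day for day, q, _ in log if q == query)
    days = sorted(daily)
    spikes = []
    for i, day in enumerate(days[1:], start=1):
        baseline = sum(daily[d] for d in days[:i]) / i
        if daily[day] >= factor * baseline:
            spikes.append(day)
    return spikes


print(zero_result_queries(LOG).most_common())
print(spike_days(LOG, "lpg"))
```

A zero-result query such as "mopem" points to missing content or a missing expansion rule, while a spike suggests content that may need to be added or promoted quickly.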

Reporting

Query Spike - LPG

Pattern Analyser

Pattern Analyser - Timeplot

Conclusions
• Search is important

• Organisations benefit when someone takes responsibility for effective search – the SearchMaster.

• A good search tool provides a full kit of simple, effective tools to help the SearchMaster achieve just that.
