How We Incrementally Improved Search Ravi Mynampaty @ravimynampaty

How We Incrementally Improved Search


DESCRIPTION

An oldie but a goldie. Some overlap with a different presentation, but this one stands alone.


Page 1: How We Incrementally Improved Search

How We Incrementally Improved Search

Ravi Mynampaty @ravimynampaty

Page 2: How We Incrementally Improved Search

Agenda

Background

• Roadmap

• Implementation

• Analytics

• Benefits

• Challenges

• Next Steps

Page 3: How We Incrementally Improved Search

Background: A few years ago…

• Out-of-the-box Ultraseek

• No optimization, no customization

• Fraction of HBS content indexed / searchable

• Many dead ends

• Proliferation of different search tools

• User sentiment

• “search sucks”

• “why can’t it be more like Google”

Page 4: How We Incrementally Improved Search

Background: Our Vision

• One Search Box to Rule Them All

• The long term goal: enterprise search

• One-stop searching

• Google-like simplicity

• Handle refinement / navigation on results page

Page 5: How We Incrementally Improved Search

Agenda

• Background

Roadmap

• Implementation

• Analytics

• Benefits

• Challenges

• Next Steps

Page 6: How We Incrementally Improved Search

Roadmap: Preliminary Steps

• Inventory document collections

• Inventory search-type tools

• Of the above, identify

– most heavily used

– strategically significant

– high impact

– Low Hanging Fruit (LHF)

Page 7: How We Incrementally Improved Search

Roadmap: Implementation Plan

• Prioritize tasks by ease of content access and implementation (LHF)

• Develop timeline

• Build prototypes and iterate the design

Page 8: How We Incrementally Improved Search

Agenda

• Background

• Roadmap

Implementation

• Analytics

• Benefits

• Challenges

• Next Steps

Page 9: How We Incrementally Improved Search

Implementation: How we built it

• Customized Ultraseek’s results display code

• Worked with owners of software apps

– Provided JSON APIs

– Allowed us to spider their app/repository

• HTML is the API!!

• In other words:

No rocket science involved

Page 10: How We Incrementally Improved Search

Implementation: Three Integration Approaches

• Blended Search (e.g., Faculty/Staff Directory)

• Brokered Query (e.g., Video Catalog)

• Query Resubmit (e.g., Alumni Directory)

Page 11: How We Incrementally Improved Search

Implementation: Blended Search

Spider HBS web content outside of HBS.EDU

• Harbus.org (student newspaper)

• Club and affiliated sites

Spider HBS content located in other applications

• Faculty and staff phone book

• Alumni Class Notes application

Page 12: How We Incrementally Improved Search

Implementation: Optimize and clean up search indexes

Work with content owners to create good HTML page titles

• Faculty Publications pages

• 20th Century Leadership database

• Address MS-Office / PDF files too

Eliminate duplicate search results / use filters

Adjust relevance per collection / source / file path
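The last two items, removing duplicates and adjusting relevance per collection, can be illustrated outside any particular engine. The following is a minimal sketch, not Ultraseek's own mechanism: the weights, the collection names, the result fields (`url`, `collection`, `score`) and the URL normalization rules are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical per-collection multipliers; real values would be tuned by hand.
COLLECTION_WEIGHTS = {"faculty": 1.5, "news": 1.0, "archive": 0.6}


def normalize_url(url):
    """Reduce a URL to a key so trivially different copies compare equal.

    Assumption: case and a trailing "/" or "/index.html" do not matter.
    """
    parts = urlsplit(url.lower())
    path = parts.path.rstrip("/")
    if path.endswith("/index.html"):
        path = path[: -len("/index.html")]
    return parts.netloc + path


def rerank(results):
    """Drop duplicate URLs, then sort by collection-weighted score."""
    best = {}
    for r in results:
        weighted = r["score"] * COLLECTION_WEIGHTS.get(r["collection"], 1.0)
        key = normalize_url(r["url"])
        # Keep only the highest-scoring copy of each normalized URL.
        if key not in best or weighted > best[key]["weighted"]:
            best[key] = {**r, "weighted": weighted}
    return sorted(best.values(), key=lambda r: r["weighted"], reverse=True)
```

With these assumed weights, a faculty page with a raw score of 0.5 outranks an archive page with a raw score of 0.9, which is the kind of per-collection adjustment the slide describes.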

Page 13: How We Incrementally Improved Search

Implementation: Create Best Bets

Top 10 Queries, Oct – Dec
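Best Bets are usually chosen by looking at the most frequent queries and hand-picking a result for each. A rough sketch of that loop, assuming a query log with one query per line; the `BEST_BETS` table, its single entry and the URL are hypothetical placeholders, not data from the slides.

```python
from collections import Counter


def top_queries(log_lines, n=10):
    """Return the n most frequent queries, ignoring case and extra spaces."""
    counts = Counter()
    for line in log_lines:
        query = " ".join(line.lower().split())
        if query:
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical editor-curated table: normalized query -> (title, URL).
BEST_BETS = {
    "academic calendar": ("Academic Calendar", "https://www.example.edu/calendar"),
}


def best_bet_for(query):
    """Look up a curated result to show above the organic results, if any."""
    return BEST_BETS.get(" ".join(query.lower().split()))
```

Normalizing the query the same way in both functions means a Best Bet entered for a top query also matches its capitalization and spacing variants.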

Page 14: How We Incrementally Improved Search

Implementation: Unify Blended Search + Query Resubmit

Page 15: How We Incrementally Improved Search

Query refinement options

(Blended Search)

Query resubmit options

“Integration-lite”
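Query resubmit is "integration-lite" because nothing is indexed or fetched: the results page only offers links that re-run the same query in another tool. A minimal sketch, assuming each target tool accepts the query as a URL parameter; the tool names, hosts and parameter names below are made up for illustration.

```python
from urllib.parse import quote_plus

# Hypothetical target search tools; {q} marks where the encoded query goes.
RESUBMIT_TARGETS = {
    "Alumni Directory": "https://alumni.example.edu/search?name={q}",
    "Library Catalog": "https://library.example.edu/find?query={q}",
}


def resubmit_links(query):
    """Build (label, URL) pairs that re-run the query in each other tool."""
    encoded = quote_plus(query)  # escapes spaces, "&", quotes, etc.
    return [(label, template.format(q=encoded))
            for label, template in RESUBMIT_TARGETS.items()]
```

Encoding the query before substituting it keeps characters such as `&` from being read as extra URL parameters by the target tool.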

Page 16: How We Incrementally Improved Search

Implementation: Expanding the Net w/ Brokered Search

• When direct indexing isn’t practical

– Harvard.edu search

– HBS VideoTools (intranet only)

– MBA Event Calendar (intranet only)

• A query is handed off to another search engine

• Results are returned “behind the scenes” as JavaScript Object Notation (JSON) / Python

• Ajax-like support of asynchronous search processes
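The brokered pattern described above (hand the query to other engines, get JSON back, do it asynchronously) can be sketched in a few lines. This is an illustration under stated assumptions, not the HBS implementation: the two `fake_*` functions stand in for remote engines, and the `{"results": [...]}` response shape is invented for the example.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def fake_video_search(query):
    """Stand-in for a remote engine; a real broker would make an HTTP call."""
    return json.dumps({"results": [{"title": f"Video about {query}"}]})


def fake_calendar_search(query):
    """Stand-in for a remote engine that happens to be down."""
    raise ConnectionError("calendar search unavailable")


def brokered_search(query, engines):
    """Send the query to every engine in parallel and collect parsed JSON."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        for name, future in futures.items():
            try:
                results[name] = json.loads(future.result())["results"]
            except Exception:
                # One broken source should not break the whole results page.
                results[name] = []
    return results
```

Calling `brokered_search("strategy", {"video": fake_video_search, "calendar": fake_calendar_search})` yields the video hit and an empty list for the failing calendar engine, so the page can still render the sources that did answer.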

Page 17: How We Incrementally Improved Search

Implementation: Brokered Query in Action

Page 18: How We Incrementally Improved Search

Implementation: Brokered Query in Action

Page 19: How We Incrementally Improved Search

Implementation: Brokered Query in Action

Page 20: How We Incrementally Improved Search

Implementation: One-offs

• Software Dev Docs (cmd line)

$ find ./software/docs -name '*html' | xargs grep -i oracle | less

(returns 100s of docs)

• Built web-based search UI
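The slides do not show how the web-based UI searched the docs. As one possible back end, the `find | grep` pipeline could be expressed as a function that a web handler calls; this is only a sketch, and the function name and the ranking by match count are assumptions added for the example.

```python
from pathlib import Path


def search_docs(root, term):
    """Return (path, matching_line_count) for HTML files mentioning term."""
    needle = term.lower()
    hits = []
    # "*html" mirrors the find pattern: any file name ending in "html".
    for path in sorted(Path(root).rglob("*html")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        count = sum(1 for line in text.splitlines() if needle in line.lower())
        if count:
            hits.append((str(path), count))
    # Files with the most matching lines first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Returning one entry per file, ordered by match count, addresses the "returns 100s of docs" problem better than a raw stream of matching lines.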

Page 21: How We Incrementally Improved Search

Agenda

• Background

• Roadmap

• Implementation

Analytics

• Benefits

• Challenges

• Next Steps

Page 22: How We Incrementally Improved Search

Analytics: Tracking Usage of Features

Page 23: How We Incrementally Improved Search

Analytics: Tracking Best Bets

Page 24: How We Incrementally Improved Search

Analytics: Tracking Best Bets

Page 25: How We Incrementally Improved Search

Agenda

• Background

• Roadmap

• Implementation

• Analytics

Benefits

• Challenges

• Next Steps

Page 26: How We Incrementally Improved Search

Benefits

• Single point of access for various repositories

• Shortcomings of underlying tools overcome

• Better access to content from rest of Harvard

• Traffic boost to e-commerce site

Page 27: How We Incrementally Improved Search

Agenda

• Background

• Roadmap

• Implementation

• Analytics

• Benefits

Challenges

• Next Steps

Page 28: How We Incrementally Improved Search

Challenges

• Search is never done

• Complex permissions issues

• SERP design convergence

• SharePoint

Page 29: How We Incrementally Improved Search

Agenda

• Background

• Roadmap

• Implementation

• Analytics

• Benefits

• Challenges

Next Steps

Page 30: How We Incrementally Improved Search

Next Steps

• Tackling the mixed-mode situation

• Integration with taxonomies

• Search experience within HBS applications

• Faceted search where rich metadata is available

• Analytics feeding website design and vocabulary development

Page 31: How We Incrementally Improved Search

Conclusion

• Tactical, iterative approach enabled significant progress

• Implementing simpler features/tweaks may have higher impact

• Your existing search engine may have more gas in it than you realize