An oldie but a goldie. Some overlap with a diff preso but this is a standalone.
How We Incrementally Improved Search
Ravi Mynampaty @ravimynampaty
Agenda
• Background
• Roadmap
• Implementation
• Analytics
• Benefits
• Challenges
• Next Steps
Background: A few years ago…
• Out-of-the-box Ultraseek
• No optimization, no customization
• Fraction of HBS content indexed / searchable
• Many dead ends
• Proliferation of different search tools
• User sentiment
– “search sucks”
– “why can’t it be more like Google”
Background: Our Vision
• One Search Box to Rule Them All
• The long-term goal: enterprise search
• One-stop searching
• Google-like simplicity
• Handle refinement / navigation on results page
Roadmap: Preliminary Steps
• Inventory document collections
• Inventory search-type tools
• Of the above, identify
– most heavily used
– strategically significant
– high impact
– Low Hanging Fruit (LHF)
Roadmap: Implementation Plan
• Prioritize tasks by ease of content access and implementation (LHF)
• Develop timeline
• Build prototypes and iterate the design
Implementation: How we built it
• Customized Ultraseek’s results display code
• Worked with owners of software apps
– Provided JSON APIs
– Allowed us to spider their app/repository
• HTML is the API!
• In other words:
No rocket science involved
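To make "HTML is the API" concrete, here is a minimal sketch of pulling a page title and its links out of raw HTML, which is roughly what a spider needs from an application that exposes no other interface. It uses only the Python standard library; the `PageExtractor` class, the `extract` helper, and the sample page are hypothetical and are not taken from the HBS implementation.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title> text and the href of every <a> tag on one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears inside <title> is kept.
        if self._in_title:
            self.title += data


def extract(html):
    """Return (title, links) for one HTML document given as a string."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


# Hypothetical page, made up for illustration.
SAMPLE = """<html><head><title> Faculty Directory </title></head>
<body><a href="/faculty/a">A</a> <a href="/faculty/b">B</a></body></html>"""

title, links = extract(SAMPLE)
```

The links give the spider its next pages to visit, and the title is what ends up as the headline of a search result.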
Implementation: Three Integration Approaches
• Blended Search (e.g., Faculty/Staff Directory)
• Brokered Query (e.g., Video Catalog)
• Query Resubmit (e.g., Alumni Directory)
Implementation: Blended Search
Spider HBS web content outside of HBS.EDU
• Harbus.org (student newspaper)
• Club and affiliated sites
Spider HBS content located in other applications
• Faculty and staff phone book
• Alumni Class Notes application
Implementation: Optimize and clean up search indexes
Work with content owners to create good HTML page titles
• Faculty Publications pages
• 20th Century Leadership database
• Address MS-Office / PDF files too
Eliminate duplicate search results / use filters
Adjust relevance per collection / source / file path
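The last two items, duplicate elimination and per-collection relevance adjustment, can be sketched as a small post-processing step over a result list. This is an illustration only: the deck does not say how Ultraseek was configured, and the result fields (`url`, `collection`, `score`), the `COLLECTION_BOOST` values, and the URL normalization rule are all assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical per-collection multipliers; the values are illustrative only.
COLLECTION_BOOST = {"faculty": 1.5, "news": 1.0, "archive": 0.5}


def normalize(url):
    """Map URLs that differ only in host case, trailing slash, or fragment to one key."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    key = parts.netloc.lower() + path
    if parts.query:
        key += "?" + parts.query
    return key


def rerank(results):
    """Boost each score by collection, keep the best copy of each page, sort descending."""
    best = {}
    for r in results:
        boosted = r["score"] * COLLECTION_BOOST.get(r["collection"], 1.0)
        key = normalize(r["url"])
        if key not in best or boosted > best[key]["score"]:
            best[key] = {**r, "score": boosted}
    return sorted(best.values(), key=lambda r: r["score"], reverse=True)


# Hypothetical results: the first two point at the same page.
results = [
    {"url": "http://www.example.edu/faculty/", "collection": "faculty", "score": 2.0},
    {"url": "http://WWW.example.edu/faculty#top", "collection": "news", "score": 2.5},
    {"url": "http://www.example.edu/old", "collection": "archive", "score": 4.0},
]
ranked = rerank(results)
```

In this sample the archive page starts with the highest raw score but drops below the faculty page once the collection multipliers are applied.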
Implementation: Create Best Bets
Top 10 Queries
Oct – Dec
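A Best Bet is a hand-picked result shown for a frequent query. A minimal sketch of the lookup, assuming a simple in-memory table keyed by a normalized query string, might look like this. The entries and URLs below are hypothetical, not the actual HBS Best Bets.

```python
# Hypothetical Best Bets table: normalized query -> (title, URL).
BEST_BETS = {
    "academic calendar": ("Academic Calendar", "https://example.edu/calendar"),
    "library hours": ("Library Hours", "https://example.edu/library/hours"),
}


def normalize_query(query):
    """Lowercase and collapse whitespace so trivial variants share one entry."""
    return " ".join(query.lower().split())


def best_bet(query):
    """Return the curated (title, URL) for a query, or None if there is none."""
    return BEST_BETS.get(normalize_query(query))


hit = best_bet("  Academic   CALENDAR ")
miss = best_bet("parking")
```

Starting from the top queries in the search logs keeps the table short while still covering a large share of searches.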
Implementation: Unify Blended Search + Query Resubmit
Query refinement options
(Blended Search)
Query resubmit options
“Integration-lite”
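Query resubmit can be as light as rendering links that replay the user's query against another application's own search page. The sketch below assumes each target accepts the query as a URL parameter; the target names, URLs, and parameter names are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical targets: label -> (search URL, name of the query parameter).
RESUBMIT_TARGETS = {
    "Alumni Directory": ("https://alumni.example.edu/search", "q"),
    "Library Catalog": ("https://library.example.edu/find", "query"),
}


def resubmit_links(query):
    """Build one (label, URL) pair per target, with the query safely URL-encoded."""
    return [
        (label, f"{base}?{urlencode({param: query})}")
        for label, (base, param) in RESUBMIT_TARGETS.items()
    ]


links = resubmit_links("jane doe & co")
```

Nothing is indexed or fetched on the search side, which is what makes this approach "integration-lite": the other application still does the searching and enforces its own permissions.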
Implementation: Expanding the Net w/ Brokered Search
• When direct indexing isn’t practical
– Harvard.edu search
– HBS VideoTools (intranet only)
– MBA Event Calendar (intranet only)
• A query is handed off to another search engine
• Results are returned “behind the scenes” as JavaScript Object Notation (JSON) / Python
• Ajax-like support of asynchronous search processes
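The brokered pattern described above can be sketched with `asyncio`: hand the query to several engines at once, parse whatever JSON comes back in time, and drop the engines that are too slow. This is an assumption-laden illustration, not the HBS code. `fake_backend` stands in for a real remote search engine, and the backend names, delays, and JSON shape are invented.

```python
import asyncio
import json


async def fake_backend(name, query, delay):
    """Stand-in for a remote search engine that replies with a JSON string."""
    await asyncio.sleep(delay)
    return json.dumps({"source": name, "hits": [f"{name} result for {query}"]})


async def broker(query, timeout=1.0):
    """Query all backends concurrently and keep only the answers that arrive in time."""
    # Hypothetical backends mapped to simulated response times in seconds.
    backends = {"harvard": 0.01, "video": 0.02, "slow": 5.0}
    tasks = {
        name: asyncio.create_task(fake_backend(name, query, delay))
        for name, delay in backends.items()
    }
    done, pending = await asyncio.wait(tasks.values(), timeout=timeout)
    for task in pending:
        task.cancel()  # do not hold the results page for a slow engine
    results = {}
    for name, task in tasks.items():
        if task in done and task.exception() is None:
            results[name] = json.loads(task.result())["hits"]
    return results


results = asyncio.run(broker("leadership", timeout=0.2))
```

In a browser the same idea is usually implemented with Ajax calls that fill in each brokered panel as its response arrives, so the main results never wait on the slowest source.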
Implementation: Brokered Query in Action
Implementation: One-offs
• Software Dev Docs (cmd line)
$ find ./software/docs -name '*html' | xargs grep -i oracle | less
(returns 100s of docs)
• Built web-based search UI
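As a rough idea of what could sit behind such a web UI, the sketch below does the same job as the find/grep pipeline in Python. It is only an illustration: `search_docs` is a hypothetical helper, and unlike `grep` it returns matching files rather than matching lines.

```python
import tempfile
from pathlib import Path


def search_docs(root, term):
    """Return sorted paths under root whose name ends in 'html' and whose text
    contains term, ignoring case."""
    needle = term.lower()
    matches = []
    for path in Path(root).rglob("*html"):
        if path.is_file() and needle in path.read_text(errors="ignore").lower():
            matches.append(path)
    return sorted(matches)


# Demo on a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "a.html").write_text("<p>Oracle tuning notes</p>")
    (docs / "sub").mkdir()
    (docs / "sub" / "b.html").write_text("<p>nothing here</p>")
    hits = [p.name for p in search_docs(docs, "oracle")]
```

Scanning every file on each request is only workable for a small document set; with hundreds of documents per query, an index is the more likely choice.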
Analytics: Tracking Usage of Features
Analytics: Tracking Best Bets
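Both kinds of tracking come down to counting events in the search log. A minimal sketch, assuming each log record holds the query text and whether the user clicked the Best Bet, could compute the top queries and a per-query Best Bet click-through rate. The record format and sample data are hypothetical.

```python
from collections import Counter

# Hypothetical log records: (query, clicked_best_bet).
LOG = [
    ("Baker Library", True),
    ("baker library", False),
    ("baker  library", True),
    ("mba calendar", True),
    ("MBA calendar", True),
    ("parking", False),
]


def _key(query):
    """Lowercase and collapse whitespace so variants count as one query."""
    return " ".join(query.lower().split())


def top_queries(log, n=10):
    """Return the n most frequent normalized queries with their counts."""
    return Counter(_key(q) for q, _ in log).most_common(n)


def best_bet_ctr(log):
    """Return, per normalized query, the fraction of searches with a Best Bet click."""
    shown = Counter()
    clicked = Counter()
    for query, hit in log:
        shown[_key(query)] += 1
        clicked[_key(query)] += int(hit)
    return {q: clicked[q] / shown[q] for q in shown}


top = top_queries(LOG, n=2)
ctr = best_bet_ctr(LOG)
```

A frequent query with a low click-through rate suggests its Best Bet points at the wrong page or is worded in a way users do not recognize.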
Benefits
• Single point of access for various repositories
• Shortcomings of underlying tools overcome
• Better access to content from rest of Harvard
• Traffic boost to e-commerce site
Challenges
• Search is never done
• Complex permissions issues
• SERP design convergence
• SharePoint
Next Steps
• Tackling the mixed-mode situation
• Integration with taxonomies
• Search experience within HBS applications
• Faceted search where rich metadata is available
• Analytics feeding website design and vocabulary development
Conclusion
• Tactical, iterative approach enabled significant progress
• Implementing simpler features/tweaks may have higher impact
• Your existing search engine may have more gas in it than you realize