Network software system laboratory
Rana Shahout & Ibrahim Baransi
Supervisor: Edward Bortnikov
Winter 2011
Real-Time Search Engine
Agenda
• The problem & motivation
• Background in search systems
• The architecture
• CIP policies
• Software design
What?
What is the project goal? Serving fresh search results when the data is constantly changing.
Nowadays many websites, such as Twitter, Facebook, and news sites, change at a high frequency.
Background in search systems
Search caches
Why is that a problem?
Search engines use a cache to answer queries faster and more efficiently. When the data is dynamic, some of the cached information becomes outdated.
Search engines look up each query in the cache first, and search the index only on a cache miss.
Thus, when the data is dynamic, a query can still hit a cache entry that was computed before the data changed, and the search engine returns a stale, incorrect result.
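The staleness problem described above can be reproduced in a few lines. This is only a minimal sketch: `ToyIndex` and `CachingSearcher` are hypothetical names invented for illustration, the index is a plain in-memory list rather than Lucene, and matching is a simple case-insensitive substring test.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real index: every search scans all documents.
class ToyIndex {
    private final List<String> docs = new ArrayList<>();

    void add(String doc) {
        docs.add(doc);
    }

    List<String> search(String query) {
        String needle = query.toLowerCase();
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (doc.toLowerCase().contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}

// Cache-first lookup with no invalidation: the index is consulted only on a miss.
class CachingSearcher {
    private final ToyIndex index;
    private final Map<String, List<String>> cache = new HashMap<>();

    CachingSearcher(ToyIndex index) {
        this.index = index;
    }

    List<String> search(String query) {
        List<String> cached = cache.get(query);
        if (cached != null) {
            return cached;              // cache hit: may be stale
        }
        List<String> fresh = index.search(query);
        cache.put(query, fresh);        // cache miss: remember the result
        return fresh;
    }
}
```

If a document matching a cached query is added to `ToyIndex` after that query was first served, `CachingSearcher.search` keeps returning the old result list, which is exactly the situation the project sets out to fix.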
General picture
Why?
The Architecture
Data structures required for implementation
Index: Lucene index directory. Lucene is a free text-indexing and text-searching API written in Java. A typical Lucene index is stored in a single directory in the file system on a hard disk.
Cache:
It is implemented as a linked list combined with a hash table.
The replacement policy is LRU (least recently used).
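A linked list combined with a hash table is the same structure that Java's standard `LinkedHashMap` provides, so an LRU cache can be sketched on top of it. This is not the project's own cache class; `LruCache` and its `capacity` parameter are names assumed here for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache built on LinkedHashMap (hash table + doubly linked list).
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: get() and put() move the entry to the "most recent" end.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict the least recently used entry when over capacity.
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `b` is the entry that was used least recently.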
CIP: Cache Invalidation Predictors
The CIP consists of two major parts:
The synopsis generator is responsible for preparing synopses of the new documents coming in.
The invalidator interacts with the runtime system and decides which cached entries to invalidate, according to one of two policies.
Invalidation Policies
• Basic: invalidate each query in the cache that appears in the synopsis.
• Score: find all queries in the cache that are contained in the synopsis; for each one compute score(q,d), where d is the added/updated document, and invalidate the top K results.
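The two policies above can be sketched as follows. Several details are assumptions made only for this example: a synopsis is taken to be the set of lower-cased terms of the new document, a query "appears in" the synopsis when all of its terms do, and score(q,d) is passed in as a function because the slides do not define it. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleFunction;

class InvalidationPolicies {
    // Assumption: the synopsis is the set of lower-cased terms of the document.
    static Set<String> synopsis(String document) {
        return new HashSet<>(Arrays.asList(document.toLowerCase().split("\\s+")));
    }

    // Assumption: a query appears in the synopsis when every query term is in it.
    static boolean appearsIn(String query, Set<String> synopsis) {
        return synopsis.containsAll(Arrays.asList(query.toLowerCase().split("\\s+")));
    }

    // Basic policy: every cached query that appears in the synopsis is invalidated.
    static List<String> basic(List<String> cachedQueries, Set<String> synopsis) {
        List<String> toInvalidate = new ArrayList<>();
        for (String query : cachedQueries) {
            if (appearsIn(query, synopsis)) {
                toInvalidate.add(query);
            }
        }
        return toInvalidate;
    }

    // Score policy: among the matching queries, keep only the K highest-scoring ones.
    static List<String> score(List<String> cachedQueries, Set<String> synopsis,
                              ToDoubleFunction<String> scoreAgainstDoc, int k) {
        List<String> candidates = basic(cachedQueries, synopsis);
        candidates.sort(Comparator.comparingDouble(scoreAgainstDoc).reversed());
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }
}
```

The score policy is the more conservative of the two: it never invalidates more than K entries per added document, which keeps more of the cache alive at the cost of possibly leaving some stale entries in place.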
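Illustration
Basic Invalidation
Cache:
Value | Key
President Mubarak, Egypt Mubarak | Mubarak
President Obama, Barak Obama | Obama
Facebook features, Facebook account |
Added document: President Barak Obama meets Mubarak in London
The added document contains the cached keys Obama and Mubarak, so under the basic policy both entries are invalidated; the Facebook entry does not match and stays cached.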
Score Invalidation, K=1
Cache:
Value | Key
President Mubarak, Egypt Mubarak | Mubarak
President Obama, Barak Obama | Obama
Facebook features, Facebook account |
Added document d: President Barak Obama meets Mubarak in London
Score(q,d) | Query
0.56 | President Obama
0.32 | President Mubarak
0.001 | Barak Obama
With K=1, only the highest-scoring entry (President Obama, 0.56) is invalidated.
Software Design – UML Diagrams
Search Query, with miss in cache
Software Design – UML Diagrams
Add a document to index with basic invalidation
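The add-document flow with basic invalidation can be sketched end to end as below. This is an illustration under stated assumptions, not the project's design: `RealTimeSearchSketch` is a hypothetical class, an in-memory list stands in for the Lucene index, an unbounded `HashMap` stands in for the LRU cache, and the synopsis is simply the set of lower-cased document terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RealTimeSearchSketch {
    private final List<String> indexedDocs = new ArrayList<>();       // stand-in for the index
    private final Map<String, List<String>> cache = new HashMap<>();  // query -> cached results

    // Assumption: a document (or query) is reduced to its set of lower-cased terms.
    private static Set<String> terms(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
    }

    // Search: answer from the cache, and go to the index only on a miss.
    List<String> search(String query) {
        return cache.computeIfAbsent(query.toLowerCase(), this::searchIndex);
    }

    // Add a document: update the index, build the synopsis, invalidate matching queries.
    void addDocument(String doc) {
        indexedDocs.add(doc);
        Set<String> synopsis = terms(doc);
        cache.keySet().removeIf(query -> synopsis.containsAll(terms(query)));
    }

    boolean isCached(String query) {
        return cache.containsKey(query.toLowerCase());
    }

    private List<String> searchIndex(String query) {
        Set<String> queryTerms = terms(query);
        List<String> hits = new ArrayList<>();
        for (String doc : indexedDocs) {
            if (terms(doc).containsAll(queryTerms)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

After an invalidation, the next search for the affected query misses the cache, goes to the index, and therefore sees the newly added document.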
Skills
We acquired the following skills in this project:
• Knowledge: reading scientific publications
• Java (& advanced Java topics)
• Working with a web server (Apache)
• Learning Lucene's features and how to use them
• Building a software cache
• UML
• XML parsing
• HTML