Gregory Grefenstette Exalead Exalead S.A. © 2009 Search-Based Applications: the Maturation of Search

Gregory Grefenstette, Exalead

Exalead S.A. © 2009

Search-Based Applications: the Maturation of Search


Maturation of Search


www.exalead.com/search: 8 billion URLs, 2 billion images, 200 million videos; Wikipedia, cloud tags. Also Labs.exalead.com


Two ways to find information

DATABASES VS SEARCH ENGINES


Recent Past

DATABASES
• Structured Data
• Transaction
• Precise
• All tuples
• SQL
• Slow

SEARCH ENGINES
• Text
• Similarity
• Ranking
• Intuitive
• Fast
• Partial


More Recent

DATABASES
• Structured Data
• Transactions
• Precise
• All tuples
• SQL
• Slow
• Top-K
• Column store
• Map Reduce
• Data Cube

SEARCH ENGINES
• Text
• Similarity
• Ranking
• Intuitive
• Fast
• Partial
• Connectors
• Facets
• Map Reduce
• Tables


NOW

DATABASES   SEARCH BASED APPLICATIONS   SEARCH ENGINES


Search based Application

An application that uses a search engine component, but whose final purpose is not to find a document: it is to deliver a domain-oriented process result.

Examples:
• Custom response management
• Logistic tracking and tracing
• Contextual Advertising
• Database reporting after offloading


Current situation

Databases are the backbone of search in information systems

[Diagram: Database, Data Warehouse, Data Mart; Front-office users, BI reports, Business processes]


Search-enabled application

Optimized solution for information access

[Diagram: Database, Data Warehouse, Search Engine; Front-office users, BI reports, Business processes]


Drawbacks of Using Database Search As a Component


[Diagram: Standard Architecture vs Search Based Architecture]


How does a Search Based Application work?


Database converted to Business Items, stored as structured documents

• Business items are concrete objects directly understandable by end-users
  – Product, Customer, Purchase order, Technical support call
• Each business item becomes a document
• The straightforward, simple format of the document index allows performance and ease of use
• The search engine can offer a rich and powerful query language that allows queries as complex and advanced as SQL, despite the flat data model
• The search engine must support
  – typed fields, intra-field scope search, categories/facets
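As a rough illustration of the idea above, the following Python sketch stores each business item as one flat document with typed fields and answers a multi-condition query without any joins. It is not Exalead's API: the `Document` class, the `search` helper, and the purchase-order values are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Document:
    """One business item, flattened into a single searchable record."""
    doc_id: str
    fields: dict = field(default_factory=dict)  # typed values: str, int, date, ...


def search(index, *predicates):
    """Return the documents that satisfy every predicate (a logical AND)."""
    return [doc for doc in index if all(p(doc.fields) for p in predicates)]


# Hypothetical purchase orders, each stored as one flat document.
index = [
    Document("PO-1", {"customer": "ACME Inc", "amount": 1200, "ordered": date(2009, 3, 2)}),
    Document("PO-2", {"customer": "Karl GmbH", "amount": 300, "ordered": date(2009, 4, 9)}),
    Document("PO-3", {"customer": "ACME Inc", "amount": 90, "ordered": date(2009, 5, 1)}),
]

# Roughly: WHERE customer = 'ACME Inc' AND amount >= 100 AND ordered < '2009-04-01'
hits = search(
    index,
    lambda f: f["customer"] == "ACME Inc",
    lambda f: f["amount"] >= 100,
    lambda f: f["ordered"] < date(2009, 4, 1),
)
print([doc.doc_id for doc in hits])
```

Because the fields keep their types, numeric and date comparisons behave as they would in SQL even though the data model is flat. A real engine would evaluate such conditions against an index rather than scanning a list.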


Database into structured documents

Product_ID  Product_Name       Manufacturer_Names
123         control switch     ACME Inc ; The Control Switch Company ; Karl GmbH
124         red warning light  …


Scope Search

Product_ID  Product_Name
123         control switch
124         red warning light

Product_ID  Manufacturer_ID
123         345
123         8574
123         4483

Manufacturer_ID  Manufacturer_NAME
345              ACME Inc.
8574             The Control Switch Company
4483             Karl GmbH

Product_ID  Product_Name       Manufacturer_Names
123         control switch     ACME Inc ; The Control Switch Company ; Karl GmbH
124         red warning light  …

… but the manufacturer names can still be searched as individual records with scope search ("ACME GmbH" does not match the document here)
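The flattening and the scope search described above can be sketched in Python as follows. This is an assumed, simplified model and not the engine's actual implementation: the function names (`build_documents`, `scope_search`, `flat_search`) and the tokenizer are hypothetical. Product 124 has no manufacturer rows in the source tables, so its list is left empty here.

```python
products = {123: "control switch", 124: "red warning light"}
product_manufacturer = [(123, 345), (123, 8574), (123, 4483)]
manufacturers = {345: "ACME Inc.", 8574: "The Control Switch Company", 4483: "Karl GmbH"}


def build_documents():
    """Join the three tables into one document per product."""
    docs = {pid: {"Product_Name": name, "Manufacturer_Names": []}
            for pid, name in products.items()}
    for pid, mid in product_manufacturer:
        docs[pid]["Manufacturer_Names"].append(manufacturers[mid])
    return docs


def tokens(text):
    """Lowercase words with trailing periods removed (a toy tokenizer)."""
    return [t.strip(".").lower() for t in text.split()]


def scope_search(docs, field_name, query):
    """Match only when all query words occur inside the same value of the field."""
    wanted = set(tokens(query))
    return [pid for pid, doc in docs.items()
            if any(wanted <= set(tokens(value)) for value in doc[field_name])]


def flat_search(docs, field_name, query):
    """Naive search over the whole field: words may come from different values."""
    wanted = set(tokens(query))
    return [pid for pid, doc in docs.items()
            if wanted <= {t for value in doc[field_name] for t in tokens(value)}]


docs = build_documents()
print(flat_search(docs, "Manufacturer_Names", "ACME GmbH"))   # words from two manufacturers
print(scope_search(docs, "Manufacturer_Names", "ACME GmbH"))  # no single manufacturer matches
print(scope_search(docs, "Manufacturer_Names", "Karl GmbH"))
```

The contrast between `flat_search` and `scope_search` is the point: once several manufacturer names share one field, only a scope-aware match keeps "ACME" and "GmbH" from being combined across different records.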


Hierarchical categories

Product_ID  Color  Brand  Fragile  Nb of wheels  Wheel type
123         Red    ACME   Y        3             2

Product_ID  Country
123         France
123         UK
123         Germany

Product_ID  Attributes
123         Color/Red ; Brand/ACME ; Fragile/Y ; Nb_wheels/3 ; Wheel_type/2 ; Country/France ; Country/UK ; Country/Germany
124         …

Multiple kinds of attributes can be mixed in the same category field. The hierarchical tree structure of the categories preserves the differences between attribute types.

Multi-valued attributes can also be represented by categories. A single category field can be used to store hundreds or thousands of attribute columns.
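The encoding shown on this slide can be approximated with a short Python sketch. The helper names `to_categories` and `values_of` are hypothetical, and the "Type/Value" path syntax simply mirrors the slide's notation rather than any documented engine format.

```python
def to_categories(single_valued, multi_valued):
    """Encode attribute columns as 'Type/Value' category paths in one field."""
    paths = [f"{attr}/{value}" for attr, value in single_valued.items()]
    for attr, values in multi_valued.items():
        paths.extend(f"{attr}/{value}" for value in values)
    return paths


def values_of(paths, attr):
    """Recover the values of one attribute type from the mixed category field."""
    prefix = attr + "/"
    return [p[len(prefix):] for p in paths if p.startswith(prefix)]


# Product 123 from the tables above: five single-valued columns, one multi-valued.
product_123 = to_categories(
    {"Color": "Red", "Brand": "ACME", "Fragile": "Y", "Nb_wheels": 3, "Wheel_type": 2},
    {"Country": ["France", "UK", "Germany"]},
)
print(" ; ".join(product_123))
print(values_of(product_123, "Country"))
```

The first path segment plays the role of the column name, which is how one field can hold many attribute types, including multi-valued ones such as Country, without losing track of which value belongs to which attribute.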


Multi-dimensional facets

• Search results facets provide aggregate values computed on-the-fly with the search results list
  – One single search query can return the equivalent of dozens of “GROUP BY” SQL clauses
  – Numerical values associated with facets (count, score, …) can be used to perform complex computations on the results list
• Search performance is not affected by the size of the category tree
  – Thousands of attribute types can be represented by categories
  – Facets are dynamically selected by the search results: the displayed attributes are always consistent with the search query (e.g. “color” and “engine type” when searching for a car, “screen size” and “CPU speed” when searching for a laptop)
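A minimal Python sketch of facet computation, under the assumption that each document carries "attribute/value" category paths: one pass over the hits yields a count per value for every attribute present, which is roughly what a separate GROUP BY per attribute would return in SQL. The `search_with_facets` function and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def search_with_facets(docs, keyword):
    """Return matching docs plus per-attribute value counts, computed in one pass."""
    hits = [d for d in docs if keyword in d["text"].lower()]
    facets = defaultdict(Counter)
    for doc in hits:
        for path in doc["categories"]:
            attr, _, value = path.partition("/")
            facets[attr][value] += 1
    return hits, facets


docs = [
    {"text": "Compact car", "categories": ["color/red", "engine_type/diesel"]},
    {"text": "Family car", "categories": ["color/blue", "engine_type/diesel"]},
    {"text": "Light laptop", "categories": ["screen_size/13in", "cpu_speed/2.4GHz"]},
]

hits, facets = search_with_facets(docs, "car")
for attr in sorted(facets):
    print(attr, dict(facets[attr]))
```

Only attributes that occur in the hits appear in `facets`, so a query for "car" surfaces color and engine type while a query for "laptop" would surface screen size and CPU speed. This sketch scans every hit, so it says nothing about how a real engine keeps the cost independent of the category tree size.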


CASE STUDY

LOGISTICS TRACK & TRACE


Gefco overview

• A subsidiary of French car maker PSA (Peugeot, Citroën)
  – Now does most of its business outside of PSA
• Logistics operator
  – Carries cars from factories to dealers (road, rail)
  – Carries freight (parcels; originally spare parts)
  – Supply chain and logistic platform design
• 3.5B€, 10 000 employees, 100 countries


The original pain

• Classical multi-criteria search over Oracle, 2 million rows
• Poor performance despite 2 years of optimization
  – Response times measured in minutes
  – Users were asked to run only simple queries, preferably at certain hours


From forms to a search box


New application with operational reporting


French Post Office


Partner


• Tracing of incidents
• Real-time system
• Used as an internal audit tool for the mail
• Suggestion of addresses for customers
• Search in file numbers, addresses, names, etc.


Case Study: RightMove


Rightmove: Reduce Costs and Improve Performance through Database


Advantages of Search Based Applications


Conclusions

• Search engines mature
  – Structured data, high volume, high speed
• Search based Applications offer
  – Usage: Search interface familiar to user
  – Performance: Search engine geared to search, eases load on database platform
  – Agility: Original database design untouched, reconfiguring output lightweight