Modeling Search Computing Applications
Alessandro Bozzon, Marco Brambilla, Alessandro Campi, Stefano Ceri, Francesco Corcoglioniti, Piero Fraternali, Salvatore Vadacca
ICWE 2010, Vienna
SeCo – Search Computing
Motivating Examples
“Who are the strongest candidates in Europe for competing on software ideas?”
“Who is the best doctor who can cure insomnia in a close-by hospital?”
“Where can I attend an interesting scientific conference in my field and at the same time relax on a beautiful beach nearby?”
This information is available on the Internet, but no software system is capable of computing the answer.
Queries span multiple semantic domains and require combining the rankings of their results.
Their Common Aspect
Multi-domain queries
Individual answers are on the Web
A knowledgeable user would do the query step-by-step:
- Search database conferences, get their city
- Check that the city average temperature is warm enough
- Search low-cost flights via a broker for that city
- Search luxury hotels via another broker
We want a system for supporting this search process:
- Build several “solutions” which already integrate all dimensions
- Rank “solutions” according to a rank function and output results in rank order
- Possibly add dimensions while the search proceeds, or change the relative weight of each search
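The ranking of integrated “solutions” can be sketched as follows. This is a minimal illustration, not the SeCo implementation: it assumes that each partial result carries a score normalized to [0, 1] and that the rank function is a weighted sum. The names `Item` and `rank_solutions` and the sample data are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Item:
    """One result returned by a single domain-specific search service."""
    name: str
    city: str
    score: float  # assumption: normalized to [0, 1]


def rank_solutions(conferences, hotels, weights):
    """Join items on city and order the combinations by weighted score."""
    w_conf, w_hotel = weights
    solutions = [
        (w_conf * c.score + w_hotel * h.score, c.name, h.name)
        for c, h in product(conferences, hotels)
        if c.city == h.city  # join condition between the two domains
    ]
    # Highest combined score first.
    return sorted(solutions, reverse=True)


conferences = [Item("ConfA", "Rome", 0.9), Item("ConfB", "Vienna", 0.7)]
hotels = [Item("HotelX", "Rome", 0.5), Item("HotelY", "Vienna", 0.95)]

ranked = rank_solutions(conferences, hotels, weights=(0.5, 0.5))
```

Changing the weights, for example to `(0.9, 0.1)`, reorders the solutions without re-querying the services, which corresponds to changing the relative weight of each search while the process is under way.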
Search Computing architecture: overall view
[Figure: Search Computing architecture. Components: Front End, Query Analysis, Query To Domain Mapper, Query Planner, Query Engine (operators OP 1, OP 2, ..., OP N), Result Transformation, Domain Framework with Domain Repository, WS-Framework with Service Repository, and the WS World; the components have caches. Data exchanged along the main query flow: High-Level Query, Sub-queries, Low-level queries, Concrete Query Plan, Merged Results, Final User Results. Arrows between components denote the <Uses> relation.]
Example of the main query flow:
- High-level query: “Where can I attend a DB scientific conference close to a beautiful beach reachable with cheap flights?”
- Sub-query 1: “Where can I attend a DB scientific conference?”
- Sub-query 2: “place close to a beautiful beach?”
- Sub-query 3: “place reachable with cheap flight?”
- Low-level query 1: ConfSearch(“DB”, placeX, dateY)
- Low-level query 2: TourSearch(“Beach”, PlaceX)
- Low-level query 3: Flight(“cost<200”, PlaceX, DateY)
- Query plan: service invocations and operator execution
- Results
- Presented results: MSVVEIS’08 - Barcelona - Iberia; LID’08 - Rome - Alitalia; RCIS’08 - Marrakech - AirFrance
Service Registration
Workshop sessions:
• Semantic Resource Framework
• Wrapping Technology and Ontological Annotation
• Search Computing and Research Evaluation
Service Marts:
• Conceptual representation of resources as entities and connections
• Logical representation of signatures
• Physical representation as service implementations
Query Processing
Query Planner includes:
Language for querying services
Models for building (top-k vs top-flow) query plans
Methods for query optimization
Query Engine includes:
Panta Rhei, a query execution model.
Workshop sessions:
• Query Processing
• Rank Join
Front-end Research in the SeCo framework
Liquid Query
Client-side framework for configuration and automatic rendering of query and result interfaces
User interaction primitives that support exploratory search
Workshop sessions:
• Search as a Process
• Visual Interfaces for Complex Search
Model Driven Development Process of SeCo Applications
[Figure: development process organized in four phases, each with the role responsible for it]
- Search Service Development (Service developer): implement search service
- Service Adaptation and Registration (Service publisher): wrap or materialize service; register service mart and interface
- Application Configuration (Expert user): design query template
- Decision: manual optimization needed? (Y/N)
- Query Plan Refinement (SeCo expert): refine query plan
Models involved: Service Mart model, Liquid Query model, Query Plan model
A Model-driven Perspective on Search Computing
MDE approaches applied to search computing:
1. Metamodels describing the objects of interest
   • shared knowledge and vision
   • bases for future tool interoperability
2. Specification of applications through model transformations
   • formalized representation of the intended semantics
   • tool interoperability
3. Definition of a domain specific language (DSL) for query processing
   • simplified definition and visual representation of the query manipulation processes
SeCo: MDE Overview
The SeCo system can be seen as a set of models and model transformations:
- At design time
- At runtime (Query plan execution)
[Figure: SeCo models and transformations. Labels: Query Model, Result Model, Service Mart Model, Query Parameters, Query Plan Model; QueryToPlan transformation driven by Designer Choices; abstraction levels: conceptual (CIM), logical (PIM), physical (PSM); phases: design time, run time.]
SeCo Overview: Models
4 artifact models – Service Mart, Query, Query Parameters, Result
A query plan model, for the runtime query transformation
Service Mart Metamodel
[Figure: UML class diagram of the Service Mart metamodel. Packages: Composition, Marts, Interfaces, Patterns. Association multiplicities are not recoverable from the extracted text.]
Classes and attributes:
- ServiceMart: id: Integer, name: String[0..1], description: String[0..1], semantics: Semantics[0..1], domain: Semantics[0..1]
- Attribute: id: Integer, name: String[0..1], description: String[0..1], semantics: Semantics[0..1]
- ComposedAttribute: averageCardinality: Integer
- AtomicAttribute: type: dataType
- ConnectionPattern: id: Integer, name: String[0..1], description: String[0..1]
- AttributeConstraint: operator: RelationalOperator
- RankingType: id: Integer, name: String[0..1], description: String[0..1], semantics: Semantics[0..1], rankingFunction: Expression[0..1], rankingDirection: SortDirection
- AccessPattern: id: Integer, name: String[0..1], description: String[0..1]
- ServiceInterface: id: Integer, name: String[0..1], description: String[0..1], erspi: Float, cacheable: Boolean, cacheTimeToLive: Milliseconds[0..1], cost: Pricing[0..1], endpointURI: URI
- SearchServiceInterface: decayFun: Expression, chunked: Boolean, chunkSize: Integer[0..1], initTime: Milliseconds, fetchTime: Milliseconds
- ExactServiceInterface: fetchTime: Milliseconds
- ServiceConnection
- AttributeDirection <<enumeration>>: IN, OUT, RANKING
- Attribute, as referenced by an AccessPattern: +dir: AttributeDirection
Service Mart Metamodel
ServiceMart
- A ServiceMart is an abstraction (e.g., Hotel) of one or more Web service implementations (e.g., Bookings and Expedia)
- capable of accepting queries and of returning results
- possibly ranked and chunked into pages
Attribute
- A ServiceMart contains Attributes
- Attributes can be Atomic or Composite
AccessPattern
- An AccessPattern specifies RankingType and AttributeDirection (I/O) for every Attribute of the ServiceMart, thus allowing its actual invocation
ConnectionPattern
- A ConnectionPattern is defined as an input-output relationship between pairs of service marts that can be exploited for joining them
  • e.g., the output city of the Concert can be used as input for the Hotel
ServiceInterface
- physical interface of the service, with details about chunk size, cost, …
- Exact or Search (ranked)
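The relationship between access patterns and connection patterns can be illustrated with a small sketch. This is a simplified, hypothetical rendering of the concepts above in Python, not the actual metamodel: the method `is_usable`, the attribute names, and the choice to treat RANKING attributes as outputs are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class AttributeDirection(Enum):
    IN = "IN"
    OUT = "OUT"
    RANKING = "RANKING"


@dataclass
class AccessPattern:
    """Assigns a direction to every attribute of a service mart."""
    name: str
    directions: dict  # attribute name -> AttributeDirection

    def inputs(self):
        return [a for a, d in self.directions.items()
                if d is AttributeDirection.IN]

    def outputs(self):
        # Assumption: RANKING attributes are returned by the service too.
        return [a for a, d in self.directions.items()
                if d is not AttributeDirection.IN]


@dataclass
class ServiceMart:
    name: str
    attributes: list
    access_patterns: list = field(default_factory=list)


@dataclass
class ConnectionPattern:
    """Pairs an output attribute of one mart with an input of another."""
    source: ServiceMart
    target: ServiceMart
    source_attribute: str
    target_attribute: str

    def is_usable(self, source_ap, target_ap):
        # The join is feasible when the source produces the attribute
        # and the target accepts it as input.
        return (self.source_attribute in source_ap.outputs()
                and self.target_attribute in target_ap.inputs())


concert = ServiceMart("Concert", ["artist", "city", "date"])
hotel = ServiceMart("Hotel", ["city", "name", "stars"])
concert_ap = AccessPattern("byArtist", {
    "artist": AttributeDirection.IN,
    "city": AttributeDirection.OUT,
    "date": AttributeDirection.OUT,
})
hotel_ap = AccessPattern("byCity", {
    "city": AttributeDirection.IN,
    "name": AttributeDirection.OUT,
    "stars": AttributeDirection.RANKING,
})
link = ConnectionPattern(concert, hotel, "city", "city")
usable = link.is_usable(concert_ap, hotel_ap)
```

With these access patterns the Concert-to-Hotel connection is usable, because Concert outputs `city` and Hotel takes it as input; swapping the two access patterns makes the same check fail.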
Query Metamodel
LogicalQuery
- is a conjunctive query over services
- can be defined at an abstract level (AccessPatternLevelQuery) or at physical level (InterfaceLevelQuery)
QueryClause
- a LogicalQuery is composed of a set of QueryClauses
- a QueryClause can refer to the service mart (SM) level or to the service interface (SI) level
- several types of clauses exist
[Figure: UML class diagram of the Query metamodel, split into the packages Logical queries and Service marts. Most association multiplicities are not recoverable from the extracted text.]
Classes and attributes:
- LogicalQuery: id: String
- InterfaceLevelQuery
- MartLevelQuery
- QueryClause: id: String
- RankingClause: direction: SortDirection, weight: Float
- JoinClause
- InvocationClause
- PredicateClause: condition: Expression
- ServiceMart, ServiceInterface, ConnectionPattern, Attribute (from the Service marts package)
Association ends: +clauses, +serviceMart (1), +selectedInterface (0..1), +connectionPattern (1), +source (1), +target (1), +rankedInvocation (1), +rankedAttribute (1)
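A logical query as a set of clauses can be sketched as plain data classes. This is an illustrative reading of the diagram, not the SeCo code: the assignment of association ends to classes (for example, `source` and `target` on JoinClause) is an assumption, references are simplified to string identifiers, and the helper `dangling_joins` is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InvocationClause:
    id: str
    service_mart: str


@dataclass
class JoinClause:
    id: str
    source: str  # id of an InvocationClause
    target: str  # id of an InvocationClause
    connection_pattern: str


@dataclass
class PredicateClause:
    id: str
    condition: str


@dataclass
class RankingClause:
    id: str
    ranked_invocation: str
    ranked_attribute: str
    weight: float
    direction: str = "DESC"


@dataclass
class LogicalQuery:
    id: str
    clauses: list = field(default_factory=list)

    def dangling_joins(self):
        """Return join clauses whose endpoints are not invoked in the query."""
        invoked = {c.id for c in self.clauses
                   if isinstance(c, InvocationClause)}
        return [c for c in self.clauses
                if isinstance(c, JoinClause)
                and not {c.source, c.target} <= invoked]


query = LogicalQuery("q1", [
    InvocationClause("c", "Concert"),
    InvocationClause("h", "Hotel"),
    JoinClause("j1", source="c", target="h",
               connection_pattern="ConcertToHotel"),
    PredicateClause("p1", "h.stars >= 4"),
    RankingClause("r1", ranked_invocation="h",
                  ranked_attribute="stars", weight=0.6),
])
```

Because the query is conjunctive, the clause list needs no explicit boolean structure: every clause must hold, and a check such as `dangling_joins` only has to verify that the clauses refer to one another consistently.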
SeCo Overview: Transformations
1. Vertical transformations for Queries and ServiceMarts
2. QueryToPlan transformation
3. Query Execution transformation (at runtime)
4. Result transformation (at runtime)
Vertical transformations
For moving among different conceptualization levels
For providing recommendations
For transforming information
Examples:
- service mart and query: for moving from the conceptual to the logical to the physical level
- result:
  • for reshaping the data in the result set (exploratory approach implemented by Liquid Query)
  • for enriching the results with personalization and recommendations
Query Execution transformation
Query execution as a transformation:
- model of the query parameters -> model of the query results
Represented as a Query Plan model:
- a well-defined scheduling of service invocations, possibly parallelized, that complies with their service interfaces and exploits the order in which search services return results to rank the combined query results
QueryPlan metamodel + Concrete Syntax = Panta Rhei Language
Query plan metamodel
[Figure: UML class diagram of the Query Plan metamodel, split into the packages Execution plans and Service marts. Association multiplicities are only partly recoverable from the extracted text.]
Classes and attributes:
- ExecutionPlan: id: Integer
- Edge: id: String
- ControlFlow: type: ControlType
- DataFlow
- Node: id: String, monitoredAttributes: String[0..*]
- Input
- Output
- Modifier: filter: Expression
- Selector: filter: Expression
- Sorter: criteria: SortCriteria, blocking: Boolean
- Chunker: chunkSize: Int[0..1], stop: Int[0..1]
- Joiner: predicate: Expression
- PipeController: tilesPerFetch: Int, strategy: PipeStrategy
- ParallelController: tilesPerFetch: Int, strategy: ParallelStrategy
- Service: alias: String
- ExactService
- SearchService
- AttributeBinding: binding: Expression
- Attribute, ServiceInterface (from the Service marts package)
Association ends: +edges, +nodes, +attributeBindings, +source / +outgoingEdges, +target / +ingoingEdges, +attribute (1), +serviceInterface (1)
Transformations: Panta Rhei
Panta Rhei
- describes both the execution flow and the data flow between nodes
- several types of nodes exist
  • service invokers, sorting, join, and chunk operators, clocks (defining the frequency of invocations), caches, and others
The query result model is constructed stepwise, following the execution flow
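The stepwise construction of results along a plan can be sketched with Python generators standing in for plan nodes. This is only an analogy for the node types named above (search service, pipe join, selector, chunker), not the Panta Rhei engine or its syntax; all function names and the sample data are hypothetical.

```python
from itertools import islice


def search_service(ranked_results, chunk_size):
    """Emit an already-ranked result list one chunk at a time."""
    for start in range(0, len(ranked_results), chunk_size):
        yield ranked_results[start:start + chunk_size]


def pipe_join(chunks, exact_service, key):
    """Pipe each upstream tuple into a downstream exact service and merge."""
    for chunk in chunks:
        for left in chunk:
            for right in exact_service(left[key]):
                yield {**left, **right}


def selector(tuples, predicate):
    """Keep only the tuples that satisfy a filter expression."""
    return (t for t in tuples if predicate(t))


def chunker(tuples, stop):
    """Stop the flow once `stop` combined tuples have been produced."""
    return list(islice(tuples, stop))


# Hypothetical data: conferences are listed in ranking order.
conferences = [
    {"conf": "A", "city": "Rome"},
    {"conf": "B", "city": "Oslo"},
    {"conf": "C", "city": "Nice"},
    {"conf": "D", "city": "Rome"},
]
flights = {
    "Rome": [{"airline": "X", "price": 150}],
    "Oslo": [{"airline": "Y", "price": 320}],
    "Nice": [{"airline": "Z", "price": 90}],
}


def flight_service(city):
    return flights.get(city, [])


combined = chunker(
    selector(
        pipe_join(search_service(conferences, chunk_size=2),
                  flight_service, "city"),
        lambda t: t["price"] < 200,
    ),
    stop=2,
)
```

Because every stage is lazy, the flight service is never invoked for conference D: the chunker stops the flow as soon as two combined results are available, and the upstream ranking order is preserved in the output.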
Transformations: Query to Plan (1/2)
1st phase: an ATL helper (functional program) encapsulates the scheduling algorithm of the execution plan.
- The function produces a representation of a partial order of the clauses
- Several very different scheduling algorithms can be used in this phase, and the transformation structure makes it easy to swap in the preferred one, also at runtime
2nd phase: generates the output Panta Rhei query plan. In this phase the following mappings are assumed:
- Invocation clauses become Service invocation nodes
- Join clauses become parallel joins or pipe joins
- The connections between the nodes are generated based on the ordering calculated in the first phase
A Higher Order Transformation (HOT) could be used to automatically modify the logic of the plan, based on domain-specific needs or insights
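The two-phase structure can be sketched in Python (the actual transformation is written in ATL, which is not reproduced here). The sketch assumes that the scheduling is a plain topological sort over the join dependencies and that every join is rendered as a pipe join, so the plan is a linear chain; the names `schedule` and `query_to_plan` are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def schedule(invocations, joins):
    """Phase 1: order invocations so each join source precedes its target."""
    sorter = TopologicalSorter()
    for name in invocations:
        sorter.add(name)
    for source, target in joins:
        sorter.add(target, source)  # target depends on source
    return list(sorter.static_order())


def query_to_plan(invocations, joins, scheduler=schedule):
    """Phase 2: emit plan nodes and edges following the computed order."""
    order = scheduler(invocations, joins)
    nodes = ["Input"] + [f"Service:{name}" for name in order] + ["Output"]
    # Simplification: consecutive nodes are connected as a pipe.
    edges = list(zip(nodes, nodes[1:]))
    return {"nodes": nodes, "edges": edges}


plan = query_to_plan(
    invocations=["Hotel", "Concert", "Restaurant"],
    joins=[("Concert", "Hotel"), ("Hotel", "Restaurant")],
)
```

Passing the scheduling function as the `scheduler` parameter mirrors the point made above: the second phase depends only on the ordering it receives, so a different scheduling algorithm can be substituted without touching the mapping rules.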
User interaction metamodel
Implemented by the Liquid Query paradigm
- See: http://demo.search-computing.org
[Figure: UML class diagram of the user interaction metamodel, split into the packages LiquidQueryType and LiquidQuery Instance. Association multiplicities are only partly recoverable from the extracted text.]
Classes and attributes:
- ConcreteQuery: +ID: Integer, +Name: String
- AvailableOperation: +ID: Integer
- Operation: +ID: Integer, +Name: String
- Parameter: +ID: Integer, +Name: String, +Type: String, +DefaultValue: String
- Filter: +FilterAttribute: String, +Condition: String, +DefaultValue: String
- Expand: +ServiceMartName: String
- Group: +AttributeName: String
- LogicalQuery: +ID: Integer
- LiquidQueryInstance: +ID: Integer, +TimeStamp: Time
- OperationInstance: +TimeStamp: Time
- ParameterInstance: +ID: Integer, +Value: String
- LiquidResultSet: +ID: Integer, +TimeStamp: Time
- LiquidResult: +ID: Integer
Association ends: +QueryParameters, +InstancedQuery, +QueryOperations, +InstancedOperation, +OperationParameters, +QueryResults, +InstancedValues, +QueryParameterInstance, +ResultInstances, +OperationParametersInstance, +QueryImplementation
Model Transformation Challenges
Specification of mappings for data extraction
- Simple interface based on model transformations (MT)
- e.g. using Model Weaving, Transformations by Example
Transformations for building views of the results
- views and viewpoints on models
- i.e. model transformations to filter or change the representation of a given data set
Search process orchestration in light of model transformations
- the Panta Rhei DSL can be seen as a model transformation
- formalization is needed to represent query plans as composition of operations on models
Search on query models
- Search within the domain of the queries themselves
- Ex: most typical queries and their relationship to usage patterns
Experiments and prototypes
Main SeCo concept models in ECORE
Implemented ATL transformation that generates the query plan from query and service mart definitions, using trivial strategies
Future work: implementing different optimization strategies by adopting rule-based optimization (an old concept in the DB field)
Prototypes available online: http://dbgroup.como.polimi.it/brambilla/SeCoMDA
Conclusions
Search Computing as integration of several interacting models
Partition of the design space and of the responsibilities among the different roles and involved expertise, in a non-trivial way
Objective: replace programming with model-driven development wherever possible, yielding flexibility and efficiency
A model transformation approach is a good tool for clarifying the problem and solution space
Probably not viable for actual implementation of the search system, because of performance/scalability issues
Current status of the project and state of the art recorded in the book: Search Computing Challenges and Directions (Springer LNCS, vol. 5950, Ceri-Brambilla eds.)
- Part 1: Visions by Ceri, Baeza-Yates, Weikum
- Part 2: Technology Watch
- Part 3: Issues in Search Computing
Thanks!
Questions?