Upload
sarah-brickett
View
216
Download
3
Tags:
Embed Size (px)
Citation preview
Microsoft Confidential
Many-To-Many dimensions relationships
Yennifer SantosPremier Field EngineerMicrosoft Corporation
We look at the world... with our own eyes ...
And no matter what our perspective is...
We understand that people play the most important role
People interact, communicate and relate each other every day ...
They relate in a one to many relationships ...
On a one to one relationships ...
And in a many-to-many relationships ...
With all the pros and cons ... for each one of them
So for the SSAS …... Cube perspective of the world ......
The way relationships are build…is extremelly important
Specially when we are talking about M2M relationships
Typical scenario
M2M scenario concepts
14
Data measure group
M2M dimension
Intermediate measure group
Intermediate dimension
M2M scenario
What’s up with M2M relationships in SSAS?
16
Query
By M2M Dim
IN MEMORY INNER JOIN
What’s up with M2M relationships in SSAS?
As the data size of the records in the join increasesThe performance of the run-time join in SSAS suffers
17
IN MEMORY INNER JOIN
Query performance with M2M
Query performance in SSAS is intrinsically and linearly tied to the size of the data being joined between
Data andIntermediate measure groups
18
What to do to improve query performance?
There are 3 optimization techniques:Defining aggregationsPartitioningMatrix relationship
19
Defining aggregations optimization technique
What is an aggregation?Pre-calculated summary of data that SSAS uses to enhance query performanceMoves the calculation to the processing phaseSummarizes measures by a combination of dimension attributes
20
1st
Defining aggregations optimization technique
Aggregations Work by reducing the number of records that the storage engine needs to scan from disk in order to satisfy a queryProvides a significant benefit only if the size of the aggregation is significantly smaller than the size of the original table
21
Defining aggregations optimization technique
Where to define aggregations in a M2M scenario?
22
Defining aggregations optimization technique
Data measure group aggregations for M2M queries
Include in the aggregation the granularity attribute of all dimensions that join with the intermediate measure groupDo not include the attribute from the M2M dimension (occurs at query time as part of the run-time join)
23
X
Defining aggregations optimization technique
BenefitsThe benefit is directly related to the size of this aggregation compared to the size of the data measure groupAn aggregation whose size is >=1/3 the size of the facts themselves is not considered to be useful
24
Defining aggregations optimization technique
Intermediate measure group aggregations for M2M queries
Include in the aggregation the granularity attribute of the dimensions in the intermediate measure group that relates to the data measure group The attribute in the dimension that you wish to aggregate
25
Defining aggregations optimization technique
BenefitsThe benefit is directly related to the resulting size of the aggregation compared to the size of intermediate measure group
26
Partitioning optimization technique
PartitioningEnables SSAS to retrieve data from only a portion of data in a measure groupParallelize data retrieval when data must be retrieved from multiple partitions
27
2nd
Partitioning optimization technique
Where to define partitions in a M2M scenario?
28
Partitioning the data measure group
Data measure groupThe data measure group should generally be partitioned by the same dimension attribute members as the intermediate measure group
29
Partitioning the intermediate measure group
Intermediate measure groupYou must relate the partitioning dimensions to the intermediate measure Benefit
Reduces the data size used by the run-time join between intermediate and data measure groupsWhen the M2M query can be resolved from only a few (or one) partitions
30
Partitioning the intermediate measure group
If the M2M query must be resolved by retrieving data from many or all partitions
The technique provides little or no value and may even increase response times
31
Partitioning the intermediate measure group
Design paternThe intermediate measure group must
Be partitioned by one or more common dimensions used in queriesBe related to each dimension that is used for partitioning
The intermediate fact table must Contain the dimension surrogate key
That is used for the measure group partitioningTo relate with the dimension
32
Matrix relationship optimization technique
Increases M2M query performance by reducing the size of the run-time join
Uses a process of compression to eliminate unnecessary repetitiveness in the intermediate fact tableCollapses the size of the intermediate fact table
33
3rd
Matrix relationship optimization technique
Two phases:Compression and creation of the matrix keyImplementing the matrix optimization
34
Compression and creation of the matrix key
Creates a compressed intermediate fact table by taking the following steps:
Identify common dimension member combinations in the intermediate fact table Each set of common dimension member combinations is assigned a surrogate key (matrix key)Repeated combinations are eliminated
35
Compression and creation of the matrix key
36
Implementing the matrix optimization
Requires changes to aspects of the M2M relationship at the following levels:
The relational data warehouse implementation levelThe cube design levelThe ETL implementation level
37
Relational Datawarehouse
38
Relational Datawarehouse
Matrix dimension table creation
39
Relational Datawarehouse
Requires an intermediate fact table that relates the M2M dimensions to the base fact through the matrix key
Will have the matrix key column and one or more dimension key columns
40
Relational Datawarehouse
Add the Matrix Key column to the base fact table
41
Cube design
Works in the same way that any M2M dimension relationshipThe difference is
Shared dimension between the base measure group and the intermediate measure group is the matrix dimension, rather than the fact dimension
42
Cube design
In Data Source View, add two tables for
Matrix dimension New intermediate fact table
43
Cube design
Add a new SSAS database dimension based on the matrix dimension tableAdd this dimension to the cube that contains the base measure group
44
Cube design
Add a new SSAS measure group to act as the intermediate measure group for resolving the M2M relationships Remove the existing
Sales Reasons measure group
45
Cube design
Define the Dimension Usage to associate the dimension tables to the measure groups
Both the base and intermediate measure group must have a direct relationship to the matrix dimension
46
Cube design
The intermediate measure group also needs a direct relationship to the dimension or dimensions that participate in the M2M relationshipThe dimensions referenced in the second point must also be related to the base measure group
47
ETL implementation level
The ETL involvesUpdating the relational data warehouse tables with new matrix relationships Associating the fact tables to the matrix dimensions
48
Matrix relationship
The general guidelines are:Adding new key concatenation strings to the matrix dimension table for new records being added to the fact tableAdding the matrix dimension key associated with each transaction ID to the base fact tableAdding the key of the matrix dimension with each M2M dimension key record
49
Questions and Answers
50