PowerCenter Level1 Unit10

DESCRIPTION

Presentation of PowerCenter Level 1 Labs Unit10

Unit 10. Sorter, Aggregator and Self-Join

Unit 10

Unit Objectives

• Understand why and how to use:
  • Sorter transformations
  • Aggregator transformations
  • Self-Joins
• Use these features in a mapping

Sorter Transformation

Sorts data on one or more ports

Active

Ports
• Input/Output
• Define one or more sort keys
• Define sort order for each key

Example of Usage
• Sort data before Aggregator to improve performance

(Screenshot callouts: Sort Keys, Sort Order)
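The idea of several sort keys, each with its own sort order, can be pictured outside PowerCenter with a minimal Python sketch. The `sort_rows` helper, the port names, and the sample rows are hypothetical and only illustrate the concept; they are not how the Sorter is implemented.

```python
from operator import itemgetter


def sort_rows(rows, sort_keys):
    """Sort rows on several keys; sort_keys is a list of (port, ascending) pairs."""
    result = list(rows)
    # Apply the keys from least to most significant. Python's sort is stable,
    # so the order produced by earlier passes survives as the tie-breaker.
    for port, ascending in reversed(sort_keys):
        result.sort(key=itemgetter(port), reverse=not ascending)
    return result


# Hypothetical sample rows
rows = [
    {"region": "East", "amount": 50},
    {"region": "West", "amount": 75},
    {"region": "East", "amount": 90},
]

# First key: region ascending. Second key: amount descending.
sorted_rows = sort_rows(rows, [("region", True), ("amount", False)])
```

Here `sorted_rows` lists both East rows first (largest amount first), followed by the West row.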

Sorter Transformation Properties

Aggregator Transformation

Performs aggregate calculations

Active

Ports
• Mixed I/O ports allowed
• Variable ports allowed
• Group By allowed

Create expressions in variable and output ports

Usage
• Standard aggregations
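As a rough analogy for a Group By port with aggregate output ports, the following Python sketch collapses many input rows into one output row per group. The `aggregate` helper and the port names are hypothetical; this is a sketch of the concept, not of the Aggregator's internals.

```python
from collections import defaultdict


def aggregate(rows, group_by, value_port):
    """Return one row per group with COUNT, SUM and AVG of value_port."""
    running = defaultdict(lambda: {"count": 0, "sum": 0})
    for row in rows:
        group = running[row[group_by]]
        group["count"] += 1
        group["sum"] += row[value_port]
    return [
        {
            group_by: key,
            "count": g["count"],
            "sum": g["sum"],
            "avg": g["sum"] / g["count"],
        }
        for key, g in running.items()
    ]


# Hypothetical sample rows
rows = [
    {"region": "East", "amount": 50},
    {"region": "West", "amount": 75},
    {"region": "East", "amount": 90},
]
summary = aggregate(rows, "region", "amount")
```

Three rows go in and two come out, which is why the transformation is classed as Active.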

Aggregator Properties

Sorted Input
Indicates input data is presorted by groups. Select this option only if the mapping passes sorted data to the Aggregator transformation.

Aggregator Data Cache Size
Data cache size for the transformation. Default is “Auto”. Data cache contains:
• Non group by input ports used in non-aggregate output expressions
• Non group by input/output ports
• Local variable ports
• Ports containing aggregate functions (multiply by three)

Aggregator Index Cache Size
Index cache size for the transformation. Default is “Auto”. Index cache contains all group by ports.
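One way to picture why Sorted Input helps: when rows arrive already grouped, each group can be finished and emitted as soon as its key changes, so only the current group needs to be held. The Python sketch below illustrates this with `itertools.groupby`; the `aggregate_sorted` helper and port names are hypothetical, and the sketch makes no claim about how PowerCenter manages its caches.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows, group_by, value_port):
    """Yield one total per group. Rows MUST already be sorted on group_by."""
    for key, group in groupby(rows, key=itemgetter(group_by)):
        # Only the rows of the current group are consumed at this point.
        yield {group_by: key, "sum": sum(r[value_port] for r in group)}


# Hypothetical rows, presorted by the group by port
presorted = [
    {"region": "East", "amount": 50},
    {"region": "East", "amount": 90},
    {"region": "West", "amount": 75},
]
totals = list(aggregate_sorted(presorted, "region", "amount"))

# If the input is NOT sorted, the same key shows up as several groups,
# which is why the option should be selected only for sorted data.
unsorted = [presorted[0], presorted[2], presorted[1]]
broken = list(aggregate_sorted(unsorted, "region", "amount"))
```

With the unsorted input, `broken` contains two separate East rows instead of one combined total.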

Aggregate Expressions

Conditional Aggregate expressions are supported

Conditional SUM format: SUM(value, condition)

Aggregate functions
• AVG
• COUNT
• FIRST
• LAST
• MAX
• MEDIAN
• MIN
• PERCENTILE
• STDDEV
• SUM
• VARIANCE
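The conditional form SUM(value, condition) adds up the value only for rows where the condition holds. A minimal Python sketch of that behaviour follows; the `conditional_sum` helper, the port names, and the threshold are hypothetical.

```python
def conditional_sum(rows, value_port, condition=lambda row: True):
    """Sum value_port over the rows for which condition(row) is true."""
    return sum(row[value_port] for row in rows if condition(row))


# Hypothetical sample rows
rows = [
    {"region": "East", "amount": 50},
    {"region": "West", "amount": 75},
    {"region": "East", "amount": 90},
]

# Analogous to SUM(amount): every row counts.
total_all = conditional_sum(rows, "amount")

# Analogous to SUM(amount, amount > 60): only rows passing the condition count.
total_large = conditional_sum(rows, "amount", lambda row: row["amount"] > 60)
```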

Passive and Active Transformations Review

• Passive transformations
  • Same number of rows come out as went in.
  • Examples: Expression and Lookup transformations

• Active transformations
  • Can change the number of rows (i.e., combine or drop rows).
  • Examples: Aggregator, Filter, Joiner transformations
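The distinction can be sketched in Python with two hypothetical functions: one that behaves like a passive Expression (one output row per input row) and one that behaves like an active Filter (rows may be dropped). The names and the filter condition are assumptions made for illustration.

```python
def expression(rows):
    """Passive: adds a derived port, row count is unchanged."""
    return [{**row, "double_amount": row["amount"] * 2} for row in rows]


def filter_rows(rows):
    """Active: may drop rows, so the row count can change."""
    return [row for row in rows if row["amount"] > 60]


# Hypothetical sample rows
rows = [{"amount": 50}, {"amount": 75}, {"amount": 90}]

passive_out = expression(rows)
active_out = filter_rows(rows)
```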

Data Concatenation

• Brings together different pieces of the same record

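• Works only if:
  • Combining branches of the same source pipeline
  • AND neither branch contains an active transformation

(Diagram: DISALLOWED when a branch passes through an Active transformation; ALLOWED when the branches pass only through Passive transformations)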
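The rule can be illustrated with a small Python sketch: two branches derived from the same source stay aligned row for row as long as each branch is passive, so their outputs can be paired back up by position. Once a branch drops rows, the pairing no longer lines up. The field names and branch logic are hypothetical.

```python
# Hypothetical source rows
source = [
    {"first": "ada", "last": "lovelace"},
    {"first": "alan", "last": "turing"},
]

# Two passive branches of the same pipeline: one output row per input row.
branch_a = [row["first"].title() for row in source]
branch_b = [row["last"].upper() for row in source]

# Row counts and order match, so the pieces can be concatenated by position.
concatenated = [{"first": a, "last": b} for a, b in zip(branch_a, branch_b)]

# An active branch (here a filter) changes the row count, so position-based
# pairing would no longer match pieces of the same record.
active_branch = [row["last"].upper() for row in source if row["last"] != "lovelace"]
rows_still_aligned = len(active_branch) == len(branch_a)
```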

Self-Join

• Join Master and Detail records arising from the same source or transformation

• For self-joins between two branches of the same pipeline
  • Must add a transformation between the Source Qualifier and the Joiner in at least one branch of the pipeline
  • Data must be pre-sorted by the join key
  • Configure the Joiner transformation for sorted input

• For self-joins between records from the same source
  • Create two instances of the source and join the pipelines from each source
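A self-join on sorted input can be pictured as a merge of two sorted copies of the same rows. The Python sketch below joins each employee (detail) to the employee who manages them (master). The `sorted_merge_join` helper, the `emp_id`, `name`, and `manager_id` fields, and the sample data are all hypothetical, and the sketch assumes master keys are unique.

```python
from operator import itemgetter


def sorted_merge_join(detail, master, detail_key, master_key):
    """Inner-join two lists that are each pre-sorted on their join key.

    Assumes master keys are unique. Detail rows without a match are dropped.
    """
    joined = []
    m = 0
    for d in detail:
        # Advance the master pointer until it reaches or passes the detail key.
        while m < len(master) and master[m][master_key] < d[detail_key]:
            m += 1
        if m < len(master) and master[m][master_key] == d[detail_key]:
            joined.append({**d, "manager_name": master[m]["name"]})
    return joined


# Hypothetical source rows; manager_id 0 means "no manager"
employees = [
    {"emp_id": 3, "name": "Cho", "manager_id": 1},
    {"emp_id": 1, "name": "Avery", "manager_id": 0},
    {"emp_id": 2, "name": "Blake", "manager_id": 1},
    {"emp_id": 4, "name": "Dana", "manager_id": 2},
]

# Two branches of the same data, each sorted on its side of the join key.
detail = sorted(employees, key=itemgetter("manager_id"))
master = sorted(employees, key=itemgetter("emp_id"))

joined = sorted_merge_join(detail, master, "manager_id", "emp_id")
```

Because both inputs are sorted, the merge walks each list once, which is the reason the join key ordering matters in this sketch.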

Self-Join Example

Unit 10 Lab: Reload Employee Staging Table

Sources: employees_central.txt, employees_east.txt, employees_west.txt (flat files)
Lookup: salaries.txt
Staging Area: STG_EMPLOYEES (table)

Modify a mapping to add dealership manager using:
• Sorter transformation to split the data stream
• Self-Join to concatenate the data streams

Create/run a workflow for this mapping

Lab Review

• What did we accomplish with this lab?

• Questions?

Unit 10 Quiz

1. When would you use:
   a. Sorter transformation?
   b. Aggregator transformation?
   c. Self-Join?

2. What is the difference between Active and Passive transformations?

3. What are the rules on data concatenation?

4. What are the rules on using a Self-Join?