PowerCenter Developer: Tips & Tricks for Mapping Designer

Lingaraju Ramasamy (Raju),

Technical Architecture Manager

Informatica Professional Services

Agenda

• Introduction

• Architecture Best Practices

• Mapping Tips & Tricks

• Transformation Techniques

• Use of Metadata

• Repository maintenance

• Q&A

Introduction

Presenter Contact

• Lingaraju Ramasamy (Raju)

[email protected]

• 408-368-2475 (Mobile)

• Technical Architecture Manager, Informatica Professional Services

Architecture Best Practices

• Consistency: applying consistent standards reduces long-term complications

• Naming Conventions (Velocity)

• Descriptions

• Environments

• Documentation (Hyperlink to SharePoint)

• Modularity: develop according to a modular design

• Common Error Handling

• Reprocessing

• Mapping Assistants

• Reusability: focus on reuse to make quick and universal modifications

• Mapplets, Worklets, Transformations, reusable functions

• Scalability: keep volumes in mind in order to create efficient mappings

• Caching

• Queries

• Partitioning

• Initial vs. Incremental Loads

• Simplicity: multiple simple processes are often better than a few complex processes

• Multiple mappings

• Simple Queries

• Staging Tables

• Advantages

• Easy to develop, debug and maintain

Sample Complex Mapping

[Figure: a single mapping from multiple sources to web services (Multiple Sources -> WebServices), crowded with Source Qualifier (SQ_), Expression (EXP_), Sequence Generator (SEQ_), Joiner (JNR_), Router (RTR_), Java (JAVA_) and other transformations.]

Simplified Complex Mapping

[Figure: the same load split into four simpler mappings: Staging 1, Staging 2, Staging 3, and Staging -> WebServices.]

Mapping Tips & Tricks

Mapping Tips

• Sources and Targets

• Use shortcuts from shared folders

• Extract only what is necessary

• Limit reads on source

• Distinguish between similar sources and targets

• Example

• DIM_CUSTOMER1 = DIM_CUSTOMER_insert

• DIM_CUSTOMER2 = DIM_CUSTOMER_update

Mapping Tricks

Parameters & Variables

• Reduce overhead of creating multiple mappings

• Replace hard-coded values

• Use to incrementally extract data

Example

UpdateDateTime >= TO_DATE('$$PREV_RUN_TS')

SETVARIABLE($$CURR_RUN_TS, SESSSTARTTIME)
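The incremental-extract pattern above can be sketched outside PowerCenter. This is a minimal illustration in Python, not the tool's implementation; the function name `incremental_extract`, the `update_dt` field and the sample timestamps are hypothetical. It assumes the timestamp saved at the end of one run becomes the lower bound of the next run.

```python
from datetime import datetime

def incremental_extract(rows, prev_run_ts, session_start):
    """Return rows changed since the previous run, plus the timestamp to save for the next run."""
    # Same idea as the source filter: UpdateDateTime >= previous run timestamp.
    changed = [r for r in rows if r["update_dt"] >= prev_run_ts]
    # Same idea as SETVARIABLE: remember when this session started.
    return changed, session_start

rows = [
    {"id": 1, "update_dt": datetime(2024, 1, 1, 8, 0)},
    {"id": 2, "update_dt": datetime(2024, 1, 2, 9, 30)},
]
prev_run_ts = datetime(2024, 1, 2, 0, 0)
session_start = datetime(2024, 1, 3, 0, 0)
changed, next_prev_run_ts = incremental_extract(rows, prev_run_ts, session_start)
```

Saving the session start time rather than the end time means rows updated while the session runs are picked up again next time instead of being missed.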

Mapping Tricks

Parameters & Variables

• Assign Parameter/Variable values in a Session

• Pass values from one session to a subsequent session in same workflow/worklet

• On Components Tab in Session Properties

• Use user-defined workflow/worklet variables

• Non-reusable Sessions only

Mapping Tricks

Built-in Mapping Variables

• Mapping Name

• Workflow Name

• Session Name

• Integration Service Name

• Repository Service Name

• Repository User Name

• Folder Name

• Session Run Mode

• Source Table Names

• Target Table Names

Mapping Tricks

Group Expression (Anchor transformation)

• Add an Expression transformation after a Source Qualifier and before a target

• If the source or target definition changes, reconnecting ports is much easier

Mass Update

• pmrep massupdate

• Session properties

• Session config attributes

• Transformation instance attributes

• Session instance run time options

Mapping Assistants

Preview Data

• View Data

• Accommodate anomalies early

• Verification of extraction/loading strategies

• Type of Data

• Source/Targets

• Relational, Flat file

• XML Files

• For further analysis, use Informatica Analyst

• Analyze the content, quality and structure of source data

• Involves separate Profiling warehouse, client and reports

Mapping Assistants

Mapping Wizard

• Pass-Through

• Slowly Changing Dimension

• Type 1 Dimension (No History)

• Type 2 Dimension (All History)

• Version Data

• Flag Current

• Effective Date Range

• Type 3 Dimension (Previous Versions)

Slowly Changing Dimension Template

Mapping Assistants

Mapping Analyst for Excel (MAE)

• Standardize specifications

• Enhance collaboration between analyst and developer

• Improve documentation and auditability of business logic

[Figure: the Data Analyst defines business terms and specifies transformation rules in a standardized Excel format; the DI Developer augments and tunes mappings generated from the specifications. The two directions of exchange are Generate Mapping and Generate Specification.]

Mapping Assistants

Mapping Architect for Visio (MAV)

• Define a consistent methodology and structure for data integration projects

• Build a custom wizard based on a pattern without coding

• Generate multiple mappings at one time

• Document data flow

[Figure: the DI Architect creates and publishes a mapping template (template file and parameter file) using the Informatica toolbar, Informatica stencil and drawing window; the DI Developer generates mappings from the template, then augments and tunes them.]

Mapping Assistants

Mapping Architect for Visio (MAV)

Case Study #1

• 7 templates were used across 2 projects to generate 600 mappings

• 97% of mappings were automatically generated and required no additional changes

• 3% needed to be manually modified or custom developed

Case Study #2

• 1 template was used to create 150 mappings for a data migration project, along with PowerCenter sessions and workflows

• Total effort was less than one day

• Creating the 150 mappings manually would have taken 2 weeks (10x effort)

Transformation Techniques

Transformation Tips

Source Qualifier

• Apply the default query when possible

• Utilize SQ attributes (e.g., User Defined Join, Source Filter)

• Understand the advantages and limitations of the SQL override

PROS

• Utilizes database optimizers (e.g., indexes, hints)

• Can accommodate complex queries

CONS

• Processing impacts database resources

• Transformation logic inside the override is not visible to metadata searches

• Unable to utilize Partitioning or Pushdown Optimization options

• Minimize complicated queries

• Add the SQL Override Query to the Description

Transformation Tips

Expressions

• Understand port processing order: input and input/output ports, then variable ports, then output ports

• Reduce code complexity

• Use local variables to hold redundant calculations and to check the previous record

• Provide comments (-- or //) in expressions

• Optimize expressions

• Numeric operations are faster than string operations

• Operators are faster than functions

• Un-nest complicated logic (use IIF or DECODE)
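The "check previous record" tip relies on a variable port keeping its value from one row to the next. A minimal Python sketch of that idea follows; the names `flag_new_groups` and `v_prev_key` are hypothetical, and the sketch only imitates the behaviour rather than reproducing PowerCenter's expression engine.

```python
_UNSET = object()  # sentinel so that a real key value of None still compares correctly

def flag_new_groups(rows, key):
    """Yield (row, is_new_group) by comparing each row's key with the previous row's key."""
    v_prev_key = _UNSET  # plays the role of a variable port; it persists across rows
    for row in rows:
        # Compare first, while v_prev_key still holds the previous row's value.
        is_new = row[key] != v_prev_key
        # Only then overwrite it, mirroring ports evaluated in order.
        v_prev_key = row[key]
        yield row, is_new

rows = [{"cust": "A"}, {"cust": "A"}, {"cust": "B"}]
flags = [is_new for _, is_new in flag_new_groups(rows, "cust")]
```

The order of the two statements inside the loop is the whole trick: swapping them would compare each row with itself.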

Transformation Tips

User-Defined Functions

• Build complex expressions and reuse them within the repository

• Two types

• Public: callable from any transformation expression

• Private: callable only from another user-defined function

• Can include any valid function except aggregate functions

• Can be exported to XML files

Transformation Tips

Filters/Routers

• Consider a Source Qualifier with a filter to limit rows from relational sources

• Filter as close to the source as possible

• Replace multiple filters with a Router

• In a Router, a row goes to every group whose condition evaluates to TRUE
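The Router behaviour described above (a row is sent to every group whose condition is true) differs from a chain of if/else branches. A small Python sketch, with hypothetical group names and sample rows, illustrates the difference; it also assumes a default group that receives rows matching no condition.

```python
def route(rows, groups):
    """Send each row to every group whose condition is true; unmatched rows go to DEFAULT."""
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                out[name].append(row)  # no break: later groups are still tested
                matched = True
        if not matched:
            out["DEFAULT"].append(row)
    return out

groups = {
    "HIGH_VALUE": lambda r: r["amount"] > 100,
    "DOMESTIC": lambda r: r["country"] == "US",
}
rows = [
    {"id": 1, "amount": 250, "country": "US"},
    {"id": 2, "amount": 50, "country": "DE"},
]
routed = route(rows, groups)
```

Row 1 satisfies both conditions and therefore appears in both groups, so overlapping conditions can duplicate rows downstream.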

Transformation Tips

Aggregators

• Use sorted input to decrease use of aggregate caches

• Limit connected input/output or output ports

• Filter data before aggregating

• Use as early as possible

Joiners

• Perform joins in the Source Qualifier when possible

• Limit Joiner use to heterogeneous and flat file sources

• Perform normal joins when possible

• Join sorted input when possible

• Designate the source with fewer rows as the master source
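Why sorted input reduces aggregate cache use can be shown with a short Python sketch: when rows arrive ordered by the group key, only the current group has to be held in memory. The function name `aggregate_sorted` and the sample data are hypothetical, and the sketch assumes the input really is sorted.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows, key, value):
    """Sum `value` per `key`, assuming rows arrive already sorted by `key`."""
    for k, group in groupby(rows, key=itemgetter(key)):
        # Each group is consumed and released before the next one starts.
        yield k, sum(r[value] for r in group)

rows = [
    {"region": "EAST", "sales": 10},
    {"region": "EAST", "sales": 5},
    {"region": "WEST", "sales": 7},
]
totals = list(aggregate_sorted(rows, "region", "sales"))
```

With unsorted input the same key could reappear later, so every group would have to stay cached until the last row is read.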

Transformation Tips

Lookups

• Using SQL Override in a Lookup

• Similar to the Source Qualifier, avoid when possible

• Can apply parameters and variables

• Can query against multiple tables in the same database

• Suppress the generated ORDER BY clause by appending two dashes (--) to the override

• Add indexes to database columns

• Replace large lookup tables with joins in the Source Qualifier when possible

• Relational Lookups should only return ports that meet the condition

• Remove all ports not used downstream or in the SQL Override

Transformation Tips

Lookup Caches

• Lookup Cache Types

• Persistent caches: save lookup cache files for reuse

• Dynamic caches: retain the latest changes to data as rows pass through the mapping

• Typical uses for dynamic caches: updating a master table, real-time sessions, slowly changing dimensions

• Cache Sizes

• Size caches to eliminate paging

• Condition values are stored in index (.idx) files

• Output values are stored in data (.dat) files

• Apply the Cache Calculator in the session

Transformation Tips

Lookups

• Cache Updates

• Update the dynamic lookup cache with the results of an expression

• Use case: update quantity on hand only for a newer timestamp, by adding the condition that the incoming row timestamp > cached timestamp

• SQL Overrides for Uncached Lookups

• You must choose Use Any Value for the Lookup Policy on Multiple Match property

• Multiple Rows Return

• Use case: aggregate customer orders and store the total value

• Database Deadlock Resilience

• NumOfDeadlockRetries

• DeadlockSleep
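The cache-update use case (only overwrite the cached quantity when the incoming row is newer) can be sketched in a few lines of Python. This is an illustration of the condition only; `apply_to_cache`, the field names and the integer timestamps are hypothetical, and the returned labels are not PowerCenter's own row indicators.

```python
def apply_to_cache(cache, row):
    """Update the cached quantity only when the incoming row is newer than the cached one."""
    cached = cache.get(row["item"])
    if cached is None:
        cache[row["item"]] = {"qty": row["qty"], "ts": row["ts"]}
        return "insert"
    if row["ts"] > cached["ts"]:  # incoming row timestamp > cached timestamp
        cache[row["item"]] = {"qty": row["qty"], "ts": row["ts"]}
        return "update"
    return "no_change"

cache = {}
incoming = [
    {"item": "A", "qty": 10, "ts": 1},
    {"item": "A", "qty": 7, "ts": 3},
    {"item": "A", "qty": 9, "ts": 2},  # arrives late; older than the cached row
]
results = [apply_to_cache(cache, r) for r in incoming]
```

The late-arriving row with the older timestamp is ignored, so out-of-order input cannot overwrite a newer quantity.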

Transformation Tricks

Pipeline Lookup

• Perform a lookup on an application source that is not a relational table or file

• The partial pipeline contains a source and Source Qualifier but no target

• The Integration Service reads the source data and passes it to the Lookup transformation to create the cache

• Create partitions to improve performance

Transformation Tips

Transaction Control Transformation

• A transaction in PowerCenter is a set of rows bound by commit or rollback

• Control commit and rollback of transactions based on a row or set of rows that pass through the transformation

• Use case: each invoice number is committed to the target database as a single transaction

• Change Tracing Level to ‘Terse’

• At higher tracing levels, every flush of the write buffers is logged
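The invoice use case amounts to closing a transaction whenever the invoice number changes. The Python sketch below shows that boundary logic under the assumption that rows arrive grouped by invoice number; `commit_per_invoice` and the field names are hypothetical. In PowerCenter terms this is roughly what a condition returning TC_COMMIT_BEFORE on a changed invoice number would do.

```python
def commit_per_invoice(rows):
    """Group consecutive rows by invoice number; each group stands for one committed transaction."""
    transactions = []
    current = []
    prev_invoice = None
    for row in rows:
        if current and row["invoice_no"] != prev_invoice:
            transactions.append(current)  # commit before the first row of a new invoice
            current = []
        current.append(row)
        prev_invoice = row["invoice_no"]
    if current:
        transactions.append(current)  # final commit at end of data
    return transactions

rows = [
    {"invoice_no": 1001, "line": 1},
    {"invoice_no": 1001, "line": 2},
    {"invoice_no": 1002, "line": 1},
]
txns = commit_per_invoice(rows)
```

If the input were not grouped by invoice number, the same invoice would be split across several transactions, which is why a sort upstream is assumed.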

Transformation Tips

Associated Source Qualifier

• Use ASQ when MQ data is flat file or COBOL

• ASQ is specific to the format of the MQ data

Transformation Tips

Sequence

• For a non-reusable Sequence Generator, the cached-values counter is 0

• Performance suffers if the number of cached values is low

• Increasing the number of cached values improves performance

• This does not involve any database operation

• Caching reserves a block of sequence values in memory
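The effect of the cache size can be illustrated with a generic block-reservation counter in Python. This is a sketch of the general technique, not PowerCenter's internal implementation; `CachedSequence`, `trips_for` and the dict used as the shared store are hypothetical.

```python
class CachedSequence:
    """Hands out sequence values, reserving `cache_size` values per trip to the shared store."""

    def __init__(self, store, cache_size):
        self.store = store          # dict standing in for the persisted current value
        self.cache_size = cache_size
        self.next_value = 0
        self.block_end = 0          # exclusive upper bound of the reserved block
        self.store_trips = 0        # how many times the shared store was touched

    def next(self):
        if self.next_value >= self.block_end:
            # Reserve a new block and persist the new high-water mark.
            self.next_value = self.store["current"]
            self.block_end = self.next_value + self.cache_size
            self.store["current"] = self.block_end
            self.store_trips += 1
        value = self.next_value
        self.next_value += 1
        return value

def trips_for(cache_size, rows=1000):
    """Generate `rows` values and report how often the shared store was accessed."""
    seq = CachedSequence({"current": 1}, cache_size)
    values = [seq.next() for _ in range(rows)]
    return seq.store_trips, values
```

For 1000 rows, a block size of 1 touches the store on every row, while a block size of 100 touches it ten times; the generated values are identical either way.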

File Source and Target Commands

Commands for File Sources

• Use a command to generate the input rows or a file list for a flat file source in a session

• UNIX: any valid UNIX command or shell script

• Windows: any valid DOS command or batch file

• Service process variables (e.g., $PMSourceFileDir) can be used in the command

File Source Command

Generating a File List

• Input Type: Command (default: File)

• Command Type: Command Generating File List

• The command writes a list of file names to stdout

• PowerCenter interprets this output as a file list

File Source Command

Generating Source Data

• Input Type: Command (default: File)

• Command Type: Command Generating Data

• The command writes rows to stdout

• The flat file reader reads directly from stdout

• Removes the need for staging data

• Example use: reading compressed files

• uncompress -c $PMSourceFileDir/myCompressedFile.Z

File Target Command

Processing Target Data

• Output Type: Command (default: File)

• The flat file writer writes to the command

• Writing compressed files

• compress -c - > $PMTargetFileDir/myCompressedFile.Z

• Sorting output data

Filename Port

Source Filename

• The input file name can be processed and passed to the target

Filename Port

Target Filename

• Write records to a dynamically named flat file

Change data detection

Change Detection for Updates

MD5 or CRC32

• Challenge: a record with many columns needs to be checked for changes

• Solution: calculate an MD5 checksum on the columns and use a lookup to compare the value with any existing record

Sample Change data detection

• Calculate MD5 for all columns except key

• Create lookup for primary key and MD5 value

• Perform insert/update, store MD5 value in target
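The three steps above can be sketched in Python with the standard `hashlib` module. This is a minimal illustration under stated assumptions: `row_md5` and `classify` are hypothetical names, the `target` dict stands in for the lookup on primary key and stored MD5, and the unit-separator delimiter is assumed not to occur in the data.

```python
import hashlib

def row_md5(row, key):
    """MD5 over all non-key columns, in a fixed column order."""
    parts = [str(row[c]) for c in sorted(row) if c != key]
    # A delimiter keeps ("ab", "c") from hashing the same as ("a", "bc").
    return hashlib.md5("\x1f".join(parts).encode("utf-8")).hexdigest()

def classify(incoming, target, key):
    """Return 'insert', 'update' or 'unchanged' per primary key.

    `target` maps primary key -> stored MD5 and is updated in place.
    """
    actions = {}
    for row in incoming:
        digest = row_md5(row, key)
        stored = target.get(row[key])
        if stored is None:
            actions[row[key]] = "insert"
        elif stored != digest:
            actions[row[key]] = "update"
        else:
            actions[row[key]] = "unchanged"
        target[row[key]] = digest  # store the MD5 value alongside the row
    return actions

target = {}
first = classify([{"id": 1, "name": "Ann", "city": "Oslo"}], target, "id")
second = classify(
    [{"id": 1, "name": "Ann", "city": "Rome"}, {"id": 2, "name": "Bo", "city": "Oslo"}],
    target,
    "id",
)
third = classify([{"id": 1, "name": "Ann", "city": "Rome"}], target, "id")
```

Comparing one stored checksum replaces a column-by-column comparison, at the cost of computing the hash for every incoming row.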

Use of Metadata

Querying the PowerCenter repository

• Query in designer

• Limit querying on OPB tables

• Use the MX views instead

• Utilize Reporting Service

• Use Meta Query tool

• Use Batch Web Services

Reporting Service Dashboard

Repository Maintenance

Purge repository versions

• Define a version strategy for Dev, QA and Prod

• Archive if required for future analysis

• Purge unwanted versions

• Run the purge at a regular interval: daily, weekly or monthly

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD

pmrep purgeversion -n $VERSIONS_TO_KEEP -o $FILE_NAME

Purge repository logs

• Define a log strategy for Dev, QA and Prod

• Archive if required for future analysis

• Purge unwanted logs

• Run the purge at a regular interval: daily, weekly or monthly

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD

pmrep truncatelog -t $DAYS_TO_KEEP

Compute statistics on metadata tables

pmrep updatestatistics

Additional Informatica Resources

Refer to the following:

• http://mysupport.informatica.com

• http://velocity.informatica.com/

• http://marketplace.informatica.com

• Product manuals

• Informatica Professional Services

Questions?