53

PowerCenter Developer - Informatica

  • Upload
    others

  • View
    30

  • Download
    0

Embed Size (px)

Citation preview

2

PowerCenter Developer: Tips & Tricks for Mapping Designer

Lingaraju Ramasamy (Raju),

Technical Architecture Manager

Informatica Professional Services

3

Agenda

• Introduction

• Architecture Best Practices

• Mapping Tips & Tricks

• Transformation Techniques

• Use of Metadata

• Repository maintenance

• Q&A

4

Introduction

5

Presenter Contact

• Lingaraju Ramasamy (Raju)

[email protected]

• 408-368-2475 (Mobile)

• Technical Architecture Manager, Informatica

Professional Services

6

Architecture Best Practices

7

Architecture Best Practices

• Consistency • Applying consistent standards reduces long term complications

• Naming Conventions (Velocity)

• Descriptions

• Environments

• Documentation (Hyperlink to SharePoint)

• Modularity • Develop according to a modular design

• Common Error Handling

• Reprocessing

• Mapping Assistants

• Reusability • Focus on reuse to make quick and universal modifications

• Mapplets, Worklets, Transformations, reusable functions

8

Architecture Best Practices

• Scalability • Keep volumes in mind in order to create efficient mappings

• Caching

• Queries

• Partitioning

• Initial vs. Incremental Loads

• Simplicity • Multiple simple processes are often better than few complex

processes

• Multiple mappings

• Simple Queries

• Staging Tables

• Advantages

• Easy to develop, debug, maintain and debug

9

Sample Complex Mapping

EXP_SEQ_HEA

DER_HIERARC

HY_RECS11

EXP_SEQ_DET

AIL_HIERARCH

Y_RECS

SQ_SC_T_STR

_ATTR_OUTL_

ORG_WK

SC_T_STR_AT

TR_OUTL_LAN

G_FN (Oracle)

JAVA_GENERA

TE_MSGID

JAVA_GENERA

TE_SESSID

WSC_STR_OU

Tl_SV_OUTL

JNR_OUTL_HR

_HOL

JNR_OUTL_HR

S

UNI_OUTLET

SEQ_HR_HDR

SC_T_STR_AT

TR_OUTL_HRS

_WK1 (Oracle)

SC_T_STR_AT

TR_OUTL_TM_

FN (Oracle)

SC_T_STR_AT

TR_OUTL_ORG

_WK (Oracle)

SC_T_STR_AT

TR_OUTL_HRS

_WK (Oracle)

SC_T_STR_AT

TR_OUTL_HOL

_FN (Oracle)

SC_T_STR_AT

TR_OUTL_FN1

(Oracle)

SEQ_HOL

TC_TRANSACT

ION_RECS

RTR_HDR_DE

T_DATA

EXP_HDR_BOO

KEND

SEQ_PK_FK_O

UTLET

EXP_ACCOUNT

ING_REC_6

EXP_SEQ_STA

TUS_REC_5

EXP_SEQ_IDE

NTIFICATION_

REC_4

EXP_SEQ_DET

AIL_HOLIDAY_

RECS

EXP_HRS_BOO

KEND

EXP_SEQ_REC

_TM_ZONE

SEQ_HRS

EXP_SEQ_DAY

_OF_WEEK

EXP_LANG_BO

OKEND

JNR_OUTL_LA

NG_HOL_HRS

EXP_SEQ_HEA

DER_HOLIDAY

_RECS

EXP_SEQ_HEA

DER_TEAM_M

EMBER_RECS

EXP_SEQ_DET

AIL_TEAM_ME

MBER_RECS

SQ_TEAM_ME

MBER

EXP_TEAM_ME

MBER

SEQ_TEAM_ME

MBER

JNR_TEAM_ME

MBER

JNR_HRS_HDR

EXP_HRS_HDR

SQ_SC_T_STR

_ATTR_OUTL_

FN1

EXP_OUTL_BO

OKEND

SQ_SC_T_STR

_ATTR_OUTL_

HRS_WK

SEQ_ORG

JNR_ORGANIZ

ATION

SEQ_LANG

EXP_HOL_BOO

KEND

EXP_SEQ_SER

VICE_REC_3

EXP_HEADER_

HOURS

EXP_CREATE_

HDR_ELEMEN

TS

EXP_SEQ_REC

_GEO_TYPE

SQ_SC_T_STR

_ATTR_OUTL_

HOL_FN

EXP_SEQ_REC

_BASIC

SEQ_OTHERS

EXP_DETAIL_H

OURS

SRT

SQ_SC_T_STR

_ATTR_OUTL_

LANG_FN

SQ_SC_T_STR

_ATTR_OUTL_

HRS_WK1

EXP_CHK_NE

W_RECS

EXP_OUTL_OR

G_BOOKEND

HOL_LEVEL3 (F

lat File)

SEQ_OUTLET

EXP_PASS_TH

ROUGH

Multiple Sources –> WebServices

10

Simplified Complex Mapping

RTR_HDR_DE

T_DATA

EXP_DETAIL_L

ANG

EXP_CHK_NE

W_RECS

SEQ_OUTLET

LANG_DETAIL_

STR_ATTR_OU

TL_WK (Oracle)

LANG_HEADER

_SC_T_STR_A

TTR_OUTL_W

K (Oracle)

SQ_SC_T_STR

_ATTR_OUTL_

CHG_WK

SC_T_STR_AT

TR_OUTL_CHG

_WK (Oracle)

SC_T_STR_AT

TR_OUTL_LAN

G_FN (Oracle)

SC_LKP_GET_

MSGID

SC_EXP_CREA

TE_HDR_ELEM

ENTS

SRT_INCM_RE

CS

SEQ_LANG

SQ_SC_T_STR

_ATTR_OUTL_

LANG_FN

EXP_HEADER_

LANG

SEQ_OTHERS

EXP_PASS_TH

ROUGH

EXP_OUTL_BO

OKEND

JNR_OUTL_LA

NG_HOL_HRS

EXP_LANG_BO

OKEND

SQ_SC_T_STR

_ATTR_OUTL_

HRS_WK

SC_LKP_GET_

MSGID

SC_T_STR_AT

TR_OUTL_HRS

_WK (Oracle)

SC_EXP_CREA

TE_HDR_ELEM

ENTS

DETAIL_HOUR

S_SC_T_STR_

ATTR_OUTL_

WK1 (Oracle)

HOURS_HDR_

SC_T_STR_AT

TR_OUTL_WK (

Oracle)

EXP_SEQ_DAY

_OF_WEEK

EXP_SRC_BOO

KEND

SEQ_HRS

SQ_SC_T_STR

_ATTR_OUTL_

CHG_WK

SC_T_STR_AT

TR_OUTL_CHG

_WK1 (Oracle)

EXP_HRS_BOO

KEND

RTR_HDR_DE

T_DATA

EXP_CHK_NE

W_RECS

DAY_OF_WEEK

_SC_T_STR_A

TTR_OUTL_W

K2 (Oracle)

SRT_INCM_RE

CS

EXP_DETAIL_H

OURS

EXP_HEADER_

HOURS

JNR_OUTL_HR

S

SEQ_OTHERS

EXP_PASS_TH

ROUGH

SEQ_OUTLET

SC_T_STR_AT

TR_OUTL_CHG

_WK (Oracle)

EXP_CHK_NE

W_RECS

EXP_LANG_BO

OKEND

SC_T_STR_AT

TR_OUTL_LAN

G_FN (Oracle)

SEQ_OUTLET

LANG_DETAIL_

STR_ATTR_OU

TL_WK (Oracle)

RTR_HDR_DE

T_DATA

EXP_DETAIL_L

ANG

SQ_SC_T_STR

_ATTR_OUTL_

CHG_WK

SC_LKP_GET_

MSGID

SC_EXP_CREA

TE_HDR_ELEM

ENTS

SRT_INCM_RE

CS

LANG_HEADER

_SC_T_STR_A

TTR_OUTL_W

K (Oracle)

SEQ_LANG

SQ_SC_T_STR

_ATTR_OUTL_

LANG_FN

EXP_HEADER_

LANG

SEQ_OTHERS

EXP_PASS_TH

ROUGH

EXP_OUTL_BO

OKEND

JNR_OUTL_LA

NG_HOL_HRS

Staging 1 Staging 2

Staging 3 Staging –> WebServices

JAVA_GEN_MS

G_ID

SEQ_ID EXP_GET_SEQ

_NUM

SQ_SC_T_STR

_ATTR_SITE_F

N

WSC_STR_ATT

R_SAVE_SITE

EXP_SRC_BOO

KEND

JAVA_GEN_SE

SSID

SC_T_STR_AT

TR_SITE_FN1 (

Oracle)

SC_T_STR_AT

TR_SITE_FN (O

racle)

11

Mapping Tips & Tricks

12

Mapping Tips

• Sources and Targets

• Use shortcuts from shared folders

• Extract only what is necessary

• Limit reads on source

• Distinguish between similar sources and targets

• Example

• DIM_CUSTOMER1 = DIM_CUSTOMER_insert

• DIM_CUSTOMER2 = DIM_CUSTOMER_update

13

Mapping Tricks

Parameters & Variables • Reduce overhead of creating multiple mappings

• Replace hard coded values

• Use to incrementally extract data

Example

UpdateDateTime >= TO_DATE (‘$$PREV_RUN_TS’)

: SetVariable (‘$$CURR_RUN_TS, SESSSTARTTIME)

14

Mapping Tricks

Parameters & Variables

• Assign Parameter/Variable values in a Session

• Pass values from one session to a subsequent session in same workflow/worklet

• On Components Tab in Session Properties

• Use user-defined workflow/worklet variables

• Non-reusable Sessions only

15

Mapping Tricks

Built-in Mapping Variables

• Mapping Name

• Workflow Name

• Session Name

• Integration Service Name

• Repository Service Name

• Repository User Name

• Folder Name

• Session Run Mode

• Source Table Names

• Target Table Names

16

Mapping Tricks

Group Expression (Anchor transformation)

• Add expression transformation after a source qualifier and

before a target

• If source or target definition changes, reconnecting ports is

much easier

17

Mass Update

• pmrep massupdate

• Session properties

• Session config attributes

• Transformation instance attributes

• Session instance run time options

18

Mapping Assistants

Preview Data

• View Data

• Accommodate anomalies early

• Verification of extraction/loading strategies

• Type of Data

• Source/Targets

• Relational, Flat file

• XML Files

• For further analysis, use Informatica Analyst

• Analyze the content, quality and structure of source data

• Involves separate Profiling warehouse, client and reports

19

Mapping Assistants

Mapping Wizard • Pass-Through

• Slowly Changing Dimension

• Type 1 Dimension (No History)

• Type 2 Dimension (All History)

• Version Data

• Flag Current

• Effective Date Range

• Type 3 Dimension (Previous Versions)

Slowly Changing Dimension Template

20

Mapping Assistants

• Standardize specifications

• Enhance collaboration between analyst and developer

• Improve documentation & audit ability of business logic

Mapping Analyst for Excel (MAE)

Data Analyst

Defines Business Terms

Specifies Transformation Rules

Standardize Excel format

DI Developer

Augments, Tunes

Generated Mappings

from Specifications

Generate Mapping

Generate Specification

21

Mapping Assistants

• Define consistent methodology & structure for data integration projects

• Build custom wizard based on pattern without coding

• Generate multiple mappings at one time

• Document data flow

Mapping Architect for Visio (MAV)

DI Architect

Creates & Publishes

mapping template

DI Developer

Augments, Tunes

Generated Mappings

Generate Mappings

Informatica

Toolbar

Informatica

Stencil Drawing

Window

Template File

Parameter File

Publish Template

22

Mapping Assistants

Mapping Architect for Visio (MAV)

Case Study #1

•7 templates were used across 2 projects to generate 600 mappings

•97% of mappings were automatically generated and required no additional

changes

•3% needed to be manually modified or custom developed

Case Study #2

•1 template was used to create 150 mappings for a data migration project

along with PowerCenter sessions and workflows

•Total effort was less than one day

•Equivalent effort to create 150 mappings manually would have been 2

weeks (10x effort)

23

Transformation Techniques

24

• Apply Default Query when possible • Utilize SQ Attributes

(i.e., User Defined Join, Source Filter)

• Understand advantages and limitations of the SQL override PROS

• Utilize database optimizers (i.e., indexes, hints)

• Can accommodate complex queries

CONS

• Processing impacts database resources

• Lose transformation logic in metadata searched

• Unable to utilize Partitioning or Pushdown Optimization options

• Minimize complicated queries

• Add the SQL Override Query to the Description

Source Qualifier

Transformation Tips

25

• Understand Port process order • INs or IN/OUTs

• VARIABLEs

• OUTPUTs

• Reduce code complexity • Use local variables

• Redundant calculations

• Check previous record

• Provide comments (-- or //) in expressions

• Optimize Expressions • Numeric operations are faster than string operations

• Operators are faster than functions

• Un-Nest complicated logic (use IIF or DECODE)

Expressions

Transformation Tips

26

Transformation Tips

• Build complex expressions and reuse them within repository

• Two Types: • Public: Callable from any transformation

expression

• Private: Only callable from another user-defined function

• Include any valid function except aggregate functions

• Can export to XML Files

User-Defined Functions

27

Transformation Tips

• Consider Source Qualifier with a filter to limit rows within relational sources

• Filter as close to the source as possible

• Replace multiple filters with a router

• Pertaining to routers, rows will go to each path where the criteria is TRUE

Filters/Routers

28

• Use sorted input to decrease use of aggregate caches

• Limit connected input/output or output port

• Filter data before aggregating

• Use as early as possible

Joiners

• Perform joins in Source Qualifier when possible

• Limit use to heterogeneous and flat file sources

• Perform normal joins when possible

• Join sorted input when possible

• Designate the master source as the source with fewer rows

Transformation Tips

Aggregators

29

Transformation Tips

• Using SQL Override in Lookup • Similar to Source Qualifier, avoid when possible

• Can apply Parameters and Variables

• Can query against multiple tables in same database

• Suppress ORDER BY statement by appending two dashes (--)

• Add indexes to database columns

• Replace large lookup tables with joins in the Source Qualifier when possible

• Relational Lookups should only return ports that meet the condition

• Remove all ports not used downstream or in the SQL Override

Lookups

30

Transformation Tips

• Lookup Cache Types

• Persistent Caches

• Save lookup cache files for reuse

• Dynamic Caches

• Retains the latest changes to data as rows pass through the mapping

• Updating a master table

• Real-time sessions

• Slowly changing dimension

• Cache Sizes

• Eliminate Paging

• Stores condition values in index, .idx, files

• Stores output values in data, .dat, files

• Apply the Cache Calculator in Session

Lookup Caches

31

Transformation Tips

• Cache Updates • Update the dynamic lookup cache with

results of an expression Use Case: Update QTY on hand for new timestamp

Add WHERE incoming row timestamp > cached timestamp

• SQL Overrides for Uncached Lookups

• You must choose the Use Any Value on Lookup Policy on Multiple Match condition

• Multiple Rows Return Use Case: Aggregate customer orders and

store the total value

• Database Deadlock Resilience

• NumOfDeadlockRetries

• DeadlockSleep

Lookups

32

Transformation Tricks

• Perform a lookup on an application source that is not a

relational table or file

• Partial pipeline contains Source & Source Qualifier but no target

• Integration Service reads source data and passes to Lookup Transformation to create cache

• Create partitions to improve performance

Pipeline Lookup

33

Transformation Tips

• Transaction in PowerCenter is a set of rows bound by commit or rollback

• Control commit and rollback transactions based on a row or set of rows that pass through the transformation

Use Case: Each invoice number is committed to the target database as a single transaction

• Change Tracing Level to ‘Terse’

• At higher tracing levels, every flush of the write buffers is logged

Transaction Control Transformation

34

Transformation Tips

Associated Source Qualifier

• Use ASQ when MQ data is flat file or COBOL

• ASQ is specific to the format of the MQ data

35

Transformation Tips

• Non-Reusable the counter is 0

• Performance will be affected if cached is low

• Increase of caching will improve the performance

• This doesn’t involve any database operation

• The caching allows to reserve number of rows in the memory

Sequence

36

File Source and Target Commands

37

File Source and Target Commands

Commands for File Sources

• Use a command to generate flat file source data

input rows or file list or a session

• Unix – any valid UNIX command or shell script

• Windows – any valid DOS or batch file.

• Service process variables ($PMSourceFileDir)

can be used in the command.

38

File Source Command

• Input Type: Command (default: file)

• Command Type: Command Generating File List

• Command writes list of file names to stdout

• PowerCenter interprets this as a file list.

Generating a File List

39

File Source Command

• Input Type: Command (default: file)

• Command Type: Command Generating Data

• Command generates rows to stdout

• Flat file reader reads directly from stdout

• Removes need for staging data

• Example use, reading compressed files

• uncompress –c $PMSourceFileDir/myCompressedFile.Z

Generating Source Data

40

File Target Command

• Output Type: Command (default: file)

• Flat file writer writes to the command

• Writing compressed files

• compress -c - > $PMTargetFileDir/myCompressedFile.Z

• Sorting output data

Processing Target Data

41

Filename Port

• Input Filename can be processed and passed

to target

Source Filename

42

Filename Port

• Write records to a dynamically named flat file

Target Filename

43

Change data detection

44

Change Detection for Updates

• Challenge: a record with a lot of columns needs

to be checked for changes

• Solution: calculate an MD5 checksum on the

columns and use a lookup to compare the value

with any existing record

MD5 or CRC32

45

Sample Change data detection

• Calculate MD5 for all columns except key

• Create lookup for primary key and MD5 value

• Perform insert/update, store MD5 value in target

46

Use of Metadata

47

Querying the PowerCenter repository

• Query in designer

• Limit querying on OPB tables

• Use the MX views instead

• Utilize Reporting Service

• Use Meta Query tool

• Use Batch Web Services

48

Reporting Service Dashboard

49

Repository Maintenance

50

Repository Maintenance

Purge repository versions

• Define version strategy for Dev, QA and Prod

• Archieve if required for future analysis

• Purge unwanted versions

• Run the purge in regular interval daily, weekly or monthly

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n

$ADMIN_USER -X INFA_ENCRYPTED_PASSWD

pmrep purgeversion -n $VERSIONS_TO_KEEP -o $FILE_NAME

51

Repository Maintenance

Purge repository logs

• Define log strategy for Dev, QA and Prod

• Archieve if required for future analysis

• Purge unwanted logs

• Run the purge in regular interval daily, weekly or monthly

Compute statistics on metadata tables

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD

pmrep truncatelog -t $DAYS_TO_KEEP

pmrep updatestatistics

52

Additional Informatica Resources

Refer the following...

• http://mysupport.informatica.com

• http://velocity.informatica.com/

• http://marketplace.informatica.com

• Product manuals

• Informatica Professional Services

53

Questions?