PowerCenter Developer - Informatica

Informatica Developer
Lingaraju Ramasamy (Raju), Technical Architecture Manager
Informatica Professional Services
Architecture Best Practices
• Naming Conventions (Velocity)
• Common Error Handling
• Reusability
  • Focus on reuse to make quick and universal modifications
  • Mapplets, Worklets, Transformations, reusable functions
• Scalability
  • Keep volumes in mind in order to create efficient mappings
  • Caching
  • Queries
  • Partitioning
• Simplicity
  • Multiple simple processes are often better than a few complex processes
• Limit reads on the source
• Replace hard-coded values with parameters and variables (see the parameter file sketch below)
• Assign Parameter/Variable values in a Session
  • Pass values from one session to a subsequent session in the same workflow/worklet
  • Set on the Components Tab in Session Properties
  • Use user-defined workflow/worklet variables
• Place an Expression transformation immediately before a target; it makes later changes much easier
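As a sketch, assigning values through a parameter file might look like the following; the folder, workflow, session and parameter names here are hypothetical:

  [DW_FOLDER.WF:wf_load_orders.ST:s_m_load_orders]
  $$SOURCE_SCHEMA=STAGE
  $$LOAD_START_DATE=2011-01-01
  $PMSourceFileDir=/data/inbound

Mapping parameters ($$) replace hard-coded values inside expressions and overrides, while service process variables ($PM...) keep directory paths out of the session definition.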
Data Analyst
• Analyze the content, quality and structure of source data
• Involves a separate Profiling warehouse, client and reports

Slowly Changing Dimension Template
• Version Data
• Flag Current

Mapping Analyst for Excel (MAE)
• Improve documentation & auditability of business logic
• Generate multiple mappings at one time
• Document data flow
Workflow: the DI Architect creates & publishes a mapping template, mappings are generated from it, and the DI Developer augments and tunes the generated mappings.
Case Study #1
• 7 templates were used across 2 projects to generate 600 mappings
• 97% of mappings were automatically generated and required no additional changes

Case Study #2
• 1 template was used to create 150 mappings for a data migration project, along with PowerCenter sessions and workflows
• Total effort was less than one day
• Equivalent effort to create 150 mappings manually would have been 2 weeks (10x the effort)
Transformation Tips

Source Qualifier
• Use Source Qualifier properties where possible (e.g., User Defined Join, Source Filter)
• Understand the advantages and limitations of the SQL override (see the sketch below)
  • Pros: utilizes database optimizers (e.g., indexes, hints); can accommodate complex queries
  • Cons: unable to utilize the Partitioning or Pushdown Optimization options
• Minimize complicated queries
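For illustration, a hedged sketch of a Source Qualifier SQL override; the table, columns, index hint and parameter are hypothetical:

  SELECT /*+ INDEX(O ORDERS_DT_IX) */ O.ORDER_ID, O.CUSTOMER_ID, O.ORDER_DT
  FROM ORDERS O
  WHERE O.ORDER_DT >= TO_DATE('$$LOAD_START_DATE', 'YYYY-MM-DD')

The hint and filter run in the database, but once the override is in place the pipeline can no longer use the Partitioning or Pushdown Optimization options.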
Expressions
• Use VARIABLE ports to hold values that feed multiple OUTPUT ports
• Eliminate redundant calculations
• Optimize Expressions
  • Numeric operations are faster than string operations
  • Operators are faster than functions
  • Un-nest complicated logic (use IIF or DECODE), as in the sketch below
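A minimal sketch of the un-nesting tip; the STATUS port and labels are hypothetical:

  -- nested IIF
  IIF(STATUS = 'A', 'Active',
      IIF(STATUS = 'I', 'Inactive',
          IIF(STATUS = 'P', 'Pending', 'Unknown')))
  -- same logic flattened with DECODE
  DECODE(STATUS, 'A', 'Active',
                 'I', 'Inactive',
                 'P', 'Pending',
                 'Unknown')

DECODE expresses the whole case list in one call and is far easier to read and maintain than deeply nested IIFs.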
User-Defined Functions
• Two types
  • Public: callable from any transformation expression
  • Private: callable only from another user-defined function
• Include any valid function except aggregate functions
• Can export to XML files
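As a hedged sketch, a public user-defined function (the name and logic are hypothetical) that strips spaces from a string could have the expression body:

  -- UDF RemoveSpaces(TEXT as string)
  REPLACECHR(0, TEXT, ' ', NULL)

and would then be callable from any transformation expression as :UDF.RemoveSpaces(CUST_NAME).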
Filters/Routers
• Consider a Source Qualifier with a filter to limit rows within relational sources
• Filter as close to the source as possible
• Replace multiple filters with a router
• Pertaining to routers, rows go to every group whose filter condition evaluates to TRUE (see the sketch below)
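Router group filter conditions are evaluated independently; a hedged sketch with hypothetical port names:

  -- group EAST_ORDERS
  REGION = 'EAST'
  -- group LARGE_ORDERS
  ORDER_AMT > 1000

A row with REGION = 'EAST' and ORDER_AMT = 2000 is routed to both groups, and the router evaluates the row once instead of duplicating it through several Filter transformations.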
Aggregators
• Limit connected input/output or output ports
• Filter data before aggregating

Joiners
• Limit use to heterogeneous and flat-file sources
• Perform normal joins when possible
• Join sorted input when possible
• Designate the master source as the source with fewer rows
Lookups
• Using a SQL Override in a Lookup
  • As with the Source Qualifier, avoid when possible
  • Can apply Parameters and Variables
  • Can query against multiple tables in the same database
  • Suppress the generated ORDER BY statement by appending two dashes (--), as sketched below
• Add indexes to database columns
• Replace large lookup tables with joins in the Source Qualifier when possible
• Relational Lookups should only return rows that meet the lookup condition
• Remove all ports not used downstream or in the SQL Override
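A hedged sketch of a lookup SQL override with hypothetical tables and columns; the trailing two dashes comment out the ORDER BY that the Integration Service appends, so you must supply your own ORDER BY on the condition ports:

  SELECT C.CUST_ID AS CUST_ID, C.CUST_NAME AS CUST_NAME
  FROM CUSTOMER C, CUST_TYPE T
  WHERE C.TYPE_ID = T.TYPE_ID
  ORDER BY C.CUST_ID --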
Lookup Caches
• Dynamic Caches
  • Retain the latest changes to data as rows pass through the mapping
  • Use Case: updating a master table
• Apply the Cache Calculator in the Session to size the caches
Transformation Tricks
• Cache Updates
  • Update the dynamic lookup cache with the results of an expression
  • Use Case: update QTY on hand for a new timestamp; add WHERE incoming row timestamp > cached timestamp
• SQL Overrides for Uncached Lookups
  • You must choose Use Any Value as the Lookup Policy on Multiple Match condition
• Multiple Rows Return
  • Use Case: aggregate customer orders and store the total value
Pipeline Lookup
• Perform a lookup on an application source that is not a relational table or file
• A partial pipeline contains the Source & Source Qualifier but no target
• The Integration Service reads the source data and passes it to the Lookup Transformation to create the cache
• Create partitions to improve performance
Transaction Control Transformation
• A transaction in PowerCenter is a set of rows bound by a commit or rollback
• Control commit and rollback of transactions based on a row or set of rows that pass through the transformation
• Use Case: each invoice number is committed to the target database as a single transaction (see the sketch below)
• Change the Tracing Level to 'Terse': at higher tracing levels, every flush of the write buffers is logged
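A minimal sketch of the transaction control expression for the invoice use case; INVOICE_NO and v_PREV_INVOICE_NO are hypothetical ports, with v_PREV_INVOICE_NO carried over from the prior row by a variable port in an upstream Expression transformation, and rows assumed sorted by invoice number:

  IIF(INVOICE_NO <> v_PREV_INVOICE_NO, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)

TC_COMMIT_BEFORE commits the open transaction before writing the current row, so each invoice lands in the target as one transaction.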
• Use an ASQ when the MQ data is flat file or COBOL
• The ASQ is specific to the format of the MQ data
Sequence Generator
• Performance is affected if the cache size is low; increasing the cache improves performance
• Caching does not involve any database operation
• Caching reserves a block of sequence values in memory
Commands for File Sources
• Use a command to generate source input rows or a file list for a session
• Unix: any valid UNIX command or shell script
• Windows: any valid DOS command or batch file
• Service process variables (e.g., $PMSourceFileDir) can be used in the command

Generating a File List
• Command Type: Command Generating File List
• The command writes a list of file names to stdout
• PowerCenter interprets this output as a file list (see the sketch below)
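For instance, a hedged sketch with a hypothetical file pattern:

  ls -1 $PMSourceFileDir/orders_*.dat

Each file name written to stdout becomes one entry in the file list, exactly as if it had been listed in an indirect source file.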
Generating Source Data
• The flat file reader reads directly from the command's stdout
• Removes the need for staging data
• Example use, reading compressed files:
  uncompress -c $PMSourceFileDir/myCompressedFile.Z
Processing Target Data
• The flat file writer writes rows to the command's stdin
• Example use, writing compressed files:
  compress -c - > $PMTargetFileDir/myCompressedFile.Z
• Sorting output data before it reaches the target (see the sketch below)
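As a sketch of the sorting case, the writer pipes rows to the command's stdin, so a hypothetical target command could be:

  sort -t ',' -k 1,1 > $PMTargetFileDir/orders_sorted.csv

The data is sorted on its way to disk without an intermediate staging file.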
Source Filename
Target Filename
• Problem: a large number of columns need to be checked for changes
• Solution: calculate an MD5 checksum on the columns and use a lookup to compare the value with any existing record (see the sketch below)
  • Create a lookup for the primary key and MD5 value
  • Perform the insert/update and store the MD5 value in the target
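A minimal sketch of the checksum expression with hypothetical ports; the pipe delimiter keeps adjacent values from concatenating into identical strings:

  MD5(TO_CHAR(CUST_ID) || '|' || CUST_NAME || '|' || TO_CHAR(LAST_ORDER_DT))

Compare the result with the MD5 value returned by the lookup: insert when no row is found, update when the checksums differ, and skip the row when they match.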
• Utilize the Reporting Service
• Archive if required for future analysis
• Purge unwanted versions
• Run the purge at a regular interval: daily, weekly or monthly

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
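The purge step itself is not shown above; as an assumption for illustration, pmrep purgeversion can delete checked-in object versions, with the cutoff and folder values here hypothetical (verify the option syntax against your PowerCenter release):

  pmrep purgeversion -d $PURGE_BEFORE_DATE -f $FOLDER_NAME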
• Archive if required for future analysis
• Purge unwanted logs
• Run the purge at a regular interval: daily, weekly or monthly
• Compute statistics on the metadata tables

pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
pmrep truncatelog -t $DAYS_TO_KEEP
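Putting the pieces together, a hedged sketch of a schedulable maintenance script; the path, variable values and cron entry are site-specific assumptions:

  #!/bin/sh
  # connect once; subsequent pmrep commands in this shell reuse the connection
  pmrep connect -r $REPOSITORY_NAME -d $DOMAIN_NAME -n $ADMIN_USER -X INFA_ENCRYPTED_PASSWD
  # truncate workflow and session logs older than the retention window
  pmrep truncatelog -t $DAYS_TO_KEEP

Run it from cron, e.g. 0 2 * * 0 /opt/infa/scripts/repo_maint.sh for a weekly Sunday 2 AM purge, and compute statistics on the metadata tables from the database side afterwards.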