
    Advanced Topics: Informatica


    Topics of Discussion

    Identifying Bottlenecks

    Session Settings

    Some tips from experience


    Identifying Bottlenecks

    Target Bottleneck: writing to a slow target?

    Source Bottleneck: reading from a slow source?

    Mapping Bottleneck: transformation inefficiencies?

    Session Bottleneck: session inefficiencies?

    System Bottleneck: system not optimized?

    Tuning is iterative: identify a bottleneck, fix it, and check performance again. If performance is OK, you are done; if not, find the next bottleneck.


    Target Bottleneck

    Common causes of problems

    Indexes or key constraints

    Database checkpoints

    Small database network packet size

    Too many target instances in the mapping

    Target table is too wide

    Common solutions

    Drop indexes and key constraints before loading, then rebuild them afterwards (see the sketch after this list)

    Use bulk loading wherever practical

    Increase the database network packet size

    Decrease the frequency of database checkpoints

    When using partitions, consider partitioning the target table
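
    A minimal sketch of the drop-and-rebuild pattern, assuming a hypothetical Oracle-style target table SALES_TGT with an index IDX_SALES_CUST:

        -- Before the load: drop the index so inserts skip index maintenance
        DROP INDEX idx_sales_cust;

        -- ... run the load session here, with bulk loading if practical ...

        -- After the load: rebuild the index in a single pass
        CREATE INDEX idx_sales_cust ON sales_tgt (customer_id);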


    Source Bottleneck

    Common causes of problems

    Slow query

    Index issues

    Highly complex queries

    Small database network packet size

    Wide source tables

    Common solutions

    Analyze the query with an explain/show plan tool (see the example below)

    Consider using database optimizer hints when joining several tables

    Consider indexing tables when you have ORDER BY or GROUP BY clauses

    Test a source qualifier conditional filter versus filtering at the database level

    Increase the database network packet size
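
    For example, on Oracle the plan can be inspected and a hint applied as below; the table names and the hash-join hint are illustrative assumptions, not part of the original slides:

        -- Inspect the access path of a slow source query
        EXPLAIN PLAN FOR
        SELECT o.order_id, c.cust_name
        FROM   orders o, customers c
        WHERE  o.cust_id = c.cust_id;

        SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

        -- If the optimizer picks a poor join method, hint a better one
        SELECT /*+ USE_HASH(o c) */ o.order_id, c.cust_name
        FROM   orders o, customers c
        WHERE  o.cust_id = c.cust_id;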


    Mapping Bottleneck

    Common causes of problems

    Too many transformations

    Unused links between ports

    Too many input/output or output ports in an aggregator or rank transformation

    Unnecessary datatype conversions

    Common solutions

    Eliminate transformation errors

    If several mappings read from the same source, try single-pass reading

    Optimize datatypes: use integers for comparisons

    Don't convert back and forth between datatypes

    Optimize lookups and lookup tables, using caches and indexing the lookup tables

    Put filters early in the data flow and use simple filter conditions (see the sketch after this list)
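
    One way to move a filter as early as possible is to push it into the Source Qualifier's SQL override, so unwanted rows never enter the pipeline; this is a sketch with hypothetical table and column names:

        -- Filter applied at the database instead of in a Filter transformation
        SELECT order_id, cust_id, amount
        FROM   orders
        WHERE  order_status = 'OPEN'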


    Mapping Bottleneck

    Common solutions

    For aggregators, use sorted input, group by integer columns, and simplify expressions

    Use reusable sequence generators and increase the number of cached values

    If you use the same logic in different data streams, apply it before the streams branch off

    Optimize expressions:

    Isolate slow and complex expressions

    Reduce or simplify aggregate functions

    Use local variables to encapsulate repeated computations

    Integer computations are faster than character computations

    Use operators rather than the equivalent functions; || is faster than CONCAT() (see the example after this list)
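
    For instance, these two expressions return the same string, but the operator form avoids a function call per row (the port names FIRST_NAME and LAST_NAME are illustrative):

        -- Slower: nested function calls
        CONCAT(CONCAT(FIRST_NAME, ' '), LAST_NAME)

        -- Faster: the concatenation operator
        FIRST_NAME || ' ' || LAST_NAME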


    Session Bottleneck

    Common causes of problems

    Inappropriate memory allocation settings

    Running in series rather than in parallel

    Error tracing override set to a high level

    Common solutions

    Experiment with the DTM buffer pool and buffer block size

    If your mapping allows it, use partitioning

    Run sessions in parallel with concurrent batches whenever possible

    Increase the database commit interval

    Turn off recovery and decimal arithmetic (they're off by default)

    Use the debugger rather than high error tracing, and always reduce your tracing level during production runs

    Don't stage your data if you can avoid it


    System Bottleneck

    Common causes of problems

    Slow network connections

    Overloaded or under-powered servers

    Slow disk performance

    Common solutions

    Get the best machines to run the servers

    Use multiple CPUs and session partitioning

    Make sure the Informatica servers and database servers are located close together on your network

    If possible, consider having the Informatica server and the database server on the same machine


    Identifying Bottlenecks

    Examining session results

    Read/write throughput

    Rows failed

    Number of objects in the mapping

    Type of objects in the mapping

    Examining parallelism and partitioning

    How many objects run in parallel or are partitioned?

    What is the size of the hardware for the source, target, and database?

    What kind of pipeline is set up for sourcing and targeting?


    Identifying Bottlenecks (2)

    Source SQL/Lookup SQL

    GROUP BY / ORDER BY clauses

    DISTINCT clauses

    WHERE clause (filters) using non-indexed fields

    Invalid plans

    Database issues

    Database connection configuration

    Database instance configuration


    Identifying Bottlenecks (3)

    Aggregator Problems

    Too many multi-level aggregates

    Joiner Problems

    Incorrect selection of the master table

    Too many (or too wide) join columns

    Not tuning the data and index caches

    Rank Problems

    Not tuning the data and index caches


    Identifying Bottlenecks (4)

    Source or Target problems

    Too many fields, width (precision) issues

    Implicit data conversions

    Update strategies

    Too many targets per mapping

    No use of the bulk-loader

    Session Problems

    Not enough RAM given to the session

    Commit point too high

    Commit point too low

    Too many sessions running in parallel


    Top 10 Mapping Bottlenecks

    1. Too many targets in a single mapping

    2. Data width is too large (too many columns passing through the mapping)

    3. Too many aggregators, lookups, joiners, ranks in the mapping

    4. Not tuning data/index settings for the above objects

    5. Too many objects in a single mapping

    6. Unused ports in Cached lookups

    7. Source query/joins not tuned

    8. Lookup query/cache not tuned

    9. Ports passed through the mapping but not passed to the target

    10. Huge expressions


    Top 10 Session Mistakes

    1. Not controlling the log file override

    2. Not tuning the data and index caches for lookups, aggregators, ranks and joiners

    3. Not tuning the commit point to match the database performance setup

    4. Assuming that giving the mapping more memory will make it run faster

    5. Assuming that increasing the commit point will make it run faster

    6. Not utilizing the available partitioning

    7. Running too many sessions in parallel on an undersized, over-utilized machine

    8. Not architecting for failed rows

    9. Not setting the line buffer length for flat files

    10. Not testing the session for performance, when targeting a flat file


    Problems with Maps

    Multiple targets, with a single thread used for the write processes

    Multiple aggregators, with a single thread used for moving the data

    Stacked aggregators fight for memory, disk and cache directory in a single session


    Problems with Maps (2)

    Filter condition is too lengthy and not optimized

    An expression that performs only a single calculation still forces the entire row to be processed, when only the one field should flow through

    Disk contention is high: four targets share a single writer thread, so I/O becomes a hotspot


    Expression/Filter Contention

    Slow: the Filter itself evaluates the full condition, e.g. IIF (EmpId = A and .. or ..)

    Fast: an upstream Expression computes B_Rowpass = IIF (EmpId = A and .. or ...), and the Filter condition is simply B_Rowpass

    Expressions are built for evaluation speed

    Filters take a different (slower) code path

    Passing a numeric integer flag to the filter keeps it fast (can increase throughput by between 0.5x and 3x)

    Filter expressions should be as simple as possible to maximize throughput


    Aggregator Contention

    Serial execution: a single map with multiple aggregators

    The aggregators fight for disk I/O (cache directory) and for RAM

    Multiple-pass aggregation of the entire data set

    The session runs only as fast as the slowest aggregator

    All aggregation is done in serial

    Parallel execution: splitting the aggregators across maps

    Each map has its own I/O process and threads

    Data is aggregated only once

    Parallelism is increased

    The amount of RAM per object is increased

    All aggregation is done in parallel


    Update Strategies

    One update strategy per target (Update Target1, Update Target2, Update Target3)

    If each row must be examined, speed will be negatively impacted

    Remove the update strategies through parallel mappings

    Update strategies force each row to be analyzed

    Update strategies don't work against flat files


    Steps to tuning

    Make a copy of the map for each target

    Remove all but one target from each copy

    Work backwards from the target to the source; eliminate unused or unnecessary transformations

    Simplify the mapping

    Move the filters upstream to the source if possible

    Move a large cached lookup into a joiner on the source feed

    Stage the target data if necessary, then use a bulk loader to

    mass insert at high speeds

    Tune the source SQL, and the session parameters

    Tune the DB connection and the RDBMS


    What does Session Partitioning Do?

    It separates the data into physical blocks: it reduces the amount of work that each load process has to do, but increases the number of load processes that have to take place.

    Partitioning is the method of splitting the data; execution of the partitioned loads is the parallelization of the loading process.


    Source Horizontal Partitioning

    The source/target is split into key ranges, e.g. A-L, M-S, T-Z (see the sketch after this list)

    Provides best read ranges if you

    know what data you are after

    Allows for process parallelism

    Breaks source data into

    manageable parts

    Caution: Requires additional

    maintenance

    Faster indexes

    Allows parallel queries to be run
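
    As a minimal sketch of the key ranges above, each partition could read with its own range filter; the CUSTOMERS table and LAST_NAME key are hypothetical:

        -- Partition 1 reads A-L
        SELECT * FROM customers WHERE last_name >= 'A' AND last_name < 'M';
        -- Partition 2 reads M-S
        SELECT * FROM customers WHERE last_name >= 'M' AND last_name < 'T';
        -- Partition 3 reads T-Z
        SELECT * FROM customers WHERE last_name >= 'T';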


    Source Vertical Partitioning

    Provides smaller network packets

    Allows increased parallelism /

    increased parallel reads

    Can provide better management

    over wide tables

    Potentially decreases I/Os on the

    source side

    Assists in the processing

    component of the data movement

    For example, one reader takes columns 0-100 and another takes columns 200-300


    Joiner Object

    The master reads rows first into the data and index caches

    RAM is impacted depending on the rows read

    Joiner fields are read into the data and index (D&I) caches

    No control over the D&I memory sizes

    Contrary to popular belief, the joiner is not slow (compared to a cached lookup)

    The joiner is a powerful object for performance tuning, especially when getting rid of cached lookups


    Joiner Contention

    Data and index cache must be calculated properly

    Master-Detail Join: master should be the smaller table

    Detail Outer Join: Master should be smaller table

    Full Outer Join: both tables should be relatively small in size

    Good for heterogeneous sources

    Rarely necessary when staging table architecture is employed

    Consider the size of shared memory for large scale join

    operations


    Lookup Object

    Initialization caches ALL data identified in the ports of the

    lookup, including the data to be matched on

    RAM is impacted depending on the number of rows read

    The lookup's two flaws: it fights for resources during execution, and its initialization speed depends on the speed of the SQL beneath it.

    WIDE lookups can soak up a lot of time and space, especially with the ORDER BY clause that's generated and appended to the SQL (see the example below).
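
    For illustration, a cached lookup with ports CUST_ID and CUST_NAME on a hypothetical CUSTOMERS table issues SQL along these lines:

        SELECT cust_id, cust_name
        FROM   customers
        ORDER  BY cust_id, cust_name   -- appended automatically; costly on wide lookups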


    Lookup Contention

    Lookups should always be cached when: the rows are narrow and very numerous (primary key only), or any width with a small number of rows (fewer than 100,000). The exception: if the hardware has enough RAM to handle all the concurrent sessions and the cached lookup, then cache it.

    If the lookup is cached, the data and index caches are utilized.

    Uncached lookups should be used when an extremely large number of rows is sourced, a wide table is sourced, or RAM is scarce.

    An uncached lookup generates its own database connection.

    An uncached lookup should always retrieve by primary key.

    Cached or uncached, the connection to the database should use the maximum packet size.


    Aggregator Object

    Initialization caches ALL data identified in the ports of the

    aggregator

    RAM is impacted depending on the number of rows read

    The aggregator absorbs all rows before pushing them to the target

    WIDE aggregates can soak up a lot of space; they can also increase I/Os if the data isn't sorted on the way in


    Rank Object

    A lot like the aggregator: it reads all rows into the data and index caches

    RAM is impacted depending on the number of rows read

    WIDE ranks can soak up space; they can also increase the I/Os if the data isn't sorted on the way in. All rows must be evaluated to get the top or bottom X%.


    Sort Object

    A lot like the aggregator: it reads all rows into the data and index caches

    RAM is impacted depending on the number of rows read

    WIDE rows can soak up space. Depending on what's set up as the sort key, the index can also grow large.


    Router Transformation

    Flow: Expression feeds Filter1 → Target1, Filter2 → Target2, and Filter3 → Target3

    Total of 6 passes for 3 filters; this can double the amount of work needed to pass a row


    Router Transformation

    Flow: Expression feeds a single Router, which routes to Target1, Target2 and Target3

    Compared to 6 passes of the data, there are now 4; this helps improve performance


    Why use Bulk-Loaders?

    Native (Internal) Connectivity

    Build row sets into RAM blocks

    Capable of bypassing logging mechanisms

    Capable of being run in parallel (synchronized with the RDBMS parallel engine)


    Why NOT use Bulk-Loaders?

    Limited to FLAT FILE sourcing

    Provides for inserts only

    Requires rigid input structure

    Usually requires external scheduling resources

    Most loaders cannot be scripted within the database

    Complex logic for inserts can slow Loaders

    tremendously


    Bulk-Loaders Modes

    Loaders run in two modes: fast and slow.

    Fast loads usually bypass referential integrity (R.I.) and indexing; even faster modes append to the target tables (see the sketch below).

    Slow loads run directly through the engine (the same as other applications).

    Loaders will switch modes if certain criteria aren't met before the job starts.
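
    As a rough SQL analogue of "fast mode", Oracle's direct-path append bypasses much of the conventional logging path; the table names here are hypothetical:

        ALTER TABLE sales_tgt NOLOGGING;       -- minimize redo generation for the load

        INSERT /*+ APPEND */ INTO sales_tgt    -- direct-path insert above the high-water mark
        SELECT * FROM sales_stage;

        COMMIT;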


    Important session settings

    Shared Memory Size

    Buffer Block Size

    Line Width Size (If flat file source)

    Data and Index Cache sizes

    Commit Point

    Log Settings

    PowerCenter: Partition Settings


    Session Speeds

    Simple Map

    Simple maps run faster given more memory

    There is a 1.8 GB real-memory limit for all maps

    The performance depends on how many maps are running in parallel

    Eventually, too much parallelism will slow down the entire system

    Even simple maps run only as fast as the source and the target

    Complex Map

    Each thread is linked by memory management: as the block sizes change, the thread speed slides

    Speed tests indicate the best average performance is achieved with 128 KB block sizes and 24 MB of RAM (shared memory)


    Alternatives to a lookup

    Flow A (lookup): Source Qualifier → Expression → Lookup → Expression

    Flow B (joiner): Source Qualifier → Joiner → Expression

    Lookups are to be used for relatively small tables

    A lookup caches all the data identified in its ports

    A joiner caches only the keys on which the join happens


    Alternatives to a lookup

    The original flow uses an unconnected lookup: Source Qualifier → Expression → (unconnected) Lookup → Expression. The lookup is used to get the values based on a key and a code.

    Input rows (key, code, text):

    1  1  Txt1
    1  2  Txt2
    1  3  Txt3

    Result set (key, txt1, txt2, txt3):

    1  Txt1  Txt2  Txt3
    2  txta  txtb  txtc

    The lookup is replaced by the source qualifier, an expression, an aggregator and a joiner. The expression is used to calculate the proper values, and the aggregator is used to keep the record with the accurate information (which in this case is the last record). A SQL equivalent is sketched below.
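
    For comparison, the same pivot can be written as one SQL pass; this sketch assumes a hypothetical CODES table with columns KEY_ID, CODE and TXT:

        -- Collapse (key, code, txt) rows into one row per key;
        -- the aggregate keeps the value that matches each code
        SELECT key_id,
               MAX(CASE WHEN code = 1 THEN txt END) AS txt1,
               MAX(CASE WHEN code = 2 THEN txt END) AS txt2,
               MAX(CASE WHEN code = 3 THEN txt END) AS txt3
        FROM   codes
        GROUP  BY key_id;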


    Group by Clause

    Flow: Source Qualifier → Aggregator

    A GROUP BY clause can be replaced by an aggregator with sorted input (if possible)

    If the data arrives sorted, the aggregator is always faster than the database GROUP BY (see the sketch below)
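
    That is, instead of aggregating in the database, source the detail rows pre-sorted and let a sorted-input Aggregator stream the groups; the ORDERS table here is hypothetical:

        -- Database-side aggregation the Aggregator can replace:
        --   SELECT cust_id, SUM(amount) FROM orders GROUP BY cust_id;

        -- Sorted source feed for the sorted-input Aggregator:
        SELECT cust_id, amount
        FROM   orders
        ORDER  BY cust_id;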


    Deletes

    Original flow: Src1 and Src2 → Source Qualifier → Update Target (Src1)

    Source 2 is the driving table, and Source 1 is the table from which the data is deleted. Normally this process finds the keys in table Source 1 from which the data is to be deleted, based on some conditions, and the data is then deleted accordingly.

    Source 2 contains a few records on a key which forms part of the composite primary key in table 1.

    Tuned flow: Src2 → Source Qualifier → Update Target (Src1)

    Source 2 is the only table that is used. The target instance contains an update override, based on which the deletion happens for the particular key coming from the source table (see the sketch below).
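
    A hedged sketch of such a target update override (table and port names are hypothetical); the override replaces the generated UPDATE with a DELETE driven by the key arriving from Source 2:

        -- Update override on the Src1 target instance
        DELETE FROM src1
        WHERE  batch_key = :TU.BATCH_KEY   -- :TU.<port> references the incoming target port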


    Thank You!