080319 JPDijcks Warehouse Builder Tips and Tricks


  • 8/6/2019 080319 JPDijcks Warehouse Builder Tips and Tricks


    Topics

    File capabilities

    Advanced SQL capabilities

    Multi-configuration

    Match/Merge Capabilities


    Binary file loading

    How do I load a binary file into Oracle?

    Easy, just use Warehouse Builder and its capabilities

    How? Let's see.


    Binary file loading (SQL Loader)

    1. Data file

    2. Sample Definition

    3. Configure Byte Order

    4. Create and Run mapping

    5. Compare Data
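
Step 3 matters because the same bytes decode to different numbers depending on byte order. The sketch below is plain Python, not OWB or SQL*Loader, and the record layout (a 4-byte id followed by a 2-byte quantity) is an assumption made up for illustration.

```python
import struct

# Hypothetical record layout (an assumption, not from the slides):
# a 4-byte unsigned integer id followed by a 2-byte unsigned integer quantity.
RECORD_BIG = struct.Struct(">IH")     # big-endian, no padding
RECORD_LITTLE = struct.Struct("<IH")  # little-endian, no padding


def read_records(data: bytes, layout: struct.Struct):
    """Decode fixed-length binary records using the given layout."""
    return [layout.unpack_from(data, offset)
            for offset in range(0, len(data), layout.size)]


# Two records written in big-endian order.
data = RECORD_BIG.pack(1, 10) + RECORD_BIG.pack(2, 20)

correct = read_records(data, RECORD_BIG)   # matches what was written
wrong = read_records(data, RECORD_LITTLE)  # same bytes, wrong byte order
```

Comparing `correct` with `wrong` is the equivalent of step 5: if the loaded values do not match the source, the byte order setting is the first thing to check.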


    Binary file loading (external table)


    Multi-file loading

    What is it?

    Using built-in SQL Loader functionality to reuse a mapping

    Use loops in Process Flow to facilitate

    Available in OWB 10gR2

    Added the execution time variable to change the file name

    Allow process flows to accept and pass variables and do the looping over the required files
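
The pattern is one reusable load that takes the file name as a run-time variable, plus a loop that calls it once per file. The sketch below shows that shape in plain Python; `load_file` and `load_all` are hypothetical stand-ins for the mapping and the process flow, and the file names are made up.

```python
import csv
import tempfile
from pathlib import Path


def load_file(path: Path, target: list) -> int:
    """Stand-in for the reusable mapping: load one file into the target."""
    with path.open(newline="") as handle:
        rows = list(csv.reader(handle))
    target.extend(rows)
    return len(rows)


def load_all(directory: Path, pattern: str, target: list) -> dict:
    """Stand-in for the process flow loop: run the same load once per file."""
    return {path.name: load_file(path, target)
            for path in sorted(directory.glob(pattern))}


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "sales_01.csv").write_text("1,apple\n2,pear\n")
    (folder / "sales_02.csv").write_text("3,plum\n")
    target_rows = []
    counts = load_all(folder, "sales_*.csv", target_rows)
```

The load logic exists once; only the file name changes between iterations.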


    Multi-file loading

    More information: http://blogs.oracle.com/warehousebuilder/newsItems/viewFullItem$149


    New in OWB 10gR2: Complex Document Support

    OWB 10gR2 release introduced Extended data type and object type support

    XMLTYPE, object types, collections etc.

    Import of complex object models

    ETL operators to support complex types

    Expand, Iterate, Construct

    Any expression (EXTRACT/XMLFOREST etc.)

    Pluggable Mapping Components

    Reusable ETL

    Experts

    Macro-like accelerator framework

    Automation of complex tasks


    OWB and XML Data

    [Diagram labels: XML; Web Service (for example); OWB; Map; XSD; instance of; File; WS; DB; Table]


    Information Extraction: Leverage XDB SQL XML

    Example generates SQL XML extract/extractValue.
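
To make the idea of pulling scalar values and repeating rows out of an XML document concrete, here is a rough analogue in plain Python. It is not the SQL that OWB generates, and the document and its element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are assumptions for this sketch.
DOC = """<order id="42">
  <customer>Smith</customer>
  <line><sku>A1</sku><qty>2</qty></line>
  <line><sku>B7</sku><qty>5</qty></line>
</order>"""

root = ET.fromstring(DOC)

# Scalar lookup: comparable in spirit to extractValue on a single node.
customer = root.findtext("customer")

# Repeated nodes turned into rows: comparable in spirit to iterating
# over an extracted node set.
lines = [(line.findtext("sku"), int(line.findtext("qty")))
         for line in root.findall("line")]
```

One path expression per target column, plus an iteration over the repeating element, is the whole pattern.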


    Encapsulate Common Logic: Pluggable Mappings

    XMLSequence iterate extract


    Generate Patterns: Example Components from XSD


    Generated Component: Example Transaction837

    Operator attributes for Element Attributes

    Operator attributes for child associations


    Advanced SQL Capabilities


    Generate mappings from SQL

    What is it?

    Did you ever wish you could generate a mapping out of those 20 SQL statements you have lying around?

    How is it done?

    We created an expert that allows you to parse SQL and then generate mappings

    Note that this is not a product, but will help you

    Take it and use it to create more and more cool stuff in OWB


    Generate views from mapping

    What is it?

    Turn the mapping editor into a graphical view builder

    Allows you to choose between

    Federation

    Consolidation

    Available in OWB 10gR2

    Get lineage and impact on all your views

    Allows for change propagation of data type changes

    Also consider linking existing views to their source tables for better lineage


    DML error logging

    What is it?

    No more restarting jobs

    No more worrying about that one faulty record that trips your load

    Available in Oracle DB 10gR2 and OWB 10.2.0.3

    Configurable per mapping

    Single or separate error tables

    High performance without the cost of restarting loads
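
The idea is that a bad row goes to an error table while the rest of the load continues. In Oracle this is set-based, through the `LOG ERRORS` clause on the DML statement. The sketch below only imitates the outcome row by row in Python with SQLite, and the table names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
# Hypothetical error table: the rejected values plus the reason.
conn.execute("CREATE TABLE target_errors (id, amount, error_message TEXT)")

# Sample rows: one NULL amount and one duplicate key are deliberately bad.
rows = [(1, 100), (2, None), (3, 300), (1, 999)]

for row in rows:
    try:
        conn.execute("INSERT INTO target (id, amount) VALUES (?, ?)", row)
    except sqlite3.IntegrityError as exc:
        # Log the faulty record and keep loading instead of failing the job.
        conn.execute("INSERT INTO target_errors VALUES (?, ?, ?)",
                     (*row, str(exc)))

loaded = conn.execute(
    "SELECT id, amount FROM target ORDER BY id").fetchall()
rejected = conn.execute(
    "SELECT id, amount FROM target_errors ORDER BY rowid").fetchall()
```

The two good rows land in `target`; the two faulty rows are available in `target_errors` for later inspection, and nothing had to be restarted.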


    Advanced Aggregation

    Oracle introduced new aggregation functions

    CUBE

    ROLLUP

    With 10g Release 2 and 11g of OWB you can leverage these in your ETL environment
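
As a reminder of what the two options compute: ROLLUP produces subtotals along a hierarchy of the grouping columns, while CUBE produces subtotals for every combination of them. The sketch below reproduces those grouping sets in plain Python; the sample rows and column names are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def rollup_sets(columns):
    """ROLLUP(a, b) groups by (a, b), then (a), then ()."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube_sets(columns):
    """CUBE(a, b) groups by every subset of the columns."""
    return [combo for n in range(len(columns), -1, -1)
            for combo in combinations(columns, n)]


def aggregate(rows, grouping_sets, measure):
    """Sum the measure per grouping set; None marks a rolled-up column."""
    all_columns = grouping_sets[0]  # the first set is the full column list
    totals = defaultdict(int)
    for group in grouping_sets:
        for row in rows:
            key = tuple(row[c] if c in group else None for c in all_columns)
            totals[key] += row[measure]
    return dict(totals)


rows = [
    {"region": "EU", "product": "A", "sales": 10},
    {"region": "EU", "product": "B", "sales": 5},
    {"region": "US", "product": "A", "sales": 7},
]
rollup = aggregate(rows, rollup_sets(["region", "product"]), "sales")
cube = aggregate(rows, cube_sets(["region", "product"]), "sales")
```

`cube` contains everything in `rollup` plus the per-product totals across regions, which ROLLUP on (region, product) does not produce.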


    Demonstration


    Multi-Configuration


    Multi-Configuration

    What is it?

    A way to stripe your physical information over multiple environments

    Manage (part of) your dev / test / prod cycles

    Available in OWB 10gR2

    Much simpler to keep track of your environment

    Much simpler to keep the production system in a consistent state

    A little bit hidden from your view


    Multi-Configuration

    Single design repository

    Handle multiple target DB versions

    Transparently optimize code for each version

    No recoding required

    Handle multiple OS target environments

    Handle multiple security settings per target transparently

    [Diagram: a single OWB Design repository feeding Dev, Test, QA, and Production targets; database versions shown are 11g, 10g, and 10g RAC]


    Multi-Configuration

    But also:

    Use direct deployment from dev / test to their targets

    But ensure the settings for qa / prod are set in the development repository to ensure correct settings upon export

    Configurations are ALL exported

    Set the default in the receiving repository to immediately pick up that configuration


    Multi-Configuration

    Things to think about:

    Create the appropriate locations:

    For DEV / TEST / PROD in the repository

    Create the same control centers in the repository

    Assign the right locations to the right control center

    Set security on the locations

    This prevents data viewing on production

    Don't give out the wrong passwords


    Matching and Merging


    Where do I worry about DQ?

    [Diagram: staging data layer, operational data layer, performance data layer; callout: "Handle DQ issues here"]


    Data Quality Firewall

    Cleanse:

    De-duplicate incoming data

    Fix data issues

    Name and address

    String comparisons

    Protect:

    Enforce referential integrity

    Enforce data rules

    Enforce data types and conversions

    Report:

    Data issues

    Quality levels

    Quality trends

    [Diagram labels: Operational data layer; Protect; Cleanse; Report]


    Match/Merge Capabilities

    Source A: Packaged Apps Customer Table

    Source B: In-house App Customer Table

    Source C: Legacy App Customer File

    Expected Result:

    SSN          NAME_L  NAME_M  NAME_F    SAP_CUST_ID  A_CUST_SEQ
    915-21-1234  Smith   Martin  Jonathan  KI17038      4805

    Matching Rules:

    1. SSN: edit-distance match

    2. Name: soundex match

    3. SAP_CUST_ID: partial match

    4. XYZ_CUST_ID: exact match

    5. ABC_CUST_ID: exact match

    Merging Rules:

    1. SSN: most common

    2. NAME_L: most common

    3. NAME_M: longest

    4. NAME_F: most common

    5. SAP_CUST_ID: most common 7 in length

    6. A_CUST_SEQ: same as rec with the SSN

    [Diagram labels: Matching; Merging]


    Three Customer Sources

    Table A: Customers from ERP Application

    Table B: Customers from in-house database application XYZ

    Table C: Customers from legacy application ABC through flat file


    Step 1: Data Standardization

    Table B: Name parsing to separate First/Middle Name

    Table C: Create external table to access flat file

    All: Combine all data into single data stream (union)


    Step 2: Cross Table Matching

    Matching Rules:

    SSN: If SSN not null, not 999-99-9999, use edit distance matching

    SAP ID: If SAP_CUST_ID not null, use partial matching

    ABC ID: If ABC_CUST_ID not null, use exact matching

    Name: If NAME_F and NAME_L not null, use soundex matching

    XYZ ID: If XYZ_CUST_ID not null, use exact matching
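
The matching rules above can be sketched in plain Python to show what each comparison does. This is not the OWB Match-Merge operator. Several things are assumptions: records match when any one rule matches, the SSN threshold is an edit distance of 1, and the SAP_CUST_ID partial match is left out because the slides do not define it.

```python
CODES = {letter: digit
         for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                                "4": "L", "5": "MN", "6": "R"}.items()
         for letter in letters}


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result, previous = letters[0], CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != previous:
            result += code
        if c not in "HW":  # H and W do not separate equal codes
            previous = code
    return (result + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def is_match(a: dict, b: dict, max_ssn_distance: int = 1) -> bool:
    """Assumption: any single rule that applies and matches is enough."""
    ssn_a, ssn_b = a.get("SSN"), b.get("SSN")
    if ssn_a and ssn_b and "999-99-9999" not in (ssn_a, ssn_b):
        if edit_distance(ssn_a, ssn_b) <= max_ssn_distance:
            return True
    for key in ("ABC_CUST_ID", "XYZ_CUST_ID"):
        if a.get(key) and a.get(key) == b.get(key):
            return True
    if all(r.get(k) for r in (a, b) for k in ("NAME_F", "NAME_L")):
        return (soundex(a["NAME_F"]) == soundex(b["NAME_F"])
                and soundex(a["NAME_L"]) == soundex(b["NAME_L"]))
    return False


# Made-up sample records for illustration.
rec_a = {"SSN": "915-21-1234", "NAME_F": "Jonathan", "NAME_L": "Smith"}
rec_b = {"SSN": "915-21-1235", "NAME_F": "J", "NAME_L": "Smith"}
rec_c = {"SSN": "999-99-9999", "NAME_F": "Jonathon", "NAME_L": "Smyth"}
rec_d = {"NAME_F": "Mary", "NAME_L": "Jones"}
```

Here `rec_a` matches `rec_b` on SSN (one character apart) and `rec_c` on name (same Soundex codes, with the placeholder SSN skipped), while `rec_d` matches neither.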


    Step 3: Cross Table Merging

    Merge Rules:

    SSN: Use the most common not-null SSN

    Name_M: Use the longest not-null middle name

    Name_F: Use the longest not-null first name from Table A

    SAP_CUST_ID: Use the most common SAP_CUST_ID with 7 digits

    A_CUST_SEQ: Use the CUST_SEQ from the record with merged SSN
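
The merge rules can be sketched the same way. This is plain Python, not the OWB operator. The `SOURCE` field and the sample records are assumptions for illustration, NAME_L uses the "most common" rule from the earlier merging-rule list, and the SAP_CUST_ID rule is read as "length 7" because the sample ID contains letters.

```python
from collections import Counter


def most_common(values):
    """Most frequent non-empty value, or None if there is none."""
    present = [v for v in values if v]
    return Counter(present).most_common(1)[0][0] if present else None


def longest(values):
    """Longest non-empty value, or None if there is none."""
    present = [v for v in values if v]
    return max(present, key=len) if present else None


def merge(records):
    """Build one merged record from a set of matched records."""
    ssn = most_common(r.get("SSN") for r in records)
    return {
        "SSN": ssn,
        "NAME_L": most_common(r.get("NAME_L") for r in records),
        "NAME_M": longest(r.get("NAME_M") for r in records),
        # First name: longest value, restricted to Table A records.
        "NAME_F": longest(r.get("NAME_F") for r in records
                          if r.get("SOURCE") == "A"),
        # SAP ID: most common value among those of length 7.
        "SAP_CUST_ID": most_common(
            r.get("SAP_CUST_ID") for r in records
            if len(r.get("SAP_CUST_ID") or "") == 7),
        # Sequence: taken from a record carrying the merged SSN.
        "A_CUST_SEQ": next((r["A_CUST_SEQ"] for r in records
                            if r.get("SSN") == ssn and r.get("A_CUST_SEQ")),
                           None),
    }


# Made-up matched records, one per source.
matched = [
    {"SOURCE": "A", "SSN": "915-21-1234", "NAME_L": "Smith", "NAME_M": "M",
     "NAME_F": "Jonathan", "SAP_CUST_ID": "KI17038", "A_CUST_SEQ": "4805"},
    {"SOURCE": "B", "SSN": "915-21-1234", "NAME_L": "Smith",
     "NAME_M": "Martin", "NAME_F": "Jon", "SAP_CUST_ID": None},
    {"SOURCE": "C", "SSN": "915-21-1243", "NAME_L": "Smyth", "NAME_M": None,
     "NAME_F": "J", "SAP_CUST_ID": "KI1703"},
]
merged = merge(matched)
```

With these sample inputs, `merged` has the same shape as the expected result on the Match/Merge Capabilities slide: one record per customer, each attribute chosen by its own rule.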


    Demonstration


    For More Information

    OWB on OTN: http://www.oracle.com/technology/products/warehouse/index.html

    Blog: http://blogs.oracle.com/warehousebuilder/

    Utility Exchange: http://www.oracle.com/technology/products/warehouse/htdocs/OWBexchange.html