
Practical Tips to Improve Data Load Performance and Efficiency


8/8/2019 Practical Tips to Improve Data Load Performance and Efficiency

http://slidepdf.com/reader/full/practical-tips-to-improve-data-load-performance-and-efficiency 1/51


© 2010 Wellesley Information Services. All rights reserved.

Practical Tips to Improve Data Load Performance and Efficiency

Joe Darlak, Comerit


In This Session ...

• Learn how to improve data loading performance by up to 75% by applying proven optimization methods to data modeling; extraction, transformation, and loading (ETL) processes; and process chain design in SAP NetWeaver Business Warehouse.
• Discover how each decision made when architecting a BI system, designing a data model, or developing ETL logic can have a significant impact on data load performance.
• Receive best practices to maximize data load performance while reducing long-term maintenance costs, such as using portable ETL code and eliminating hard-coding in your ETL logic.


In This Session ... Continued

• Find out how to enable version history to track code changes and how to create reusable ETL logic to improve throughput and reduce data load time.
• Get tips on when and how to use customer exits in DataSources and variables to manage risk and reduce maintenance costs.
• Identify the challenges and benefits of semantic partitioning and the importance of efficient data models.
• Take home a checklist to ensure your data models are optimally designed.


What We'll Cover …

• How to leverage the BW architecture
• Data Modeling
• ETL – Extraction
• ETL – Transformation
• ETL – Load (Process Chains)
• Wrap-up


How to Leverage BW ETL Architecture 1

• Implement a Layered Scalable Architecture (LSA):
  - Create multiple data warehouse layers (a DSO layer)
  - Employ the "touch it, take it" principle
  - Enable deltas to subsequent data targets to add stability
  - Eliminate reporting impact due to overlapping requests
• Keep data normalized to reduce redundancy
  - Header ODS and detail ODS
• De-normalize to improve performance
• Limit transformation logic on extract
  - Minimize risk of re-load because of your logic issues
• Lookups on extract reduce timing constraints on loads


How to Leverage BW ETL Architecture 2

• Illustration: Sample Dataflow Diagram


Data Modeling 1: Overview

• Data modeling is still important!!!
  - BWA does not give license to design poorly
  - Data model design still impacts load performance
  - BWA memory is expensive (licensing, H/W and service cost)
• Manage granularity
  - Do not add free text fields to cubes
  - Minimize use of different dates and/or document/line item detail
  - Use Report-to-report interface to provide details when needed
• Think ahead
  - Semantic partitioning
  - Data retention policy (archiving)


Data Modeling 2: Defining Dimensions

• Use as many dimensions as possible
  - Separate common filter characteristics into their own dimension
• Use line-item dimensions for high cardinality characteristics
  - Do not set the high cardinality flag!
• Define related characteristics in the same dimension
  - Calculate expected number of dimensional entries
  - Try not to exceed 10% of expected fact table entries
• Add all relevant time characteristics
  - If 0CALMONTH is lowest granularity, add 0CALMONTH2, 0CALQUARTER, 0CALQUART1, 0HALFYEAR and 0CALYEAR
  - Provides greatest reporting flexibility without need to reload


Data Modeling 3: Semantic Partitioning

• What is it?
  - An architectural design to enable parallel data loading and query execution
  - Partitioning criteria: Year, Region or Actual/Plan


Data Modeling 4: Semantic Partitioning

• Benefits of Semantic Partitioning:
  - Reduction in BWA footprint (when partitioned by year)
  - Parallel data loading (when not partitioned by year)
  - Parallel query execution
    - Best case when partitioning criterion is set as constant
    - Almost as good to create variables to filter on 0INFOPROV
  - Archival of a single InfoCube does not impact others
  - Easier DB maintenance

Performance benefits are so significant … Semantic Partitioning should be deployed on virtually every data model!


Data Modeling 5: Semantic Partitioning

• Example: Semantic partitioning by year

[Diagram labels: DataSource; ALL years, Write-Optimized (No SIDs); two rows of data targets split by year (Current Year - 3, Current Year - 2, Current Year - 1, Current Year, Current Year + 1); History (Summarized); MultiProvider. Ex: Current Year + 1 = 2010, Current Year = 2009, Current Year - 1 = 2008, Current Year - 2 = 2007, Current Year - 3 = 2006]


Data Modeling 6: Data Retention Policy

• Develop and implement a data retention strategy to effectively manage data as it ages
• Use a combination of approaches:
  - Aggregated history cubes
  - Near-line storage
  - Traditional archiving
  - Data deletion

Up-front planning will significantly reduce implementation cost later and allow for a common scalable approach


Extraction 1: Overview

• Focus on R/3 extraction
• SAP delivers over 1,000 pre-developed DataSources
  - Still doesn't cover all SAP extraction requirements
  - In addition, custom tables and customer-enhanced tables need their own extractors or enhancements to delivered ones
• Defining a flexible yet consistent strategy to deal with the many different extraction scenarios you will face is an important up-front task for any BW project and/or architect


Extraction 2: To Enhance Or Not To Enhance?

• Enhance business content (create user exit) if:
  - DataSource is delta-enabled
  - Extraction method is "function module" (i.e., LIS extractors)
  - Extraction method is "View" and required fields do not exist in base tables or check tables (or other joinable tables)
• Create a generic DataSource if:
  - New view could contain all necessary fields
  - Function module can be copied and modified to provide better performance


Extraction 3: Coding Tips – Dynamic Calls

• Code the extractor user exits so that they call a dynamic program per DataSource
  - Isolate the code per DataSource in a self-contained program
  - Minimize risk that a syntax error in code for one DataSource impacts extraction from all other DataSources
• Example
  - Program name = 'YBW' + <DataSource name>
  - Form name = 'DOYBW' + <DataSource name>
• This same technique can be used with customer exit variable code


Extraction 4: User Exit: Program Calls

• Illustration: Sample dynamic program call
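
The screenshot for this slide did not survive extraction. As a minimal sketch of the technique, assuming the enhancement is coded in include ZXRSAU01 of function exit EXIT_SAPLRSAP_001 (where I_DATASOURCE and C_T_DATA are parameters of the exit) and following the YBW / DOYBW naming convention from the previous slide, the dynamic call could look like the fragment below. The variable names l_prog and l_form are illustrative and not taken from the original deck.

```abap
* Fragment for include ZXRSAU01 (function exit EXIT_SAPLRSAP_001).
* I_DATASOURCE and C_T_DATA are supplied by the exit itself.
* l_prog and l_form are illustrative names.
DATA: l_prog TYPE sy-repid,
      l_form TYPE c LENGTH 40.

* Build the names from the convention YBW<DataSource> / DOYBW<DataSource>
CONCATENATE 'YBW'   i_datasource INTO l_prog.
CONCATENATE 'DOYBW' i_datasource INTO l_form.

* Call the DataSource-specific form. IF FOUND skips the call when no
* program or form exists for this DataSource, so DataSources without
* custom logic pass through unchanged.
PERFORM (l_form) IN PROGRAM (l_prog) IF FOUND
        TABLES c_t_data.
```

With this layout the exit include itself rarely changes; each new DataSource enhancement lives in its own program, such as YBWZDS_AGR_USER in the field symbol example.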


Extraction 6: User Exit: Field Symbols

• Illustration: Sample use of field symbols

User Exit (without field-symbols)

REPORT YBWZDS_AGR_USER.
*********************************************************************
* Form called dynamically must start with DOYBW + <DataSource>      *
*********************************************************************
FORM DOYBWZDS_AGR_USER
  TABLES C_T_DATA STRUCTURE ZOXBWD0001.

  data: l_logsys type logsys,
        l_s_data like ZOXBWD0001.

* Read the logical system of the current client once
  select single logsys from t000
    into l_logsys
    where mandt = sy-mandt.

* Each record is copied to a work area and written back with MODIFY
  loop at c_t_data into l_s_data.
    l_s_data-load_dt = sy-datum.
    l_s_data-logsys  = l_logsys.
    modify c_t_data from l_s_data index sy-tabix.
  endloop.

ENDFORM.

User Exit (with field-symbols)

REPORT YBWZDS_AGR_USER.
*********************************************************************
* Form called dynamically must start with DOYBW + <DataSource>      *
*********************************************************************
FORM DOYBWZDS_AGR_USER
  TABLES C_T_DATA STRUCTURE ZOXBWD0001.

  data: l_logsys type logsys.

  field-symbols: <fs> like c_t_data.

* Read the logical system of the current client once
  select single logsys from t000
    into l_logsys
    where mandt = sy-mandt.

* The field symbol points at the table row, so no copy or MODIFY is needed
  loop at c_t_data assigning <fs>.
    <fs>-load_dt = sy-datum.
    <fs>-logsys  = l_logsys.
  endloop.

ENDFORM.


Extraction 7: Generic DataSources

• Improve extract performance by creating delta-enabled generic DataSources
• Simple:
  - By date
  - By timestamp
  - By sequential number (unique table key)
• Complex:
  - Pointers – ABAP techniques can be used to record an array of pointers to identify new and changed records


Extraction 8: Generic DataSources

• Illustration: Delta enabling a generic DataSource


Extraction 9: Architecture Tip

• Need to update an ODS or master data from multiple sources?
  - Rather than enhancing business content, consider using multiple DataSources to load a single BW Object, as long as the ODS or master data key is available to both DataSources
  - Decrease regression testing
  - Mitigate risk of re-loading delta initializations (win-win if delta extractor is an LIS extractor)
  - Perform a single activation, if ODS, or single attribute change run, if master data


Transformation 1: Overview

• Common needs for transforming data:
  - Aggregation
  - Disaggregation (i.e., time-distribution)
  - Conversion
  - Validation
  - Filtering/deletion
  - Creation (result tables)
  - Lookups/merging


Transformation 2: Use 3.x or 7.x Technology?

• Architecture decision:
  - Transfer Rules and Update Rules (3.x)?
  - Or Transformations (7.x)?
• Transfer Rules and Update Rules are stable and proven
  - Easier to track through the system (retain same technical id)
  - Offer better performance
  - Fewer transport bugs
• Transformations are new and improved
  - Appear to be more flexible (perception only?)
  - Visually more appealing
  - The long-term standard?


Transformation 3: Transfer Rules

• Architecture:
  - Only one InfoSource per DataSource
  - Transfer routines are record by record: no consolidation, no results table
  - Changes to number of records in start routine are not reflected in the PSA, so there is difficulty linking error messages from monitor entries back to PSA records
• If communication structure feeds multiple data targets, then this is the logical place for common transformations
• Use the start routine to maintain entire data package at one time (good place to use field symbols)


Transformation 4: Master Data Transfer Rules

• If an InfoObject requires a common transformation across the warehouse, code it in the InfoObject definition
• The transfer routine will now be available in all transfer rules where the InfoObject is used
  - You need to re-activate pre-existing transfer rules for a newly added InfoObject routine to be recognized
• Allows for global conversion and/or validation of master data


Transformation 5: Master Data Transfer Rules

• Illustration: InfoObject Definition with Transfer Routine


Transformation 6: Architecture Tips

• Consider designing Level 1 ODS Objects to contain all possible fields from source (if not LIS DataSource)
  - Minimize maintenance and downtime later to add fields and populate in live environment
  - ODS Level 1 objects can then become the source for lookups from other updates, thereby reducing redundant reads of source tables in R/3
• Master data to multiple targets? Use flexible update rules
  - Default communication structures for InfoObjects are the attribute tables; here you can define custom ones and use update rules from them to multiple data targets


Transformation 7: Lookups

• Do not use single selects for lookups!
• For better performance:
  - Use start routines to read lookup data to an internal table
  - Read internal table to populate field values in routines
• For best performance:
  - Add lookup fields to InfoSource
  - Use start routine and field symbols to populate blank fields for entire data package at one time (see illustration for DataSource user exit above)
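
The "best performance" variant can be sketched as a small stand-alone report. This is an illustration under stated assumptions, not the author's code: the structure ty_data is a hypothetical stand-in for a data package that already carries a blank lookup field, the table and field names are invented for the example, and table USR02 is used as the lookup source only because the later slides in this deck use it too.

```abap
REPORT zbw_lookup_sketch.

* Hypothetical stand-in for a data package: user name plus a blank
* lookup field (user group) that was added to the InfoSource.
TYPES: BEGIN OF ty_data,
         username  TYPE usr02-bname,
         usergroup TYPE usr02-class,
       END OF ty_data,
       BEGIN OF ty_user,
         bname TYPE usr02-bname,
         class TYPE usr02-class,
       END OF ty_user.

DATA: gt_data TYPE STANDARD TABLE OF ty_data,
      gt_user TYPE SORTED TABLE OF ty_user WITH UNIQUE KEY bname.

FIELD-SYMBOLS: <ls_data> TYPE ty_data,
               <ls_user> TYPE ty_user.

START-OF-SELECTION.

* Sample record so the sketch has something to process.
  APPEND INITIAL LINE TO gt_data ASSIGNING <ls_data>.
  <ls_data>-username = sy-uname.

* One database read per data package instead of one SELECT SINGLE per
* record. FOR ALL ENTRIES must not run against an empty driver table,
* because the WHERE condition would then be ignored.
  IF gt_data IS NOT INITIAL.
    SELECT bname class FROM usr02
      INTO TABLE gt_user
      FOR ALL ENTRIES IN gt_data
      WHERE bname = gt_data-username.
  ENDIF.

* Fill the blank lookup field in place for the entire package.
  LOOP AT gt_data ASSIGNING <ls_data> WHERE usergroup IS INITIAL.
    READ TABLE gt_user ASSIGNING <ls_user>
         WITH TABLE KEY bname = <ls_data>-username.
    IF sy-subrc = 0.
      <ls_data>-usergroup = <ls_user>-class.
    ENDIF.
  ENDLOOP.
```

The sorted lookup table is a design choice of this sketch: it keeps each READ TABLE a key access rather than a linear scan, which matters once the data package holds many thousands of records.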


Transformation 8: Program Includes

• Use includes for all complex routine logic
• Access logic by using "perform" statements
• Increase portability of transformation logic
  - Use same read statements for multiple lookups
  - Reduce risk of errors in obscure places
• Decrease maintenance cost of complex update rules
  - One place to go to fix/enhance logic
  - Code is consistent and easier to follow
• Enable version management of code
  - Track changes over time
  - Compare between systems
  - Revert to previous versions


Transformation 9: Program Includes

• Illustration – Select into internal table

Start routine

FORM startup
  TABLES   MONITOR STRUCTURE RSMONITOR "user defined monitoring
           MONITOR_RECNO STRUCTURE RSMONITORS " monitoring with record n
           DATA_PACKAGE STRUCTURE DATA_PACKAGE
  USING    RECORD_ALL LIKE SY-TABIX
           SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
  CHANGING ABORT LIKE SY-SUBRC. "set ABORT <> 0 to cancel update
*
*$*$ begin of routine - insert your code only below this line *-*
* fill the internal tables "MONITOR" and/or "MONITOR_RECNO",
* to make monitor entries
  perform READ_USR02_TO_MEMORY_FOR_0BWTC_C02
    TABLES   MONITOR
             DATA_PACKAGE
    USING    RECORD_ALL
             SOURCE_SYSTEM
    CHANGING ABORT.
* if abort is not equal zero, the update process will be canceled
* ABORT = 0.
*$*$ end of routine - insert your code only before this line *-*
ENDFORM.

Update include

************************************************************************
* INITIALIZATION (ONE-TIME PER DATA PACKET) ****************************
* TO READ FROM DATABASE (ALL RECORDS FOR DATA PACKAGE) *****************
************************************************************************
* FORM READ_USR02_TO_MEMORY_FOR_0BWTC_C02
*----------------------------------------------------------------------*
Form READ_USR02_TO_MEMORY_FOR_0BWTC_C02
  TABLES   MONITOR STRUCTURE RSMONITOR "user defined monitoring
           DATA_PACKAGE STRUCTURE /BIC/CS80BWTC_C02
  USING    RECORD_ALL LIKE SY-TABIX
           SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
  CHANGING ABORT LIKE SY-SUBRC. "ABORT<>0 cancels update

* REFRESH ALL INTERNAL TABLES.
* (GT_USR02 is a global table declared elsewhere in the include)
  REFRESH: GT_USR02.

* READ USR02 user data to memory
* FOR ALL ENTRIES ignores the WHERE clause when the driver table is
* empty, so the read is skipped for an empty data package
  IF NOT DATA_PACKAGE[] IS INITIAL.
    select * into corresponding fields of table GT_USR02
      from USR02
      FOR ALL ENTRIES IN DATA_PACKAGE
      where BNAME = DATA_PACKAGE-TCTUSERNM
      order by primary key.
  ENDIF.

* if abort is not equal zero, the update process will be canceled
  ABORT = 0.
ENDFORM. "READ_USR02_TO_MEMORY_FOR_0BWTC_C02


Transformation 10: Program Includes

• Illustration – Include perform statements

Update routine

FORM compute_key_field
  TABLES   MONITOR STRUCTURE RSMONITOR "user defined monitoring
  USING    COMM_STRUCTURE LIKE /BIC/CS0BWTC_C02
           RECORD_NO LIKE SY-TABIX
           RECORD_ALL LIKE SY-TABIX
           SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
  CHANGING RESULT LIKE /BI0/V0BWTC_C02T-USERGROUP
           RETURNCODE LIKE SY-SUBRC
           ABORT LIKE SY-SUBRC. "set ABORT <> 0 to cancel update
*
*$*$ begin of routine - insert your code only below this line *-*
* fill the internal table "MONITOR", to make monitor entries
  PERFORM READ_GT_USR02
    USING    COMM_STRUCTURE-TCTUSERNM
             RECORD_NO
             RECORD_ALL
             SOURCE_SYSTEM
    CHANGING GS_USR02
             ABORT.
  RESULT = GS_USR02-CLASS.
* if abort is not equal zero, the update process will be canceled
*$*$ end of routine - insert your code only before this line *-*
ENDFORM.

Update include

************************************************************************
* RECORD PROCESSING (RUN PER RECORD) ***********************************
* TO READ FROM MEMORY (ONE RECORD) *************************************
************************************************************************
* FORM READ_GT_USR02
*----------------------------------------------------------------------*
Form READ_GT_USR02
  USING    TCTUSERNM LIKE USR02-BNAME
           RECORD_NO LIKE SY-TABIX
           RECORD_ALL LIKE SY-TABIX
           SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
  CHANGING GS_USR02
           ABORT LIKE SY-SUBRC. "set ABORT <> 0 cancel update

  STATICS: L_RECORD LIKE SY-TABIX.

  IF RECORD_NO <> L_RECORD.
    L_RECORD = RECORD_NO.
    clear GS_USR02.
* Read user data from internal table GT_USR02
    read table GT_USR02
      with key BNAME = TCTUSERNM
      into GS_USR02.
  ENDIF.
ENDFORM. "READ_GT_USR02


Transformation 11: Update Rules - Results Tables

• Need to "create" data based on business logic
• Beware of hard-coding based on fields like document types
  - New doc types can require enhancements/corrections to hard-coded logic
  - Such dependencies need to be communicated to business and changes to logic need to become part of business process for creating doc types
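
One common way to keep such values out of the code is to maintain them in a table and read them at runtime. The sketch below is only one possible approach and is not described in the original slides: it assumes the relevant document types are maintained as a selection variable in the standard table TVARVC under the hypothetical name ZBW_DOC_TYPES, and the report name and sample value 'ZOR' are likewise invented for illustration.

```abap
REPORT zbw_doctype_sketch.

* Assumption: relevant document types are maintained as a selection
* variable (type 'S') in standard table TVARVC under the hypothetical
* name ZBW_DOC_TYPES, instead of being listed as literals in the routine.
DATA: lt_tvarvc   TYPE STANDARD TABLE OF tvarvc,
      ls_tvarvc   TYPE tvarvc,
      lr_doc_type TYPE RANGE OF char4,
      ls_doc_type LIKE LINE OF lr_doc_type,
      l_doc_type  TYPE char4 VALUE 'ZOR'.   "sample value for the check

START-OF-SELECTION.

  SELECT * FROM tvarvc INTO TABLE lt_tvarvc
    WHERE name = 'ZBW_DOC_TYPES'
      AND type = 'S'.

* Map the TVARVC rows to a range table (TVARVC names the operator OPTI).
  LOOP AT lt_tvarvc INTO ls_tvarvc.
    CLEAR ls_doc_type.
    ls_doc_type-sign   = ls_tvarvc-sign.
    ls_doc_type-option = ls_tvarvc-opti.
    ls_doc_type-low    = ls_tvarvc-low.
    ls_doc_type-high   = ls_tvarvc-high.
    APPEND ls_doc_type TO lr_doc_type.
  ENDLOOP.

* An empty range matches everything, so check for entries explicitly.
  IF lr_doc_type IS NOT INITIAL AND l_doc_type IN lr_doc_type.
    WRITE: / 'Document type is relevant:', l_doc_type.
  ELSE.
    WRITE: / 'Document type is not relevant:', l_doc_type.
  ENDIF.
```

With this pattern a new document type becomes a table entry rather than a code change, although the business process for creating document types still has to include maintaining that entry.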


Load 1: Process Chain Strategy

• Split loads by frequency and criticality
  - Separate daily loads from weekly, monthly, annual and ad-hoc loads
  - Within each frequency group, identify the critical path, and remove non-essential loads
  - Design chains based on dataflow dependencies
  - Remember the dataflow diagram?
• Within each chain, take advantage of parallel processing wherever possible
  - Not all loads need to be sequential
• Minimize parallel updates to BWA (competing changes to common master data indexes can cause an abort)


Load 2: Process Chain Tips

• Process chains require explicit scheduling of all load events previously handled by the InfoPackage
  - Use "Only to PSA" and "Subsequent Update" to reduce number of dialog processes spawned during loads
• If possible, schedule loads when users are off system
  - Can then delete indexes prior to loads and re-create after
  - Will result in poor query performance during loads if not using BWA or aggregates
• Schedule deletion of PSA data by process chain
  - Good rule of thumb is to delete data from PSA that is no longer recoverable (8-30 days)


Load 3: Use Decision Variants

• Decision variants allow flexibility in chain logic
• For example, if you need to load a cube only on a specific day of the month, or month of the year:


Load 4: Performance Tips

• Reduce data packet transfer size if there is extensive use of lookups in transfer/update rules
• Use multiple loads with non-overlapping selection conditions vs. single loads
  - Some R/3 DataSources are neither delta capable nor ODS compatible, so they only support full loads
  - Separate InfoPackages for actual and plan data by current and future years reduces full load size
  - Set number of background processes accordingly
• Turn off consistency check for proven loads from proven sources


Load 5: Error Handling

• If source data is frequently problematic, use error handling
  - Strips out error records into separate PSA or DTP to be processed later without impacting current load
  - Completes processing of correct records
• Illustration: Error Handling in InfoPackage


Load 6: Partitioning

• Define partitioning strategy before you go-live
  - Cubes must be empty before they are partitioned by transport
  - Quicker and less risky than using the repartitioning tool
• Partition by calendar month or fiscal period
  - Queries should use filters, variables or selections on the partitioning column characteristic; read values from another variable if necessary
• SEM Transactional Cubes should also be partitioned


Load 7: Compression

• Compression should be scheduled regularly
• SAP recommendations for number of partitions:
  - 30-50 partitions (requests) per F-fact table
  - 120-150 partitions (time periods) per E-fact table; this is more than 10 years by calendar month!
• Use zero-elimination during compression
  - Can greatly reduce number of fact table records for cubes loaded by ODS Objects or delta capable DataSources
  - Consult OSS before using zero elimination; there are known issues with specific database versions, although patches are available


Load 8: Data Load Scheduling Strategy

• Will the loads be scheduled by external software?
  - Does the R/3 batch process use an external tool such as AutoSys?
  - Consistent approach to batch scheduling could reduce overall support and maintenance costs
• Will BW load success be monitored in BW or via the external tool?
  - If using an external tool, need to develop a mechanism to report success/failure back to the tool
  - If using BW, consider adding text message notification steps to process chains upon success/failure


Load 9: Data Load Scheduling Strategy

• Illustration: External scheduling process

[Diagram: An external scheduling tool calls the BW program ZBW_PC_LOAD, which triggers the start event to execute the process chain (data load) and then waits until the chain reports back a success or failure.]
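
The source of ZBW_PC_LOAD is not included in the deck. The handshake could be sketched as follows, under several stated assumptions: the chain is started through the function modules RSPC_API_CHAIN_START and RSPC_API_CHAIN_GET_STATUS rather than by raising the start event as the slide describes, the report name and the chain name ZPC_DAILY are hypothetical, and the status codes checked ('G' for successful, 'R' for errors, 'X' for canceled) should be verified against the RSPC_STATE domain in your system.

```abap
REPORT zbw_pc_load_sketch.

* Assumption: the chain is started via the process chain API instead
* of a start event; ZPC_DAILY is a hypothetical chain name.
PARAMETERS: p_chain TYPE rspc_chain DEFAULT 'ZPC_DAILY'.

DATA: l_logid  TYPE rspc_logid,
      l_status TYPE rspc_state.

START-OF-SELECTION.

  CALL FUNCTION 'RSPC_API_CHAIN_START'
    EXPORTING
      i_chain = p_chain
    IMPORTING
      e_logid = l_logid
    EXCEPTIONS
      failed  = 1
      OTHERS  = 2.
  IF sy-subrc <> 0.
    MESSAGE 'Process chain could not be started' TYPE 'E'.
  ENDIF.

* Poll until the run reaches a final state; the upper bound
* (720 x 30 seconds) keeps the job from waiting forever.
  DO 720 TIMES.
    WAIT UP TO 30 SECONDS.
    CALL FUNCTION 'RSPC_API_CHAIN_GET_STATUS'
      EXPORTING
        i_chain  = p_chain
        i_logid  = l_logid
      IMPORTING
        e_status = l_status.
*   Assumed codes: G = successful, R = errors, X = canceled
    IF l_status = 'G' OR l_status = 'R' OR l_status = 'X'.
      EXIT.
    ENDIF.
  ENDDO.

* An error message ends the report abnormally, which is what the
* external scheduling tool would evaluate as a failed job.
  IF l_status <> 'G'.
    MESSAGE 'Process chain did not finish successfully' TYPE 'E'.
  ENDIF.
```

The external tool then only needs to start this one program as a batch job and evaluate the job's completion status.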


7 Key Points to Take Home

• Intelligently managing data model granularity is critical to performance, even with BW Accelerator!
• Implement Semantic Partitioning on every data model
• Define a data retention strategy early on to lower TCO
• Use dynamic programming for customer exits to simplify maintenance and reduce risk of production impact
• Use field symbols in the start routine to transform data to achieve optimal performance
• Use program includes to enable portability and version history for your complex transformations
• Define process chains based on frequency and the critical path
  - Use decision variants to improve flexibility


Resources

• Jens Doerpmund, "Introducing the Layered, Scalable Architecture (LSA) Approach to Data Warehouse Design for Improved Reporting and Analytic Performance" (BI and Portals 2009)
• Jens Doerpmund, "Beyond the Basics of SAP NetWeaver Business Intelligence Accelerator" (BI and Portals 2009)
• Ron Silberstein, "Data Modeling, Management, and Architectural Techniques for High Data Volumes with SAP NetWeaver Business Intelligence" (BI and Portals 2008)
• Joe Darlak, "Maximize the Capabilities, Efficiency and Performance of ETL Logic in BW" (ASUG Forums, October 2004)
• Ralph Kimball, The Data Warehouse Toolkit (Wiley Publishing 2002)
• Rajiv Kalra, "Conditional Execution" (BI Expert, March 2008)
• John Kurgen, "Use a New Process Type to Create Dynamic Process Chains" (BI Expert, January 2008)


Disclaimer

SAP, R/3, mySAP, mySAP.com, SAP NetWeaver®, Duet™, PartnerEdge, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Wellesley Information Services is neither owned nor controlled by SAP.