SAP COMMUNITY NETWORK SDN - sdn.sap.com | BPX - bpx.sap.com | BOC - boc.sap.com

© 2011 SAP AG

Data Loading Strategy for Global Data Warehousing Implementation

Applies to:

EDW, SAP BIW 3.5, SAP NetWeaver 7.0. For more information, visit the EDW homepage.

Summary

This document may help you decide on a data loading strategy for a global data warehousing implementation.

Author: Nilesh Ramesh Ahir

Company: IBM India

Created on: 19 February 2011

Author Bio

Nilesh Ahir completed his master's in Software System from BITS Pilani. He has a total of 5 years of SAP experience and has worked as an SAP NW BI application consultant at IBM India for the last year. Before that he worked at Intel India. He has experience in ABAP, BW 3.5 / BI 7.0 and data mining, and has worked with non-SAP technologies such as NLS, TIBCO and web services.

Table of Contents

Introduction
Basics of mother Earth
Time Zones
Data Loading Strategy
Advantages of this Type of Scheduling
Additional Points
  System architecture requirements:
  How master data is handled:
  Alternate approaches:
Disclaimer and Liability Notice

Introduction

Having recognized the benefits of a single-instance data warehouse, most global companies are adopting this kind of implementation. However, a company whose business is spread across many countries faces a major challenge: making data available in the business warehouse in time for reporting. This article addresses that problem and offers guidelines for deciding the data load strategy for a very large data warehouse implementation.

Scenario: A multinational company asks you to decide its data load strategy, given the following inputs.

The company does business in many countries across the globe

A huge amount of data is generated by business transactions in various parts of the world

Reporting, decision making and analysis are done region-wise

Data must be available in the EDW for reporting as soon as possible after the end of the business day in each region

Single-instance ERP (R/3) implementation with a reusable ETL mechanism for EDW applications

Data loads must complete within a time span of 8 hours to avoid overlapping loads or inconsistencies due to partial data loads

The business day runs from 9:00 AM to 6:00 PM

[Figure: Landscape for a single-instance SAP R/3 system and a single-instance SAP BW/BI system]

Basics of mother Earth

Before designing the data load strategy for such a large implementation, let's review some basic facts about our mother Earth.

Earth is divided into seven continents:

1. ASIA 2. EUROPE 3. AFRICA 4. SOUTH AMERICA 5. NORTH AMERICA 6. ANTARCTICA 7. AUSTRALIA

and, for this discussion, into three regions:

1. ASIA-PAC 2. EUROPE-AFRICA 3. AMERICAS

Time Zones

To measure time in different regions across the globe, we use one reference line of longitude. This line is known as the prime meridian.

The Prime Meridian is the meridian (line of longitude) at which the longitude is defined to be 0°. The Prime Meridian and its opposite, the 180th meridian (at 180° longitude), which the International Date Line generally follows, form a great circle that divides the Earth into the Eastern and Western Hemispheres.

An international conference in 1884 decided that the modern Prime Meridian passes through Greenwich in southeast London, United Kingdom; it is known as the International Meridian or Greenwich Meridian. The choice of Prime Meridian is ultimately arbitrary, unlike the parallels of latitude, which are defined by the rotational axis of the Earth with the Poles at 90° and the Equator at 0°. The time at this meridian is known as Greenwich Mean Time (GMT).

Thus, with reference to the prime meridian, we can measure up to 12 hours (equivalent to 180°) in the east or west direction, because the Earth rotates through 15° of longitude per hour.

We can also calculate the local time of a particular location if we know the current Greenwich Mean Time (GMT).
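The calculation of local time from GMT can be sketched in Python as follows. This is a minimal illustration rather than part of the original strategy: the function names are hypothetical, and the offset of 5 hours 30 minutes for Mumbai (Indian Standard Time) is entered by hand instead of being looked up.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc


def local_time_from_gmt(gmt_time: datetime, offset: timedelta) -> datetime:
    """Convert a GMT timestamp to the local time of a place with a fixed offset from GMT."""
    return gmt_time.astimezone(timezone(offset))


def mean_solar_offset(longitude_deg: float) -> timedelta:
    """Offset implied by longitude alone: 15 degrees per hour, east counted as positive."""
    return timedelta(hours=longitude_deg / 15.0)


# Example: 08:00 AM GMT on the day this article was created.
gmt_now = datetime(2011, 2, 19, 8, 0, tzinfo=GMT)

# ASSUMPTION: Mumbai follows Indian Standard Time, a fixed GMT+5:30 offset.
mumbai = local_time_from_gmt(gmt_now, timedelta(hours=5, minutes=30))
print(mumbai.strftime("%H:%M"))
```

Legal time zones rarely match the longitude-based offset exactly, so `mean_solar_offset` is only a rough guide; the fixed offset is what a scheduler would normally use.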

The picture below shows how far Mumbai is ahead of London.

Thus, if GMT is 08:00 AM, the time in Mumbai (Indian Standard Time, GMT+5:30) is 1:30 PM.

Similarly, the picture below gives the time at other locations with reference to GMT.

From a clock-time point of view, the globe is divided into time zones.

For this strategy, we divide the globe into 12 time zones, each spanning 2 hours (equivalent to 30° of longitude).

[Figure: world map divided into time zones numbered 1 to 12]
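The division into twelve 30° bands can be sketched as a small Python function. The numbering direction is an assumption: the original figure is not reproduced here, and the sketch assumes zone 1 starts at 180° east and the numbers increase westward, which is inferred from the examples table later in this document (Australia in zones 1 to 3, India in zone 4, Alaska in zone 12). The function name and the sample longitudes are hypothetical and approximate.

```python
def zone_for_longitude(longitude_deg: float) -> int:
    """Map a longitude (east positive, west negative) to a time zone number 1..12.

    ASSUMPTION: zone 1 covers 180E down to 150E, zone 2 covers 150E down to 120E,
    and so on westward in 30-degree bands until zone 12 ends at 180W.
    """
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    band = int((180.0 - longitude_deg) // 30.0)
    # Exactly 180W would fall into a 13th band, so fold it into zone 12.
    return min(band, 11) + 1


# Approximate longitudes, for illustration only.
for place, lon in [("Mumbai", 72.8), ("Zurich", 8.5), ("Anchorage", -149.9)]:
    print(place, zone_for_longitude(lon))
```

Country boundaries do not follow these bands exactly, so in practice the time zone of each company code would be assigned by hand, as in the examples table, rather than computed.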

Data Loading Strategy

To make data available in the data warehouse as early as possible, we extract data from R/3 and load it into BW at the end of the business day for each region, using a region-wise data load schedule, rather than waiting for the end of the calendar day. There are three regions: ASIA-PAC, EUROPE-AFRICA and AMERICAS. Therefore we can have the three schedules below.

Region        ASIA-PAC      EUROPE-AFRICA   AMERICAS
Schedule      AP            EA              AM
Time zones    1, 2, 3, 4    5, 6, 7, 8      9, 10, 11, 12
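The mapping from time zone to schedule shown in the table can be expressed as a simple lookup. This is a sketch only; the names `SCHEDULES` and `schedule_for_zone` are hypothetical.

```python
# Schedule code -> (region name, time zones covered), taken from the table above.
SCHEDULES = {
    "AP": ("Asia-Pac", range(1, 5)),
    "EA": ("Europe-Africa", range(5, 9)),
    "AM": ("Americas", range(9, 13)),
}


def schedule_for_zone(zone: int) -> str:
    """Return the schedule code (AP, EA or AM) that loads the given time zone."""
    for code, (_region, zones) in SCHEDULES.items():
        if zone in zones:
            return code
    raise ValueError(f"time zone must be between 1 and 12, got {zone}")


print(schedule_for_zone(4), schedule_for_zone(5), schedule_for_zone(12))
```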

In our scenario the business day ends at 6:00 PM local time, so we can start loading data into the data warehouse at 7:00 PM, keeping a buffer of 1 hour. Here, local time means the local time of the last time zone of the region. For example, the data loads for the Asia-Pac region start at 7:00 PM local time of time zone 4. Thus each schedule loads the data of four time zones.

Schedule              AP (Asia-Pac)   EA (Europe-Africa)   AM (Americas)
Start time (local)    7:00 PM         7:00 PM              7:00 PM

The difference between two consecutive schedules is 8 hours; therefore we have to optimize the ETL process so that the data of a region is loaded within 8 hours and one schedule does not overlap with the next.

Now check the location of each country in which the company does business, and identify its time zone and the corresponding schedule.

If the organization has multiple companies registered in a country, name the company codes using the country code followed by a two-digit time zone number and a two-digit sequence number.

Schedule   Region          Schedule Time - Local   Schedule Time - IST   Schedule Time - GMT
AP         Asia-Pac        07:00 PM                08:30 PM              03:00 PM
EA         Europe-Africa   07:00 PM                04:30 AM              11:00 PM
AM         Americas        07:00 PM                12:30 PM              07:00 AM
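The GMT and IST columns of the table above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the GMT offsets used for the last time zone of each region (+4, -4 and -12 hours) are back-calculated from the GMT column of the table rather than given in the text, and all names are hypothetical.

```python
from datetime import timedelta

LOCAL_START = timedelta(hours=19)            # 7:00 PM local time, from the scenario
IST_OFFSET = timedelta(hours=5, minutes=30)  # Indian Standard Time is GMT+5:30

# ASSUMPTION: offset from GMT applied to the last time zone of each region,
# back-calculated from the GMT column of the schedule table.
LAST_ZONE_OFFSET = {
    "AP": timedelta(hours=4),
    "EA": timedelta(hours=-4),
    "AM": timedelta(hours=-12),
}


def clock(value: timedelta) -> str:
    """Format a time of day as HH:MM on a 24-hour clock, wrapping past midnight."""
    minutes = int(value.total_seconds() // 60) % (24 * 60)
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def schedule_start(schedule: str) -> tuple[str, str]:
    """Return the (GMT, IST) start time of a schedule as HH:MM strings."""
    gmt = LOCAL_START - LAST_ZONE_OFFSET[schedule]
    return clock(gmt), clock(gmt + IST_OFFSET)


for name in ("AP", "EA", "AM"):
    gmt, ist = schedule_start(name)
    print(f"{name}: {gmt} GMT, {ist} IST")
```

Because the three start times are 8 hours apart on the GMT clock, each schedule has an 8-hour window before the next one begins.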

[Figure: schedules AP, EA and AM, each separated from the next by 8 hours]

Complete list of country codes

http://www.nationsonline.org/oneworld/countrycodes.htm

A few examples showing the relationships:

Country                Country code   Company code   Time zone   Region          Schedule
Australia              AU             AU_01_01       1           Asia-Pac        AP
Australia              AU             AU_02_01       2           Asia-Pac        AP
Australia              AU             AU_03_01       3           Asia-Pac        AP
China                  CN             CN_01_01       3           Asia-Pac        AP
China                  CN             CN_02_01       4           Asia-Pac        AP
India                  IN             IN_01_01       4           Asia-Pac        AP
South Africa           ZA             ZA_01_01       5           Europe-Africa   EA
South Africa           ZA             ZA_02_01       6           Europe-Africa   EA
Switzerland            CH             CH_01_01       6           Europe-Africa   EA
Spain                  ES             ES_01_01       7           Europe-Africa   EA
Iceland                IS             IS_01_01       8           Europe-Africa   EA
Argentina              AR             AR_01_01       9           Americas        AM
United States          US             US_01_01       10          Americas        AM
United States          US             US_02_01       11          Americas        AM
United States-Alaska   US             US_03_01       12          Americas        AM
United States-Alaska   US             US_03_02       12          Americas        AM

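Company code = <ISO country code> + underscore ( _ ) + <time zone, 2 digits> + underscore ( _ ) + <sequence number, 2 digits>

In the examples above, the time zone part numbers the time zones within the country (01, 02, ...); the global time zone is the one listed in the Time zone column.

Example: for Australia, the company codes are AU_01_01, AU_02_01, etc.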
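The naming convention can be sketched as a pair of helper functions that build and parse company codes. The function names are hypothetical, and the sketch treats the middle part as a plain two-digit number without interpreting it further.

```python
import re

# Two uppercase letters, underscore, two digits, underscore, two digits.
COMPANY_CODE = re.compile(r"^([A-Z]{2})_(\d{2})_(\d{2})$")


def build_company_code(country: str, time_zone: int, sequence: int) -> str:
    """Build a code such as AU_01_01 from its three parts."""
    if not 1 <= time_zone <= 99 or not 1 <= sequence <= 99:
        raise ValueError("time zone and sequence number must fit in two digits")
    return f"{country.upper()}_{time_zone:02d}_{sequence:02d}"


def parse_company_code(code: str) -> tuple[str, int, int]:
    """Split a code into (country code, time zone part, sequence number)."""
    match = COMPANY_CODE.match(code)
    if match is None:
        raise ValueError(f"not a valid company code: {code!r}")
    country, time_zone, sequence = match.groups()
    return country, int(time_zone), int(sequence)


print(build_company_code("AU", 1, 1))
print(parse_company_code("US_03_02"))
```

A loader could use the parsed country code and time zone part to look up the schedule for each company code in a table like the one above.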

Advantages of this Type of Scheduling

Lower system load because the data is distributed across schedules, which improves system performance

Less time required for each data load

Faster data availability compared with the end-of-calendar-day scheduling approach

An advantage over market competitors if the business model below is followed

Region          Location of Decision makers   Time Zone
Asia-Pac        Americas                      9
Europe-Africa   Asia-Pac                      1
Americas        Europe-Africa                 5

Additional Points

System architecture requirements:

This strategy is most effective if you have used LSA (Layered Scalable Architecture). It can also be used if you do not have LSA implemented and instead have BW implementations at region level.

In that case the regional BW boxes supply data to the EDW for global-level reporting, and this strategy is still helpful.

In general, this strategy is useful whenever data is generated in different time zones across the globe.

How master data is handled:

Master data should be loaded before transaction data in each schedule.
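The ordering rule can be sketched as a stable sort that moves master data steps ahead of transaction data steps within one schedule. This is an illustration only: the `LoadStep` class and the step names are hypothetical and do not correspond to actual BW objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadStep:
    name: str
    kind: str  # "master" or "transaction"


def order_steps(steps: list[LoadStep]) -> list[LoadStep]:
    """Put master data steps first, keeping the original order within each kind."""
    # sorted() is stable, and False sorts before True, so master steps come first.
    return sorted(steps, key=lambda step: step.kind != "master")


# Hypothetical steps for one schedule.
steps = [
    LoadStep("sales_orders", "transaction"),
    LoadStep("customer_attributes", "master"),
    LoadStep("deliveries", "transaction"),
    LoadStep("material_texts", "master"),
]
for step in order_steps(steps):
    print(step.kind, step.name)
```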

Alternate approaches:

If the company's business is limited to a particular region, then end-of-calendar-day loading is simple and effective.

Regional data warehousing systems can provide data to a centralized global data warehouse, but this is a very costly solution.

Related Content

Complete list of country codes

For more information, visit the EDW homepage

Disclaimer and Liability Notice

This document may discuss sample coding or other information that does not include SAP official interfaces and therefore is not supported by SAP. Changes made based on this information are not supported and can be overwritten during an upgrade.

SAP will not be held liable for any damages caused by using or misusing the information, code or methods suggested in this document, and anyone using these methods does so at his/her own risk.

SAP offers no guarantees and assumes no responsibility or liability of any type with respect to the content of this technical article or code sample, including any liability resulting from incompatibility between the content within this document and the materials and services offered by SAP. You agree that you will not hold, or seek to hold, SAP responsible or liable with respect to the content of this document.