
IBMInsightVegas2014


Page 1: IBMInsightVegas2014

©2014 immixGroup, Inc. All rights reserved. No part of this presentation may be reproduced or distributed without the prior written permission of immixGroup, Inc. All trademarks are the property of their respective owners. immixGroup Confidential

www.immixgroup.com


Business Analytics for SAP ERP Data

A working solution for big data.

Asim Aziz [email protected] 443-327-9727 Jay Colavita [email protected] 703-639-1552

Page 2: IBMInsightVegas2014


Businesses run their operations on SAP

Many of the world’s largest and most sophisticated organizations use SAP to run their operations. And SAP does a great job!

But now these organizations have an additional requirement: they need to analyze the data that SAP’s been collecting.

They need to get business intelligence (BI) from their SAP data.

Page 3: IBMInsightVegas2014


Consider HANA

SAP offers a BI solution, based on their new HANA technology.

Many SAP customers are finding that HANA does not presently fit their needs.

Page 4: IBMInsightVegas2014


The US Navy also needs BI from SAP

•  The US Navy is a large SAP customer, and it needs a BI solution for its SAP data, for the same reasons commercial customers do.

•  Naval Supply Systems Command (NAVSUP) led the Navy’s effort to integrate multiple commands’ SAP data into a consolidated warehouse.

•  NAVSUP requirements: huge data volumes, scalability, low impact on operations, support of multiple BI platforms, thousands of users.

Page 5: IBMInsightVegas2014


Business Analytics for SAP ERP Data

•  NAVSUP considered a number of competing alternatives, performed extensive tests, and ran POCs for about one year. In the end they chose Business Analytics for SAP ERP Data.

•  A typical configuration is shown above. This system is based largely on IBM hardware, particularly IBM PureData System for Analytics (formerly Netezza), and software from IBM and Boston Common Analytics.

Page 6: IBMInsightVegas2014


Change Data Capture

•  Non-SAP sources are usually straightforward (from a CDC point of view).

•  Almost all required SAP tables are transparent, i.e. directly represented as Oracle tables.

•  Main consideration: minimize impact on ongoing operations.

•  InfoSphere CDC works by processing database transaction logs.

•  Tests showed that the impact on SAP operations is less than 3%.
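
The log-based idea on this slide can be illustrated with a small sketch. This is not InfoSphere CDC's implementation or API: the `Change` record, the `apply_changes` function, and the sample rows are hypothetical and exist only for illustration. The point is that changes are read from an ordered log of committed operations and replayed on a copy, so the live source tables are never queried.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    """One committed change read from a (hypothetical) transaction log."""
    op: str                      # "INSERT", "UPDATE", or "DELETE"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row image; None for DELETE


def apply_changes(target: Dict[int, dict], log: List[Change]) -> Dict[int, dict]:
    """Replay log entries, in order, against a target copy of the table."""
    for change in log:
        if change.op in ("INSERT", "UPDATE"):
            target[change.key] = dict(change.row)
        elif change.op == "DELETE":
            target.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
    return target


# Illustrative log entries (made-up values, not real data).
log = [
    Change("INSERT", 1, {"PLANT": "A", "QUANTITY": 2}),
    Change("INSERT", 2, {"PLANT": "B", "QUANTITY": 200}),
    Change("UPDATE", 1, {"PLANT": "A", "QUANTITY": 3}),
    Change("DELETE", 2),
]
replica = apply_changes({}, log)
print(replica)
```

Because only the log is read, the cost on the source side is limited to log processing, which is consistent with the low operational impact reported in the bullets.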

Page 7: IBMInsightVegas2014


Cluster tables

•  Most, but not all, of the source tables are transparent SAP tables. But what about cluster tables?

•  InfoSphere DataStage includes the SAP Pack, which contains efficient tools for extracting data from cluster tables.

•  In general, DataStage has large libraries of methods for extracting data from heterogeneous sources.

Page 8: IBMInsightVegas2014


Cluster tables

•  Following extraction from sources, data is loaded to a staging area in Netezza by DataStage.

•  The main consideration here is handling massive data volumes (multiple terabytes) quickly, especially during initial loads or refreshes.

•  DataStage performed well during stress testing on terabyte-range data loads.

Page 9: IBMInsightVegas2014


Security Model

•  Requirement: maintain and enhance security of legacy system in new, more flexible environment.

•  PKI authentications on BI tools assure that people are who they say they are.

•  Certifications on BI tools correspond to privileges on Netezza. Netezza will only respond to BI tools’ requests with data a user is authorized to view.

•  Netezza row-based security, ELTMaestro, and custom programming are used to implement and preserve security of data through transformations.

Page 10: IBMInsightVegas2014


Row- and column-based security

KEY  COMMAND  MATERIAL_NUMBER  FUND     UNIT_PRICE  INVOICE  CMD_NAME  QUANTITY
101  1782     11961958         97XBP28  11.69       23.38    NAVSUP    2
102  1782     13316115         97XBP29  9.06        18.12    NAVSUP    2
103  1719     12767391         97XBP31  0.20        40.00    NAVAIR    200
104  1719     12153775         97XBP31  35.88       2152.80  NAVAIR    60

Users authorized to view NAVSUP data see only the NAVSUP rows (keys 101 and 102).

Users authorized to view NAVAIR data see only the NAVAIR rows (keys 103 and 104).

Columns containing prices are hidden from contract employees.
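
The filtering shown on this slide can be sketched in a few lines. This is only an illustration of the idea, not how Netezza row-based security or ELTMaestro implement it. The function name `visible_rows` is hypothetical, and treating `UNIT_PRICE` and `INVOICE` as the price columns is an assumption based on the column names.

```python
# Rows copied from the example table on this slide.
ROWS = [
    {"KEY": 101, "COMMAND": 1782, "MATERIAL_NUMBER": 11961958, "FUND": "97XBP28",
     "UNIT_PRICE": 11.69, "INVOICE": 23.38, "CMD_NAME": "NAVSUP", "QUANTITY": 2},
    {"KEY": 102, "COMMAND": 1782, "MATERIAL_NUMBER": 13316115, "FUND": "97XBP29",
     "UNIT_PRICE": 9.06, "INVOICE": 18.12, "CMD_NAME": "NAVSUP", "QUANTITY": 2},
    {"KEY": 103, "COMMAND": 1719, "MATERIAL_NUMBER": 12767391, "FUND": "97XBP31",
     "UNIT_PRICE": 0.20, "INVOICE": 40.00, "CMD_NAME": "NAVAIR", "QUANTITY": 200},
    {"KEY": 104, "COMMAND": 1719, "MATERIAL_NUMBER": 12153775, "FUND": "97XBP31",
     "UNIT_PRICE": 35.88, "INVOICE": 2152.80, "CMD_NAME": "NAVAIR", "QUANTITY": 60},
]

# Assumption: these are the columns "containing prices".
PRICE_COLUMNS = {"UNIT_PRICE", "INVOICE"}


def visible_rows(rows, allowed_commands, hide_prices=False):
    """Return only the rows and columns a user is allowed to see."""
    result = []
    for row in rows:
        if row["CMD_NAME"] not in allowed_commands:
            continue  # row-based security: drop rows from other commands
        result.append({
            col: val for col, val in row.items()
            if not (hide_prices and col in PRICE_COLUMNS)  # column-based security
        })
    return result


# A contract employee cleared for NAVSUP data only.
navsup_contractor_view = visible_rows(ROWS, {"NAVSUP"}, hide_prices=True)
for row in navsup_contractor_view:
    print(row)
```

In the real system this filtering happens inside the database, so a BI tool never receives rows or columns the user is not authorized to view.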

Page 11: IBMInsightVegas2014


Different reasons to secure data

Reasons to restrict data access include:

•  Origin: Commands may not want other commands to view their data.

•  Data type: For example, contractors should not see dollar amounts.

•  Values in particular fields: For example, a data row may be restricted based on plant number or company code.

Page 12: IBMInsightVegas2014


Guardium Active Security

•  The system is overseen by an active security system called InfoSphere Guardium.

•  Guardium detects suspicious activity even by authorized users, e.g. large downloads, access at unusual times, or repeated access.

•  Guardium can respond with a range of actions such as logging the activity, sending an email, or actively closing a session.
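
As a rough illustration of the kind of checks described above, the sketch below flags an access event by simple rules. It does not use or represent Guardium's actual policy engine or API. The `Access` record, the `flag` function, and both thresholds are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Access:
    """One database access event (hypothetical record layout)."""
    user: str
    hour: int            # hour of day, 0-23
    rows_returned: int


MAX_ROWS = 100_000          # assumed threshold for a "large download"
WORK_HOURS = range(6, 20)   # assumed normal working hours, 06:00-19:59


def flag(access: Access) -> List[str]:
    """Return the reasons an access looks suspicious (empty list if none)."""
    reasons = []
    if access.rows_returned > MAX_ROWS:
        reasons.append("large download")
    if access.hour not in WORK_HOURS:
        reasons.append("unusual time")
    return reasons


print(flag(Access("analyst1", hour=3, rows_returned=500_000)))
print(flag(Access("analyst2", hour=10, rows_returned=10)))
```

A monitoring system would then map each flagged reason to a response such as logging, sending an email, or closing the session.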

Page 13: IBMInsightVegas2014


Why is getting BI from SAP data so hard?

1.  Data warehousing is hard.

2.  SAP data is large.

3.  SAP tables aren’t relational tables.

4.  Moving data from SAP to the outside world is arduous and slow.

5.  No one knows how cluster tables are represented in the underlying database.

6.  Operations on the underlying database can interfere with critical business operations.

7.  The underlying database contains tens of thousands of tables.

Page 14: IBMInsightVegas2014


Keep calm, and visualize the kitty.

1.  Yes, data warehousing is hard, but we’ve been doing it for a long time. We know our way around.

2.  Our system can absolutely handle your data volumes.

3.  SAP tables aren’t relational tables, we know. We have a big bag of tools to deal with that. It’s what we do.

4.  Extracting SAP data is slow? Not when we do it.

5.  We can extract SAP data no matter what kind of table it is. DataStage has multiple ways of extracting data from SAP, including automatically writing ABAP programs which feed DataStage jobs.

6.  InfoSphere CDC has minimal impact on ongoing operations (less than 3% in testing). We know this from rigorous field testing.

7.  Yes, there are ~80K tables in the underlying DB. But the ones that you need for your reporting number in the hundreds. We can find them.