Transcript
Page 1: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Power Big Data Analytics with Informatica Cloud Integration

Ron Lunasin, Informatica Cloud Product Management

Ajay Gandhi, Informatica Cloud Product Marketing

Alan Lundberg, Informatica Vibe Data Stream Marketing

March 2014

Page 2: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Today’s Agenda

• Informatica Cloud Integration and Data Management

• Common Customer Use Cases

• Informatica for AWS DB and Big Data Services

• Informatica Cloud + Redshift Demonstration

• Informatica Vibe Data Stream for Kinesis

• Next Steps and Q&A


Page 3: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Leader in Cloud and Hybrid IT Integration

Gartner MQ for Integration Platform as a Service, Jan 2014

The Forrester Wave: Hybrid2 Integration, Q1 2014

Page 4: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Hybrid IT Architecture is the New Normal

Page 5: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica Cloud Platform: Cloud Integration and Data Management

You Need More Than Just Integration

• Cloud Integration: Leverage Existing Systems

• Cloud Data Quality: Cleanse and De-Dupe

• Cloud Master Data Management: Visualize Relationships

• Cloud Process Automation: Improve User Experience

• Cloud Test Data Management: Secure Sandbox

Page 6: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Hundreds of Connectors

Page 7: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica Cloud Designer: Visual Productivity for Advanced Cloud and Hybrid Integration

• 100% Cloud

• Developer and App User Collaboration

• Productivity for Advanced Integration Use Cases

• Vibe Integration Packages

Page 8: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Introducing Vibe Integration Packages (VIPs): Redefining Hybrid IT Integration Agility

VIPs = Pre-built parameterized integration workflows

Built by developers for app users and other developers/partners

App users configure VIPs using wizards to build custom integrations

VIPs work with Cloud and PowerCenter

VIPs can be distributed via Informatica Marketplace

VIPs are easily embedded into 3rd-party apps via APIs

Page 9: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


Page 10: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Common Customer Use Cases

Traditional Enterprise DW

• Reduce costs by extending DW rather than adding HW

• Migrate completely from existing DW systems

• Respond faster to business; provision in minutes

Companies with Big Data

• Improve performance by an order of magnitude

• Make more data available for analysis

• Access business data via standard reporting tools

SaaS Companies

• Add analytic functionality to applications

• Scale DW capacity as demand grows

• Reduce HW & SW costs by an order of magnitude

Page 11: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Using the Cloud Isn’t an “All or Nothing” Choice


Page 12: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Integrating AWS With Existing On-Premises IT


Page 13: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Use Cloud To Make On-Premises Apps Better


Backup

Analytics

Page 14: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Cloud Apps That Integrate With On-Premises Apps

AWS serves application content & data

Integration to data centers for financial transactions

Page 15: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


Page 16: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Support for AWS Database and Big Data Services

Pre-built Cloud & PowerCenter Connectors for RDS and Redshift

Vibe Data Streaming for Kinesis

InformaticaCloud.com/Amazon-Redshift

Page 17: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica Cloud Architecture Overview: Redshift

Diagram labels (callouts 1 to 4): Secure Agent, Your Company or VPC, Amazon Redshift, Amazon RDS

Page 18: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Map Once. Deploy Anywhere.

• On Premise

• Hadoop

• 3rd Party Applications

• Cloud

Page 19: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


Page 20: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Redshift Upsert – Manual Coding Way

1 Extract the data from source

2 Put into flat files and compress

3 Transfer Compressed Files To S3

4 Wait for S3 Consistency

5 Create Staging Table in Redshift

6 Copy Data From S3 Into Staging Table

7 Inner Join With Target Table To Delete Rows To Be Updated

8 Insert Updated Rows From Staging Table

9 Delete Staging Table

10 Delete Files From S3
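The Redshift-side portion of the manual steps listed on this slide (create staging table, copy, delete matching rows, insert, drop staging) can be sketched as a short list of SQL statements. This is a minimal illustration, not code from the presentation: the function name, table names, key column, S3 URI, IAM role, and the pipe delimiter are all hypothetical placeholders, and the statements are only generated here, not executed.

```python
from typing import List


def build_upsert_statements(target: str, staging: str, key: str,
                            s3_uri: str, iam_role: str) -> List[str]:
    """Return the SQL for a staging-table upsert, in execution order.

    All identifiers are supplied by the caller; nothing here is taken
    from the slide deck itself.
    """
    return [
        # Staging table with the same column definitions as the target.
        f"CREATE TABLE {staging} (LIKE {target});",
        # Bulk-load the gzip-compressed, pipe-delimited files from S3.
        f"COPY {staging} FROM '{s3_uri}' IAM_ROLE '{iam_role}' "
        f"GZIP DELIMITER '|';",
        # Remove target rows that are about to be replaced.
        f"DELETE FROM {target} USING {staging} "
        f"WHERE {target}.{key} = {staging}.{key};",
        # Insert both new rows and the updated versions of deleted rows.
        f"INSERT INTO {target} SELECT * FROM {staging};",
        # Clean up the staging table.
        f"DROP TABLE {staging};",
    ]
```

In practice the delete and insert would normally run inside a single transaction so that readers never see the target table with rows missing; the extract, compress, upload, and S3 cleanup steps happen outside the database and are not shown.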

Or, Do It In 3 Simple Steps…

Page 21: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Redshift Upsert – Informatica Cloud Way

1 Choose Upsert Operation

2 Map Your Fields

3 Run Or Schedule!

Page 22: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica Cloud Amazon Redshift Architecture

Diagram labels: Firewall, Informatica Cloud Secure Agent, Metadata Mappings

1 Build mapping and execute job

2 Retrieve Account Data

3 Put Account Data into Flat File

4 Transfer compressed Flat File to S3

5 Initiate copy from S3

6 Load data into Amazon Redshift
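The flat-file step in the flow above (put the data into a flat file and compress it before transfer to S3) can be illustrated with the Python standard library alone. This is a hedged sketch of the general technique, not the Secure Agent's actual implementation: the function names, the pipe delimiter, and the gzip format are assumptions, and the upload to S3 is deliberately left out.

```python
import csv
import gzip
from typing import Iterable, List, Sequence


def write_compressed_flat_file(rows: Iterable[Sequence[str]], path: str) -> int:
    """Write rows as a pipe-delimited, gzip-compressed flat file.

    Returns the number of rows written.
    """
    count = 0
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="|")
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


def read_compressed_flat_file(path: str) -> List[List[str]]:
    """Read the file back, mainly to verify what was written."""
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle, delimiter="|"))
```

Compressing before transfer reduces upload time, and a bulk load from S3 can read gzip files directly when told the files are compressed.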

Page 23: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

DEMO!


Page 24: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


Page 25: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS
Page 26: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


Data / Sensor Diversity…

Page 27: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


How to make sense of it all…

Page 28: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Streaming Collection: Vibe Data Stream (VDS)

Sources

• Web Servers, Operations Monitors, rsyslog, SLF4J, etc.

• Handhelds, Smart Meters, etc.

• Discrete Data Messages

• Internet of Things, Sensor Data

Vibe Data Stream Bus (Publish / Subscribe)

• VDS Nodes

• Management and Monitoring

Leverage High Performance Messaging Infrastructure. Publish with Ultra Messaging for global distribution without additional staging or landing.

Targets

• Cloudera, Pivotal, Hortonworks, MapR

• Real Time Analysis, Stream Processing

• NoSQL Databases: HBase, Cassandra, Riak, MongoDB
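The publish / subscribe pattern named on this slide can be shown with a tiny in-memory bus. This is only a sketch of the general pattern, assuming a single process and synchronous delivery; the class and method names are hypothetical and do not correspond to any Vibe Data Stream or Ultra Messaging API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class MessageBus:
    """Minimal in-memory publish/subscribe bus keyed by topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a target that wants every message on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver `message` to each subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

The point of the pattern is that a source publishes once and any number of targets receive the message, without the source knowing which targets exist.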

Page 29: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

• Transactions, OLTP, OLAP

• Social Media, Web Logs

• Machine Device, Scientific

• Documents and Emails

Vibe Data Stream

Page 30: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

AWS Kinesis + Informatica – Framework for Deeper Insight

Level 1: Instrument for Problems & Opportunities

Detection, response, correlation & extrapolation of trends

Level 2: Reduce time-to-information & time-to-decision

Operational pattern matching, alerts, Real-time analytics

Level 3: Create Visibility & Insight to Understand the Business Impact

Operational KPIs, Alignment of IT & Business, Drill down

• Service Delivery Applications

• OSS / BSS Applications

• Network Applications

Vibe Data Stream

Page 31: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Use Cases – Solving the Difficult Problems

Exception Monitoring

• Deviations from norm (Monitoring, Fraud, Error)

• Trending up/down to exceed a threshold

• SLA monitoring

Detect Patterns

• 3 events within 5 milliseconds

• A then B then C occurs

• Geospatial processing

Process Monitoring

• Are process workflows operating properly?

• Are manual processes completed on time?

• Detect Missing Work and Queued Work
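Two of the pattern examples on this slide, "3 events within 5 milliseconds" and "A then B then C occurs", can be sketched in a few lines. This is an illustrative sketch only, not how any Informatica product evaluates rules: the function names are hypothetical, timestamps are assumed to arrive in ascending order, the window boundary is treated as inclusive, and the sequence check allows other events in between.

```python
from collections import deque
from typing import Deque, Iterable, Sequence


def burst_detected(timestamps_ms: Iterable[float],
                   count: int = 3, window_ms: float = 5.0) -> bool:
    """True if `count` events fall within any span of `window_ms`."""
    recent: Deque[float] = deque()
    for ts in timestamps_ms:
        recent.append(ts)
        # Drop events that are too old to share a window with `ts`.
        while ts - recent[0] > window_ms:
            recent.popleft()
        if len(recent) >= count:
            return True
    return False


def sequence_occurred(events: Iterable[str], pattern: Sequence[str]) -> bool:
    """True if the items of `pattern` appear in order within `events`."""
    position = 0
    for event in events:
        if position < len(pattern) and event == pattern[position]:
            position += 1
    return position == len(pattern)
```

Both checks make a single pass over the stream and keep only a small amount of state, which is what makes this style of rule practical on continuous event streams.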

Page 32: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica in “Lambda Architectures”

Big Data Analytics + Real Time Streams

Adapted from “Runaway Complexity in Big Data”, Nathan Marz, Sept. 25, 2012

Sources

• Transactions, OLTP, OLAP

• Social Media, Web Logs

• Machine Device, Scientific

• Documents and Emails

Layers

• Batch Layer: Batch View

• Speed Layer: Real Time View

• Serving Layer: Merged View

• Stream Processing

• Filter / Classify

• Correlate
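The relationship between the three views on this slide can be shown with a toy example: the batch layer precomputes a view over historical data, the speed layer keeps a view over recent events, and the serving layer answers queries from the two combined. This is a minimal sketch assuming the views are simple event counts; the function names are hypothetical and the example is not drawn from the presentation.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def build_view(events: Iterable[str]) -> Counter:
    """Count events by type; used for both the batch and real-time views."""
    return Counter(events)


def merged_view(batch: Mapping[str, int],
                realtime: Mapping[str, int]) -> Dict[str, int]:
    """Serving-layer view: batch counts plus counts not yet in the batch."""
    merged = Counter(batch)
    merged.update(realtime)  # adds counts key by key
    return dict(merged)
```

The sketch assumes the real-time view only covers events that arrived after the last batch run; otherwise those events would be counted twice.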

Page 33: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

AWS Kinesis + Informatica = “Real Time Operational Intelligence”

• Solution approach that complements and augments traditional BI and reporting

• Combines approaches and techniques from various technology areas, including:

  • End-to-end and comprehensive data integration

  • Event processing and event-driven architectures

  • Rapid data provisioning via a common data access layer

  • Access to LIVE data in operational systems

  • Access to all types of data including unstructured data

Sense, Reason, Respond, Visualize

Page 34: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Architectural Implications

Yesterday

• Batch processing

• Data structured, homogeneous

• Centralized Database-centric Client Server Systems

• Events treated as 2nd class citizens

Today

• Real Time

• High Volume and variety

• Distributed Systems

• Prioritize Modeling events as enterprise objects / assets

Page 35: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica User Interactions

Developers / OEMs

Developer IDE

Templates

Analyst

MyRulePoint Portal

SDKs

User Tool

Business User

Page 36: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Streaming Collection: Vibe Data Stream Dev Benefits

• Central Monitoring Console for Deployment

• Fault Tolerant

• High Availability

• Vertical & Horizontal Scaling

• Ease of Configuration

Page 37: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Streaming Collection: Topology View


Page 38: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Informatica on Amazon Kinesis

• Solving tough infrastructure problems...

• ...so you stay focused on solving tough business problems

• Coming soon on Amazon

• Stay tuned

• Email me your use cases and needs at [email protected]

Page 39: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS


Page 40: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Next Steps

• Visit us at Booth #107 to see more demos!

• Get started with Informatica Cloud: InformaticaCloud.com

• Learn more about our Redshift Connector: InformaticaCloud.com/Amazon-Redshift

Page 41: Power Big Data Analytics with Informatica Cloud Integration for Redshift, Kinesis and RDS

Q & A

• Ron Lunasin, Informatica Cloud Product Management

• Ajay Gandhi, Informatica Cloud Product Marketing

• Alan Lundberg, Informatica Vibe Data Stream Marketing

@infacloud

InformaticaCloud.com
