© Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Pentaho Big Data Analytics with Vertica and Hadoop

DESCRIPTION

Overview of the Pentaho Big Data Analytics Suite, from the Pentaho + Vertica presentation at Big Data TechCon 2014 in Boston, for the session "The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho".

Page 1: Pentaho Big Data Analytics with Vertica and Hadoop

The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho

Page 2: Pentaho Big Data Analytics with Vertica and Hadoop

© 2014, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555

The Ultimate Selfie | Picture Yourself with the Fastest Analytics on Hadoop with HP Vertica and Pentaho

Pentaho Big Data Analytics

Mark Kromer, Pentaho Big Data Analytics Product Manager

Page 3: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Business Analytics Platform

Users: DBA, ETL/BI Developer, Business Users & Executives, Analysts & Data Scientists

Pentaho Business Analytics: Enterprise & Interactive Reporting, Interactive Analysis, Dashboards, Predictive Analytics

Data Integration: Instaview | Visual MapReduce

Direct Access

Data sources: Operational Data, Big Data, Data Stream, Public/Private Clouds

Page 4: Pentaho Big Data Analytics with Vertica and Hadoop

Product Components

Pentaho Data Integration
• Visual development for big data
• Broad connectivity
• Data quality & enrichment
• Integrated scheduling
• Security integration

Pentaho Analyzer
• Visual data exploration
• Ad hoc analysis
• Interactive charts & visualizations

Pentaho Dashboards
• Self-service dashboard builder
• Content linking & drill through
• Highly customized mash-ups

Pentaho Data Mining & Predictive Analytics
• Model construction & evaluation
• Learning schemes
• Integration with 3rd-party models using PMML

Pentaho Enterprise & Interactive Reports
• Both ad hoc & distributed reporting
• Drag & drop interactive reporting
• Pixel-perfect enterprise reports

Pentaho for Big Data MapReduce & Instaview
• Visual interface for developing MapReduce
• Self-service big data discovery
• Big data access for data analysts

Page 5: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Interactive Analysis & Data Discovery
Highly Flexible Advanced Visualizations

❯ Simple, easy-to-use visual data exploration
❯ Web-based thin client; in-memory caching
❯ Rich library of interactive visualizations
  • Geo-mapping, heat grids, scatter plots, bubble charts, line over bar and more
  • Pluggable visualizations
❯ Java ROLAP engine to analyze structured and unstructured data, with SQL dialects for querying data from RDBMSs
❯ Pluggable cache integrating with leading caching architectures: Infinispan (JBoss Data Grid) & Memcached

Page 6: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Data Integration

Easy to Use, Highly Scalable

❯ Graphical ETL designer
❯ Data agnostic
  • Structured, unstructured, web services, packaged apps (Google, SAS, SFDC, etc.), big data sources, traditional sources, JSON, XML, HL7, etc.
❯ Batch, low-latency & real-time processing
❯ Scale-out architecture, deployable to PDI clusters and Hadoop clusters
❯ 100% Java engine; plug-in architecture for extensibility
❯ Workflow, alerting, monitoring

Integration, Manipulation & Enrichment

Use Cases:

Classic ETL – data warehouse creation, population & maintenance

Information Delivery – extraction from multiple data sources, transformation and streaming to a report

MapReduce Applications – implementing "code-free" transformation pipelines within Hadoop

Extensibility – adding 3rd-party functionality that automatically works within any of the above use cases

Page 7: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Big Data Analytics
Accelerate the time to big data value

• Full continuity from data access to decisions – complete data integration & analytics for any big data store
• Faster development, faster runtime – visual development, distributed execution
• Instant and interactive analysis – no coding and no ETL required

Page 8: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Visual Development
Eliminates the Need for Complex Coding

Would you rather do this?

Scheduling Modeling

Ingestion / Manipulation / Integration

… or this?

Page 9: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Visual MapReduce
Drag & Drop, Then Run in the Cluster

Parallel Execution as MapReduce in the Hadoop Cluster

As Much as 15x Faster Than Hand-Written Code
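
For readers unfamiliar with the model, the following is a minimal single-process Python sketch of the three phases a MapReduce job goes through (map, shuffle, reduce), using a word count as the example. It is not Pentaho's or Hadoop's API: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical, and a real cluster would run the map and reduce steps in parallel across nodes rather than in one process.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Chain the three phases; illustrative only, runs in one process."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    sample = ["Hadoop and Vertica", "Pentaho and Hadoop"]
    print(word_count(sample))
```

In a visual tool, the mapper and reducer logic would presumably be expressed as graphical transformation steps instead of functions like these; the sketch only shows the shape of the computation.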

Page 10: Pentaho Big Data Analytics with Vertica and Hadoop

Pentaho Predictive Analytics
Full Predictive Analytics Lifecycle Support

• Major sponsor of the open source project Weka
• Data exploration/visualization, model construction and export, preliminary evaluation
• Numerous classification/regression and clustering algorithms
• Integration with Pentaho Data Integration
❯ Import 3rd-party models using Predictive Model Markup Language (PMML)
❯ Operationalize models inside or outside of a Hadoop cluster
❯ Incorporate algorithms into Pentaho visual interface; store and version models using the Pentaho repository

Page 11: Pentaho Big Data Analytics with Vertica and Hadoop

Streamlined Data Refinery
Drive a Sustainable Analytics Strategy with Big Data Orchestration at Scale

Transactions – Batch & Real-time

Enrollments & Redemptions

Location, Email, Other Data

Hadoop Cluster

Analyzer

Reports

Data Orchestration

Page 12: Pentaho Big Data Analytics with Vertica and Hadoop

JOIN THE CONVERSATION. YOU CAN FIND US ON:

blog.pentaho.com
@Pentaho
Facebook.com/Pentaho

Pentaho Business Analytics