
Oracle Big Data y Database Analytics - Jordi Trill


DESCRIPTION

Oracle provides a complete, open solution that is simple to deploy and combines hardware and software to bring Big Data environments and architectures into enterprise IT environments that require high levels of reliability, security, and productivity. With Oracle Big Data SQL, multiple data repositories (Hadoop, NoSQL, and relational) can be maintained and accessed in a unified way through SQL, with maximum performance and minimal data movement.

Page 2: Oracle Big Data y Database Analytics - Jordi Trill

Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |

Oracle Big Data and Database Analytics in the Enterprise

Jordi Trill Core Tech Business Development Manager

Page 3: Oracle Big Data y Database Analytics - Jordi Trill


The World has Changed!

Page 4: Oracle Big Data y Database Analytics - Jordi Trill


Five Forces Are Transforming IT

The convergence of big data, social, mobile, IoT and cloud computing – five distinct, yet increasingly intertwined technology trends that exist in an overlapping matrix, where the importance of each increases because it leverages one of the others

Social, Mobile, Cloud, Big Data, IoT

Page 5: Oracle Big Data y Database Analytics - Jordi Trill


The use of any digital technology to promote, sell and enable innovative products, services and experiences

What is Digital Business?

Page 6: Oracle Big Data y Database Analytics - Jordi Trill

How Are Companies Using Digital?

Customer Experience: 44%

Operational Improvement: 30%

New Business Models: 26%

Big Data, Cloud, IoT, Social, Mobile

Page 7: Oracle Big Data y Database Analytics - Jordi Trill


Digital Business Strategies

Digital Transformation

Digital Disruption

Page 8: Oracle Big Data y Database Analytics - Jordi Trill


The change that occurs when new digital technologies and/or business models affect the value proposition of existing goods and services

Digital Disruption

Page 9: Oracle Big Data y Database Analytics - Jordi Trill


The re-alignment of, or new investment in, technology and/or business models to more effectively engage consumers or employees

Digital Transformation

Page 10: Oracle Big Data y Database Analytics - Jordi Trill


Page 11: Oracle Big Data y Database Analytics - Jordi Trill


Thoughts Things Activities

Big Data Is The Datafication Of Everything

Page 12: Oracle Big Data y Database Analytics - Jordi Trill

Challenge #1: Data Production Outweighs Consumption

Ability to Produce Data

Ability to Consume Data

12%: Executives who feel they understand the impact data will have on their organizations

Source: The Economist Intelligence Unit

Page 13: Oracle Big Data y Database Analytics - Jordi Trill

Challenge #2: Data Analysis Takes Too Long

Source: Richard Hackathorn's Components of Action Time

[Figure: Business Value plotted against Response Time, with points marked for Business event, Data captured, Analysis completed, and Action taken]

... of executives say too much critical information is delivered too late

Page 14: Oracle Big Data y Database Analytics - Jordi Trill

Data Warehouse 2.0: Data Warehousing in the Age of Big Data

Data Warehouse 1.0

1. Still too many data marts

2. Batch updates = stale data sets

3. Instinct-based decision making

4. Analysis bolted onto a limited set of business processes

Data Warehouse 2.0

1. Integrated, consolidated architecture

2. Real-time ELT = data always fresh

3. Fact-based decision making

4. Analysts focus on discovery and driving business value

The Path to Monetizing Big Data

Source: Tom Davenport – Harvard Business Review

Page 15: Oracle Big Data y Database Analytics - Jordi Trill

Hadoop

Relational DBMS

Integrated Data Management Platform

Page 16: Oracle Big Data y Database Analytics - Jordi Trill

Oracle Solution: Big Data Management System

Relational

30 years of development

High concurrency

Rich language

Tool support

Security

Mission-critical performance

Hadoop

Innovative, economical and flexible

Scale out simply

Cost effective

Rapidly evolving

Less formalization

Relational + Hadoop

BDMS

Integrated Data Management Platform

Page 17: Oracle Big Data y Database Analytics - Jordi Trill

“The genius of 'and', the tyranny of 'or'”

Ken Rudin, Director of Analytics (Facebook)

Original from James Collins, Stanford University

http://tdwi.org/Articles/2013/05/06/Facebooks-Relational-Platform.aspx?Page=1

Page 18: Oracle Big Data y Database Analytics - Jordi Trill

Big Data Applications by Industry & LoB

Business Analytics: Discovery + Biz Intelligence

Big Data Management System: Data Warehouse + Data Reservoir

Key Requirements

Single view of ALL data

Optimized performance across data sets

Continuous, enterprise-grade functionality: Security, Resource Management, Backup and DR

Page 19: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data Management System

OLTP/Data Warehouse + Data Reservoir

Oracle Big Data Connectors

Oracle Data Integrator

Oracle Advanced Analytics

Oracle Database

Oracle Spatial & Graph

Oracle NoSQL Database

Cloudera Hadoop

Oracle R Distribution

Oracle Industry Models

Oracle GoldenGate

Oracle Data Integrator

Oracle Event Processing

Oracle Event Processing

Apache Flume

Oracle GoldenGate

Oracle Advanced Analytics

Oracle Database

Oracle Spatial & Graph

Oracle In-Memory Columnar Store

Oracle Big Data SQL

Page 20: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data Appliance

Hardware (X4-2 Full; 18 nodes)

288 CPU cores with 1152 GB RAM

864 TB of raw disk storage

40 Gb/s InfiniBand

Integrated Software (Pre-installed, pre-optimized)

Includes all components of Cloudera Enterprise and Add-ons

Cloudera CDH

Cloudera Impala

Cloudera HBase (with Apache Accumulo)

Cloudera Search

Apache Spark

Cloudera Manager (incl. BDR and Navigator)

Oracle NoSQL Database

Oracle R Distribution


Page 21: Oracle Big Data y Database Analytics - Jordi Trill

Oracle Big Data Appliance Installation: Oracle vs. Custom

Physical Installation (10 racks)

Electricians

Network Engineers

Storage Engineers

System Administrators

Custom: 286 hours; 236 hours, 616 cables; 264 hours, 864 cables; 320 hours, 576 cables; 232 hours

Oracle: 16 hours; 16 hours, 32 cables; 6 hours, 14 cables; n/a; n/a

Totals (Oracle vs. Custom): 38 vs. 1338 hours; 19 vs. 677 elapsed hours; 46 vs. 2344 cables

Page 22: Oracle Big Data y Database Analytics - Jordi Trill


Enterprise Security

Authentication through Kerberos

Authorization through Apache Sentry

Auditing through Oracle Audit Vault

Encryption for Data-at-Rest

Network Encryption

Page 23: Oracle Big Data y Database Analytics - Jordi Trill

Oracle NoSQL Database

Reliable, Flexible, Fast, Simple

An advanced key-value database designed as a cost-effective, high-performance solution for simple operations on collections of data, with built-in high availability and elastic scale-out.

less is more

Page 24: Oracle Big Data y Database Analytics - Jordi Trill


Open source language and environment

Used for statistical computing and graphics

Strength in easily producing publication-quality graphs

Highly extensible

Created by Robert Gentleman and Ross Ihaka.


Big Data Technology Today R Statistical Programming Language

Page 25: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data Management System


Page 26: Oracle Big Data y Database Analytics - Jordi Trill


JSON Native

In-Database Open R

In-Database Map Reduce

OLAP Engine

SQL Pattern Matching

Data Redaction

Adaptive Execution Plans

XML Native

Spatial & Graph Analysis

In-Database Analytics

Page 27: Oracle Big Data y Database Analytics - Jordi Trill

Good SQL execution without intervention

[Figure: the initial plan joins a table scan of T1 to an index scan of T2 with a nested loops (NL) join, with a hash join (HJ) over a table scan of T2 as the alternative; when the threshold is exceeded, the plan switches to a hash join of table scans of T1 and T2]

Plan decision deferred until runtime

Final decision is based on statistics collected during execution

If statistics prove to be out of range, sub-plans can be swapped

Bad effects of skew eliminated & queries significantly accelerated

Query Performance Acceleration

Adaptive Execution Plans
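The runtime switch described on this slide can be illustrated with a small sketch. It is not Oracle's optimizer: the function name `adaptive_join`, the row layout, and the `threshold` parameter are assumptions made for illustration. The idea is only that rows from the driving table are buffered and counted, and the join method is chosen once the count is known.

```python
from itertools import chain, islice
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, str]  # (join key, payload); a hypothetical row layout


def adaptive_join(t1: Iterable[Row], t2: List[Row], threshold: int):
    """Join t1 to t2 on the key, picking the join method while rows arrive."""
    rows = iter(t1)
    # Buffer just enough rows of T1 to know whether the threshold is exceeded.
    buffered = list(islice(rows, threshold + 1))

    if len(buffered) <= threshold:
        # Few driving rows: nested loops, probing an index on T2 per row.
        # The dict stands in for an index that would already exist.
        index: Dict[int, List[str]] = {}
        for key, payload in t2:
            index.setdefault(key, []).append(payload)
        joined = [(k, a, b) for k, a in buffered for b in index.get(k, [])]
        return "NL", joined

    # Threshold exceeded: switch to a hash join (build on T1, scan T2 once).
    build: Dict[int, List[str]] = {}
    for key, payload in chain(buffered, rows):
        build.setdefault(key, []).append(payload)
    joined = [(k, a, b) for k, b in t2 for a in build.get(k, [])]
    return "HJ", joined


T1 = [(1, "a"), (2, "b")]
T2 = [(1, "x"), (2, "y"), (3, "z")]
few = adaptive_join(T1, T2, threshold=5)   # stays with nested loops
many = adaptive_join(T1, T2, threshold=1)  # switches to the hash join
```

Both calls return the same joined rows; only the method label differs, which is the point of deferring the decision.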

Page 28: Oracle Big Data y Database Analytics - Jordi Trill


Pattern Matching

• Scalable discovery of business event sequences:

– Clickstream Analysis

– Fraud Detection

– Stock Analysis

– Dropped Calls Analysis

– Automatic Medical Detections

• Historically requires complex SQL or external code to execute

EVENT TIME LOCATION

A 1 SFO

A 1 SFO

A 2 ATL

A 2 LAX

B 2 SFO

C 2 LAX

C 3 LAS

A 3 SFO

B 3 NYC

C 4 NYC

> 1 min

Matching rows:

A 2 ATL

A 2 LAX

B 2 SFO

C 2 LAX

“Find one or more event A followed by one B followed by one or more C in a 1 minute interval”

Recognize patterns in sequences of rows
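To make the quoted rule concrete, here is a minimal sketch that scans the event table from this slide for "one or more A, then one B, then one or more C". The function name `find_matches` is hypothetical, and the reading of "1 minute interval" as "every row is less than one time unit after the first matched row" is an assumption.

```python
from typing import List, Tuple

Event = Tuple[str, int, str]  # (event, time in minutes, location)

EVENTS: List[Event] = [
    ("A", 1, "SFO"), ("A", 1, "SFO"), ("A", 2, "ATL"), ("A", 2, "LAX"),
    ("B", 2, "SFO"), ("C", 2, "LAX"), ("C", 3, "LAS"), ("A", 3, "SFO"),
    ("B", 3, "NYC"), ("C", 4, "NYC"),
]


def find_matches(rows: List[Event], window: int = 1) -> List[List[Event]]:
    """Return every run of A+ B C+ whose rows fall inside the time window."""

    def fits(pos: int, label: str, start_time: int) -> bool:
        # A row fits if it exists, carries the label, and is inside the window.
        return (pos < len(rows) and rows[pos][0] == label
                and rows[pos][1] - start_time < window)

    matches = []
    i = 0
    while i < len(rows):
        start_time = rows[i][1]
        j = i
        while fits(j, "A", start_time):   # one or more A
            j += 1
        if j == i or not fits(j, "B", start_time):  # exactly one B
            i += 1
            continue
        k = j + 1
        while fits(k, "C", start_time):   # one or more C
            k += 1
        if k == j + 1:
            i += 1
            continue
        matches.append(rows[i:k])
        i = k  # resume after the last matched row
    return matches


found = find_matches(EVENTS)
```

Under that assumption the only match is the four rows at time 2 (ATL, LAX, SFO, LAX).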

Page 29: Oracle Big Data y Database Analytics - Jordi Trill


SQL Pattern Matching in action Example: Find a double bottom pattern (W-shape) in ticker stream

• Find a W-shape pattern in a ticker stream:

• Output the beginning and ending date of the pattern

• Calculate the average price in each W-shape

• Find only patterns that lasted less than a week

Stock price

Page 30: Oracle Big Data y Database Analytics - Jordi Trill


SELECT . . . FROM ticker MATCH_RECOGNIZE ( . . . )

days

Stock price

SQL Pattern Matching in action Example: Find W-Shape

New syntax for discovering patterns using SQL:

MATCH_RECOGNIZE ( )

Page 31: Oracle Big Data y Database Analytics - Jordi Trill


Find a W-shape pattern in a ticker stream:

• Set the PARTITION BY and ORDER BY clauses

SELECT … FROM ticker MATCH_RECOGNIZE ( PARTITION BY name ORDER BY time

days

Stock price

SQL Pattern Matching in action Example: Find W-Shape

Page 32: Oracle Big Data y Database Analytics - Jordi Trill


Find a W-shape pattern in a ticker stream:

• Define the pattern – the “W-shape”

SQL Pattern Matching in action Example: Find W-Shape

SELECT … FROM ticker MATCH_RECOGNIZE ( PARTITION BY name ORDER BY time PATTERN (X+ Y+ W+ Z+)

days

Stock price

Page 33: Oracle Big Data y Database Analytics - Jordi Trill


Find a W-shape pattern in a ticker stream:

• Define the pattern – the “W-shape”

SQL Pattern Matching in action Example: Find W-Shape

days

Stock price

SELECT … FROM ticker MATCH_RECOGNIZE (
  PARTITION BY name ORDER BY time
  PATTERN (X+ Y+ W+ Z+)
  DEFINE X AS (price < PREV(price)),
         Y AS (price > PREV(price)),
         W AS (price < PREV(price)),
         Z AS (price > PREV(price)))

X Y W Z

Page 34: Oracle Big Data y Database Analytics - Jordi Trill


Find a W-shape pattern in a ticker stream:

• Define the measures to output once a pattern is matched:

• FIRST: beginning date

• LAST: ending date

SQL Pattern Matching in action Example: Find W-Shape

days

Stock price

SELECT … FROM ticker MATCH_RECOGNIZE (
  PARTITION BY name ORDER BY time
  MEASURES FIRST(x.time) AS first_x,
           LAST(z.time) AS last_z
  PATTERN (X+ Y+ W+ Z+)
  DEFINE X AS (price < PREV(price)),
         Y AS (price > PREV(price)),
         W AS (price < PREV(price)),
         Z AS (price > PREV(price)))

X Z

Page 35: Oracle Big Data y Database Analytics - Jordi Trill


Find a W-shape pattern in a ticker stream:

• Calculate average price in the second ascent

SQL Pattern Matching in action Example: Find W-Shape

1 9 13 19 days

Stock price

SELECT first_x, last_z, avg_price

FROM ticker MATCH_RECOGNIZE (

PARTITION BY name ORDER BY time

MEASURES FIRST(x.time) AS first_x,

LAST(z.time) AS last_z,

AVG(z.price) AS avg_price

ONE ROW PER MATCH

PATTERN (X+ Y+ W+ Z+)

DEFINE X AS (price < PREV(price)),

Y AS (price > PREV(price)),

W AS (price < PREV(price)),

Z AS (price > PREV(price) AND

z.time - FIRST(x.time) <= 7))

Average stock price: $52.00
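For readers without a database at hand, the logic of the query can be approximated outside SQL. The sketch below is an illustration, not a reimplementation of MATCH_RECOGNIZE: the function name `find_w_shapes` and the sample prices are made up, and it assumes a single ticker already ordered by day. It labels each row as a fall (D) or rise (U) against the previous row, looks for the run D+ U+ D+ U+, keeps only the final-ascent rows within seven days of the first fall, and averages their prices.

```python
import re
from typing import List, Tuple

W_SHAPE = re.compile(r"(D+)(U+)(D+)(U+)")  # X+ Y+ W+ Z+ in the query's terms


def find_w_shapes(ticks: List[Tuple[int, float]], max_days: int = 7):
    """Return (first_x, last_z, avg_price) for each W-shape in the ticks."""
    days = [day for day, _ in ticks]
    prices = [price for _, price in ticks]
    # The first row has no predecessor, so it can match neither D nor U.
    moves = "-" + "".join(
        "D" if cur < prev else "U" if cur > prev else "-"
        for prev, cur in zip(prices, prices[1:])
    )
    results = []
    i = 0
    while i < len(moves):
        m = W_SHAPE.match(moves, i)
        if m is None:
            i += 1
            continue
        first_x = days[m.start(1)]
        # Z rows must also satisfy: z.time - FIRST(x.time) <= max_days.
        z_rows = [k for k in range(m.start(4), m.end(4))
                  if days[k] - first_x <= max_days]
        if not z_rows:
            i += 1
            continue
        avg_price = sum(prices[k] for k in z_rows) / len(z_rows)
        results.append((first_x, days[z_rows[-1]], avg_price))
        i = z_rows[-1] + 1  # skip past the last row of the match
    return results


# Hypothetical ticker: (day, price)
TICKS = [(1, 10), (2, 8), (3, 6), (4, 9), (5, 11),
         (6, 7), (7, 5), (8, 8), (9, 12), (10, 12)]
shapes = find_w_shapes(TICKS)
```

With the sample data the single W-shape runs from day 2 to day 9, and the average over the second ascent is 10.0. These numbers come from the made-up prices, not from the slide's chart.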

Page 36: Oracle Big Data y Database Analytics - Jordi Trill


next = lineNext.getQuantity();

}

if (!q.isEmpty() && (prev.isEmpty() || (eq(q, prev) && gt(q, next)))) {

state = "S";

return state;

}

if (gt(q, prev) && gt(q, next)) {

state = "T";

return state;

}

if (lt(q, prev) && lt(q, next)) {

state = "B";

return state;

}

if (!q.isEmpty() && (next.isEmpty() || (gt(q, prev) && eq(q, next)))) {

state = "E";

return state;

}

if (q.isEmpty() || eq(q, prev)) {

state = "F";

return state;

}

return state;

}

private boolean eq(String a, String b) {

if (a.isEmpty() || b.isEmpty()) {

return false;

}

return a.equals(b);

}

private boolean gt(String a, String b) {

if (a.isEmpty() || b.isEmpty()) {

return false;

}

return Double.parseDouble(a) > Double.parseDouble(b);

}

private boolean lt(String a, String b) {

if (a.isEmpty() || b.isEmpty()) {

return false;

}

return Double.parseDouble(a) < Double.parseDouble(b);

}

public String getState() {

return this.state;

}

}

BagFactory bagFactory = BagFactory.getInstance();

@Override

public Tuple exec(Tuple input) throws IOException {

long c = 0;

String line = "";

String pbkey = "";

V0Line nextLine;

V0Line thisLine;

V0Line processLine;

V0Line evalLine = null;

V0Line prevLine;

boolean noMoreValues = false;

String matchList = "";

ArrayList<V0Line> lineFifo = new ArrayList<V0Line>();

boolean finished = false;

DataBag output = bagFactory.newDefaultBag();

if (input == null) {

return null;

}

if (input.size() == 0) {

return null;

}

Object o = input.get(0);

if (o == null) {

return null;

}

//Object o = input.get(0);

if (!(o instanceof DataBag)) {

int errCode = 2114;

String msg = "Expected input to be DataBag, but"

SELECT first_x, last_z

FROM ticker MATCH_RECOGNIZE (

PARTITION BY name ORDER BY time

MEASURES FIRST(x.time) AS first_x,

LAST(z.time) AS last_z

ONE ROW PER MATCH

PATTERN (X+ Y+ W+ Z+)

DEFINE X AS (price < PREV(price)),

Y AS (price > PREV(price)),

W AS (price < PREV(price)),

Z AS (price > PREV(price) AND

z.time - FIRST(x.time) <= 7 ))

250+ lines of Java and Pig vs. 12 lines of SQL

20x less code, 5x faster

SQL Pattern Matching Finding Double Bottom (W)

Page 37: Oracle Big Data y Database Analytics - Jordi Trill


Native Support for TOP-N Queries

• New OFFSET and FETCH FIRST clauses

• Specify number or percentage of rows to return

• ANSI 2008/2011 compliant with additional extensions

Simplified Code Development

“Who are the top 5 money makers in my enterprise?”

SELECT empno, ename, deptno
FROM emp
ORDER BY sal DESC, comm DESC
FETCH FIRST 5 ROWS ONLY;

versus

SELECT empno, ename, deptno
FROM (SELECT empno, ename, deptno, sal, comm,
             row_number() OVER (ORDER BY sal DESC, comm DESC) rn
      FROM emp)
WHERE rn <= 5
ORDER BY sal DESC, comm DESC;
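The row-limiting idea (order the rows, skip an optional offset, keep the first N) can be sketched in a few lines. The function `fetch_first` and the sample employee rows are hypothetical; the sketch sorts highest salary first because the slide's question asks for the top earners.

```python
from typing import List, Optional, Tuple

# (empno, ename, deptno, sal, comm): hypothetical sample rows
Emp = Tuple[int, str, int, int, Optional[int]]

EMP: List[Emp] = [
    (7839, "KING", 10, 5000, None),
    (7788, "SCOTT", 20, 3000, None),
    (7566, "JONES", 20, 2975, None),
    (7698, "BLAKE", 30, 2850, None),
    (7782, "CLARK", 10, 2450, None),
    (7499, "ALLEN", 30, 1600, 300),
    (7369, "SMITH", 20, 800, None),
]


def fetch_first(rows: List[Emp], n: int, offset: int = 0) -> List[Emp]:
    """Order by salary then commission, highest first, and keep n rows."""
    ordered = sorted(rows, key=lambda r: (r[3], r[4] or 0), reverse=True)
    return ordered[offset:offset + n]


top5 = [row[1] for row in fetch_first(EMP, 5)]
next2 = [row[1] for row in fetch_first(EMP, 2, offset=5)]
```

The `offset` argument plays the role of the OFFSET clause, which is what makes simple paging possible without a wrapping subquery.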

Page 38: Oracle Big Data y Database Analytics - Jordi Trill

Oracle OLAP: Built-in Access to Analytic Calculations

Multidimensional analytic engine that analyzes summary data

Offers improved query performance and fast, incremental updates

Embedded in Oracle Database instance and storage

Example Analytical Questions

How do sales in the Western region this quarter compare with sales a year ago?

What will sales next quarter be?

What factors can we alter to improve the sales forecast?

Page 39: Oracle Big Data y Database Analytics - Jordi Trill


R code and/or SQL

Models run in-database

Avoid Data Movement

Large data sets

Built-in security

Oracle Advanced Analytics R Enterprise and Data Mining

Page 40: Oracle Big Data y Database Analytics - Jordi Trill

Oracle Spatial and Graph: High performance, simplified geospatial analysis all through SQL

ROADS

RNAME  ID   TYPE  LANES  GEOMETRY
M40    140  HWY   6
M25    141  HWY   4

SELECT a.owner_name, a.acquisition_status
FROM properties a, projects b
WHERE sdo_within_distance(a.property_geom, b.project_geom,
                          'distance = .1 unit = mile') = 'TRUE'
  AND b.project_id = 189498;

Vector Performance Acceleration: 50X performance improvement (2D/3D queries) - Spatial joins, touch, contains, overlaps, complex masks
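As a rough picture of what a within-distance filter does, the sketch below keeps the properties that lie within 0.1 mile of a project. It is a simplification: geometries are reduced to single latitude/longitude points, distance is the haversine great-circle distance, and all names and coordinates are hypothetical. It does not reflect how Oracle Spatial evaluates `sdo_within_distance` internally.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_miles(p: Point, q: Point) -> float:
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlmb = math.radians(q[1] - p[1])
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_distance(properties: List[Tuple[str, str, Point]],
                    project: Point, miles: float) -> List[Tuple[str, str]]:
    """Return (owner_name, acquisition_status) for nearby properties."""
    return [(owner, status) for owner, status, geom in properties
            if haversine_miles(geom, project) <= miles]


# Hypothetical data: one property about 0.07 miles away, one about 0.7 miles.
PROJECT = (51.500, -0.100)
PROPERTIES = [
    ("Smith", "ACQUIRED", (51.501, -0.100)),
    ("Jones", "PENDING", (51.510, -0.100)),
]
nearby = within_distance(PROPERTIES, PROJECT, miles=0.1)
```

A database would normally answer this with a spatial index rather than by testing every row, which is where the claimed acceleration matters.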

Page 41: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Spatial and Graph: Network Graph

Explicitly stores and maintains connectivity

Attributes at link and node level

Example algorithms: Traveling salesman, spanning tree, shortest path, sub-path, within cost, nearest neighbors

Very fast network analysis

Graph model to represent physical and logical networks
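Shortest path is one of the algorithms listed on this slide. The sketch below is a generic Dijkstra implementation over a dictionary of link costs, included only to show what such an analysis computes; the graph, node names, and the function `shortest_path` are hypothetical and unrelated to Oracle's network data model API.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbor: link cost}


def shortest_path(graph: Graph, source: str,
                  target: str) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra's algorithm; returns (total cost, path) or None."""
    dist = {source: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, cost in graph.get(node, {}).items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


# Hypothetical directed network with link costs
NETWORK: Graph = {
    "A": {"B": 4, "C": 1},
    "C": {"B": 2},
    "B": {"D": 5},
    "D": {},
}
route = shortest_path(NETWORK, "A", "D")
```

Here the cheapest route from A to D goes through C and B at a total cost of 8, even though the direct A to B link exists.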

Page 42: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Spatial and Graph: Semantic Graph

• Supports all relevant W3C standards

• View relational data as RDF graph

• 60% data compression reduces storage and enhances performance

RDF Semantic Graph

Page 43: Oracle Big Data y Database Analytics - Jordi Trill


In-Database MapReduce

[Figure: inside Oracle Database, rows from a table pass through Map steps and then Reduce steps, producing a key/value (K, V) result table]

timestamp userid pageid

10:00:00 12345 A73_2

10:00:02 8901 A74_3

10:00:03 12345 A73_3

10:01:12 12345 A74_4

session userid pageid duration

0 12345 A73_2 3

0 12345 A73_3 70

0 12345 A74_4 12

1 8901 A74_3 89

MapReduce within the Oracle Database:

select session, userid, pageid, duration

from table(oracle_map_reduce.reducer(cursor(

select * from table(oracle_map_reduce.mapper(cursor(

select * from clicks))) map_result)));

=> Works on internal and external data sources

=> Leverage PL/SQL skills for big data analytics

=> High efficiency through parallel pipelined infrastructure

=> In-database execution allows for fast query performance
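The mapper/reducer pipeline in the query above can be mimicked with two generator functions. This is a plain-Python sketch under assumptions, not the `oracle_map_reduce` package: the mapper keys each click by user, and the reducer numbers the users as sessions and derives each page's duration from the next click by the same user. Because only the four sample clicks are available, the last page per user has no duration, so the sketch does not reproduce the duration column shown on the slide.

```python
from collections import defaultdict
from datetime import datetime
from typing import Iterable, Iterator, Optional, Tuple

CLICKS = [
    ("10:00:00", "12345", "A73_2"),
    ("10:00:02", "8901", "A74_3"),
    ("10:00:03", "12345", "A73_3"),
    ("10:01:12", "12345", "A74_4"),
]


def mapper(rows: Iterable[Tuple[str, str, str]]):
    """Emit (userid, (timestamp, pageid)) pairs."""
    for ts, userid, pageid in rows:
        yield userid, (datetime.strptime(ts, "%H:%M:%S"), pageid)


def reducer(pairs) -> Iterator[Tuple[int, str, str, Optional[int]]]:
    """Group by user and emit (session, userid, pageid, duration in seconds)."""
    grouped = defaultdict(list)
    for userid, view in pairs:
        grouped[userid].append(view)
    for session, (userid, views) in enumerate(grouped.items()):
        views.sort()
        for current, following in zip(views, views[1:] + [None]):
            duration = None
            if following is not None:
                duration = int((following[0] - current[0]).total_seconds())
            yield session, userid, current[1], duration


sessions = list(reducer(mapper(CLICKS)))
```

Nesting the calls as `reducer(mapper(...))` mirrors the nested table functions in the SQL, where each stage consumes a cursor over the previous one.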

Page 44: Oracle Big Data y Database Analytics - Jordi Trill


Soc. Sec. # 115-69-3428

DOB 11/06/71

NAME SARA JONES

Policy enforced redaction of sensitive data

Data Redaction

Data Analyst

ETL / Data Quality Processes

Dynamically Masking for Data Warehouses
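Policy-enforced redaction means the stored values stay intact and are masked only when a given kind of reader queries them. A minimal sketch follows, with hypothetical role names, masking formats, and a `read` function; it is not the Oracle Data Redaction API.

```python
from typing import Callable, Dict

Record = Dict[str, str]


def mask_ssn(value: str) -> str:
    """Keep only the last four digits of a social security number."""
    return "XXX-XX-" + value[-4:]


def mask_dob(value: str) -> str:
    """Keep only the two-digit year of a date of birth."""
    return "XX/XX/" + value[-2:]


# Which columns are redacted for which role (assumed roles).
POLICY: Dict[str, Dict[str, Callable[[str], str]]] = {
    "analyst": {"ssn": mask_ssn, "dob": mask_dob},
    "etl": {},  # ETL and data quality processes see the stored values
}


def read(record: Record, role: str) -> Record:
    """Return the record as the given role is allowed to see it."""
    rules = POLICY[role]
    return {col: rules[col](val) if col in rules else val
            for col, val in record.items()}


SARA = {"name": "SARA JONES", "dob": "11/06/71", "ssn": "115-69-3428"}
analyst_view = read(SARA, "analyst")
etl_view = read(SARA, "etl")
```

The stored record is never modified; each call builds a new view, which is why the same table can serve both the analyst and the ETL process.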

Page 45: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data Management System


Page 46: Oracle Big Data y Database Analytics - Jordi Trill

Oracle Big Data Connectors

Data Load: Oracle Loader for Hadoop

Data Access: Oracle SQL Connector for HDFS

R Analytics: Oracle R Advanced Analytics on Hadoop

Oracle Data Integrator Knowledge Modules

XML/XQuery: Oracle XQuery on Hadoop

XQuery R Client

Optimized for Hadoop

Maximum parallelism

Fast performance

Analyze all your data in-place

Page 47: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data Connectors and Data Integrator

Big Data Appliance +

Hadoop

Exadata +

Oracle Database

15TB / hour

10x Faster

Page 48: Oracle Big Data y Database Analytics - Jordi Trill

When Data Lives in Many Places…

Relational: Profit and Loss

Hadoop: Application Logs

NoSQL: Customer Profiles

SQL

Page 49: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data SQL

Query All Data without Application Change or Data Conversion

Big Data Appliance +

Cloudera Hadoop

Oracle NoSQL DB

Exadata +

Oracle Database

Oracle Catalog

External Table

create table customer_address

( ca_customer_id number(10,0)

, ca_street_number char(10)

, ca_state char(2)

, ca_zip char(10))

organization external (

TYPE ORACLE_HIVE

DEFAULT DIRECTORY DEFAULT_DIR

ACCESS PARAMETERS

(com.oracle.bigdata.cluster hadoop_cl_1)

LOCATION ('hive://customer_address')

)

HDFS Data Node

HDFS Name Node

Hive metadata

Big Data SQL

Query all data with Oracle SQL

Smart scan in Hadoop to optimize data requests

Page 50: Oracle Big Data y Database Analytics - Jordi Trill

Oracle Big Data SQL

SELECT w.sess_id, c.name
FROM web_logs w, customers c
WHERE w.source_country = 'Brazil'
  AND w.cust_id = c.customer_id;

Relevant SQL runs on BDA nodes

Tens of gigabytes of data

Only columns and rows needed to answer query are returned

Hadoop Cluster

Big Data SQL

Oracle Database

CUSTOMERS WEB_LOGS

Fast Smart Scan

Massive Parallelism

Storage Indexes

Filtered Locally

Minimized Data Movement

Intelligent Query Optimization
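The division of work on this slide (filter and project on the Hadoop side, join in the database) can be sketched as follows. Everything here is illustrative: the blocks, the sample rows, and the functions `smart_scan` and `run_query` are assumptions, and real Smart Scan also involves storage indexes and parallel execution that the sketch leaves out.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]

# Hypothetical web_logs data, split into blocks held on different nodes.
WEB_LOG_BLOCKS: List[List[Row]] = [
    [{"sess_id": 1, "cust_id": 10, "source_country": "Brazil", "url": "/a"},
     {"sess_id": 2, "cust_id": 11, "source_country": "Spain", "url": "/b"}],
    [{"sess_id": 3, "cust_id": 11, "source_country": "Brazil", "url": "/c"}],
]

# Hypothetical customers table kept in the database: customer_id -> name.
CUSTOMERS: Dict[int, str] = {10: "Ana", 11: "Bruno"}


def smart_scan(block: Iterable[Row], predicate: Callable[[Row], bool],
               columns: Tuple[str, ...]) -> List[Row]:
    """Runs next to the data: keep matching rows and only the needed columns."""
    return [{c: row[c] for c in columns} for row in block if predicate(row)]


def run_query() -> List[Tuple[object, str]]:
    """Database side: collect the reduced rows and join them to customers."""
    shipped = [
        row
        for block in WEB_LOG_BLOCKS
        for row in smart_scan(block,
                              lambda r: r["source_country"] == "Brazil",
                              ("sess_id", "cust_id"))
    ]
    return [(row["sess_id"], CUSTOMERS[row["cust_id"]])
            for row in shipped if row["cust_id"] in CUSTOMERS]


result = run_query()
```

Only two narrow rows cross from the blocks to the join in this example, instead of three full rows, which is the data-movement saving the slide refers to.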

Page 51: Oracle Big Data y Database Analytics - Jordi Trill

Oracle’s Vision of a Big Data Management System

One fast SQL query, on all your data.

Oracle SQL on Hadoop and beyond

• With a Smart Scan service as in Exadata

• Without federation or fragmented stores

• With the security and certainty of Oracle Database

Page 52: Oracle Big Data y Database Analytics - Jordi Trill


Oracle Big Data and Fast Data Integrated Solution Stack

Decide

Oracle Coherence

Oracle Event Processing

Oracle BAM

Fast Data: Real Time Streaming – Filtering – Pattern M. – Monitoring

Oracle Real-Time Decisions

Big Data Management System: Acquire – Organize – Analyze

In-Database Analytics

In-Memory Columnar Store

Oracle Advanced Analytics

Oracle Database

Applications

Oracle NoSQL Database

Cloudera Hadoop

Oracle R Distribution

Oracle Big Data Connectors

Oracle Data Integrator

Oracle BI Enterprise Edition

Endeca Information Discovery

Data Sources

Oracle Big Data SQL

Page 53: Oracle Big Data y Database Analytics - Jordi Trill

Unified Data Platform

Advanced Query & Analysis: Full Power of SQL and Advanced Analytics

Transparent to Applications: No Changes to Application Code

Single View of All Data: Unified Metadata Across RDBMS & Hadoop

Fastest Performance: Utilize SQL Processing Across the Platform

Leverage Existing Skills: Lower Cost & Complexity of Big Data Adoption
