How to transform data into money using Big Data Technologies


APACHE BIG DATA CONFERENCE

After almost a decade developing Big Data projects at Paradigma, through its R&D department, we were early adopters of Spark, which led to the creation of Stratio:

THE FIRST SPARK-BASED BIG DATA PLATFORM RELEASED

INTRO

JORGE LOPEZ-MALLA

After working with traditional processing methods, I started to do some R&D Big Data projects and fell in love with the Big Data world. Currently I'm doing some awesome Big Data projects at Stratio.

MY PROFILE

SKILLS

ALBERTO RODRÍGUEZ DE LEMA

After graduating I've been programming for more than 10 years. I've built high-performance and scalable web applications for companies such as Indra Systems, Prudential and Springer Verlag Ltd.

MY PROFILE

@ardlema

SKILLS

STRATIO

OPEN-SOURCE SOLUTIONS

Our enterprise solutions are based on open source technologies

PURE SPARK

The only pure Spark platform, the only global solution

ENTERPRISE SPARK

On-premise & cloud, our platform is geared towards helping companies

SPARK-BASED BD PLATFORM

The first Spark-based Big Data platform released

OUR CLIENT

MIDDLE EAST TELCO COMPANY

- 9.5 million daily events processed
- 9.2 million clients

USE CASES

1. MANAGEMENT & NORMALIZATION OF DATA SOURCES
2. NETWORK COVERAGE IMPROVEMENT
3. PEOPLE GATHERING
4. DATA MONETIZATION

TECHNICAL CHALLENGES

1. Huge volume of data
2. Huge size of data
3. Distributed processing
4. Hard to read
5. Pattern recognition

1 HUGE VOLUME OF DATA

SOLUTION: APACHE HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

9.5 million daily CSV records -> circa 16 GB

Requirements:

- High availability
- Concurrent file reads
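As a back-of-the-envelope sanity check on those figures (my own arithmetic, not from the slides), the record count and the raw CSV footprint imply an average event size of a bit under 2 KB:

```python
# Assumed values taken from the slide above: ~9.5 million CSV events
# per day adding up to circa 16 GB of raw text in HDFS.
daily_records = 9_500_000
daily_bytes = 16 * 1024**3

avg_record_size = daily_bytes / daily_records
print(f"average record size: {avg_record_size:.0f} bytes")  # roughly 1.8 KB per event
```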

2 HUGE SIZE OF DATA

SOLUTION: APACHE PARQUET

16.5 GB of daily event information stored as CSV text in HDFS

4.3 GB of daily event information stored as Parquet files in HDFS

STORAGE IMPROVEMENT: circa 70%

Time to count daily CSV events -> 6.2 minutes

Time to count daily Parquet events -> 1 minute

READ PROCESS IMPROVEMENT: circa 80%
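Both "circa" figures check out against the raw numbers on the slides; the arithmetic is simply:

```python
# Storage: CSV vs Parquet footprint for the same daily events (slide figures)
csv_gb, parquet_gb = 16.5, 4.3
storage_saving = 1 - parquet_gb / csv_gb
print(f"storage improvement: {storage_saving:.0%}")  # 74%, i.e. circa 70%

# Read path: time to count one day of events (slide figures)
csv_minutes, parquet_minutes = 6.2, 1.0
read_saving = 1 - parquet_minutes / csv_minutes
print(f"read improvement: {read_saving:.0%}")        # 84%, i.e. circa 80%
```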

3 DISTRIBUTED PROCESSING

SOLUTION: APACHE SPARK

3 DISTRIBUTED PROCESSING - REQUIREMENTS

- Complex algorithms with the minimum amount of resources
- Reduced processing time, so that results are available while the data is still useful
- Sharing the cluster with legacy processes
- Using the legacy output processes without any changes
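Spark spreads exactly this kind of work over a cluster. As a minimal stand-in (plain Python, no Spark dependency, names are my own), the partition-and-aggregate idea behind something like a distributed count looks like:

```python
from itertools import islice

def partitions(records, n_partitions):
    """Split records into roughly equal chunks, the way Spark
    splits a dataset into partitions across executors."""
    records = list(records)
    size = max(1, -(-len(records) // n_partitions))  # ceiling division
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk

def distributed_count(records, n_partitions=4):
    """Each 'executor' counts its own partition; the 'driver'
    sums the partial results."""
    return sum(len(part) for part in partitions(records, n_partitions))

events = [f"event-{i}" for i in range(1000)]
print(distributed_count(events))  # 1000
```

The point is that only the small per-partition counts travel back to the driver, never the records themselves.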

4 HARD TO READ

SOLUTION: SCALA + APACHE SPARK

4 HARD TO READ

- Reduced development time
- LOCs dramatically reduced
- Number of classes dramatically reduced
- Improved test and application readability
- DSLs make our lives easier
- Spark makes MapReduce jobs even simpler

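The deck's Scala examples are not reproduced here, but the "dramatically fewer lines" effect is the same one you get from any functional-style pipeline versus its imperative equivalent. An illustrative word count (plain Python, not Stratio's code):

```python
from collections import Counter

lines = ["spark makes big data simple", "big data big results"]

# Imperative, classic-MapReduce-era style: explicit loops and mutable state
counts = {}
for line in lines:
    for word in line.split():
        if word not in counts:
            counts[word] = 0
        counts[word] += 1

# Functional one-liner in the spirit of Spark's flatMap + reduceByKey
counts_fn = Counter(word for line in lines for word in line.split())

print(counts_fn["big"])  # 3
assert counts == dict(counts_fn)
```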

5 PATTERN RECOGNITION

SOLUTION: APACHE SPARK MLLIB

- Millions of records processed in order to obtain mathematical models
- Complex mathematical algorithms applied to obtain accurate weekly behavior models

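The slides do not name the algorithms used, but clustering is one representative technique that Spark MLlib ships (e.g. its KMeans) for grouping similar usage profiles. A minimal pure-Python sketch of the k-means idea on toy one-dimensional "weekly usage" values:

```python
import random

def kmeans(points, k, iterations=20, seed=42):
    """Minimal 1-D k-means: assign each point to its nearest
    centroid, then move each centroid to its cluster's mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Toy data: light users around 2, heavy users around 10
usage = [1.8, 2.0, 2.2, 2.1, 9.8, 10.0, 10.3, 9.9]
print(kmeans(usage, k=2))  # two centroids, one near 2 and one near 10
```

MLlib does the same assign-and-update loop, but distributed over partitions as in the count sketch earlier.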

THANK YOU

UNITED STATES

Tel: (+1) 408 5998830

EUROPE

Tel: (+34) 91 828 64 73

contact@stratio.com

www.stratio.com
