Expanding the Horizons of Cloud Computing Beyond the Data ...€¦ · Expanding the Horizons of...

Preview:

Citation preview

Expanding the Horizons of Cloud Computing Beyond the Data Center

Jon Weissman and Abhishek Chandra

(+ great UMn students)

Department of CS&E

University of Minnesota

Key Trends

• Client technology

– devices: smart phones, ipods, tablets, sensors

• Big data

– located at the network edge

• Privacy/trust

– local clouds

• Multiple DCs/clouds

– global services

Minnesota Cloud Research

• Eye towards cloud evolution

• Projects

– Nebula

– DMapReduce

– Mobile cloud

– Proxy cloud

– Active cloud storage

– Virtualization

Project Clusters

• Power at the edge

– big data, locality

• Cloud-2-Cloud

– multiple clouds/DCs

• Mobile user

– user-centric cloud

Nebula

Mobile cloud

DMapReduce

Big Data Trend

• Big data is distributed

– earth science: weather data, seismic data

– life science: GenBank, NCI BLAST, PubMed

– health science: GoogleEarth + CDC pandemic data

– web 2.0: user multimedia blogs

– “everyone is a sensor”

• Cost in moving data to the cloud

Big Data Trend: Nebulas

• Process data “close by”

– fully and/or on-route to the central cloud

– cost: save time and money

– privacy (think: patient records)

• Close by

– network distance

– trusted peers

Example: Dispersed-Data-Intensive Services

Data is geographically distributed Costly, inefficient to move to central location

Example Instance: Blog Analysis

blog1 blog2

blog3

Nebula: A New Cloud Model

• Make the cloud more “distributed”

– exploit the rich collection of edge computers

– volunteers (P2P, @home), commercial (CDNs)

– enormous computing potential, network diversity

– lower latency: “on demand”, native client sandboxing

Nebula Central

Example: Blog Analysis

blog1 blog2

blog3

Blog Results

0

20000

40000

60000

80000

100000

120000

140000

40 80 120 160 240 320

Tim

e t

ake

n (

sec)

Amazon emulator Nebula testbed

# Blogs

Current Status

• Prototype running on PlanetLab

• Distributed Data-Store Service

• Network Dashboard Service

• Node software stack

– native client

Nebula Going Forward

• Organize Nebulas

– around trusted peers

– social groups

– communities of interest

– local resources

• Nebula + commercial cloud

“use the edge opportunistically”

Big Data Trend: DMapReduce

14

Input

Data

Data Push

Output

Data

Map Reduce

Big Data Trend: Distribution • Big data is distributed

– earth science: weather data, seismic data

– life science: GenBank, NCI BLAST, PubMed

– health science: GoogleEarth + CDC pandemic data

– web 2.0: user multimedia blogs

Wide-Area MapReduce

DFS push

• Data in different data-centers

• Run MapReduce across them

• Data-flow spanning wide-area networks

17

Option 1: Local MapReduce

Data

Source

(US)

Data

Source

(EU)

Data Center (US) Data Center (EU)

MapReduce

Job

Final Result

Data Push

(Fast)

Data Push

(Slow)

18

Option 2: Global MapReduce

Data

Source

(US)

Data

Source

(EU)

Data Center (US) Data Center (EU)

MapReduce Job Final

Result

Data Push

(Fast)

Data Push

(Slow)

Data Push

(Fast)

Data Push

(Slow)

19

Option 3: Distributed MapReduce

Data

Source

(US)

Data

Source

(EU)

Data Center (US) Data Center (EU)

MapReduce

Job

Final Result

Data Push

(Fast)

MapReduce

Job

Combine Results

Data Push

(Fast)

PlanetLab: WordCount (text data)

• Distributed MR works best in presence of data aggregation

20

0

200

400

600

800

1000

1200

1400

1600

1800

Push US Push EU Map Reduce ResultCombine

Total

Tim

e in s

eco

nds

Local MR

Global MR

Distributed MR

PlanetLab: WordCount (random data)

• Local MR works best in presence of data ballooning

21

0

100

200

300

400

500

600

700

800

900

Push US Push EU Map Reduce ResultCombine

Total

Tim

e in s

eco

nds

Local MR

Global MR

Distributed MR

EC-2 Results

22

0100200300400500600700800900

1000

Push US Push EU Map Reduce ResultCombine

Total

Tim

e in s

eco

nds Local MR

Distributed MR

0

100

200

300

400

500

600

700

800

900

Push US Push EU Map Reduce ResultCombine

Total

Tim

e in S

eco

nds Local MR

Distributed MR

0

50

100

150

200

250

300

Push US Push EU Map Reduce ResultCombine

Total

Tim

e in S

eco

nds

Local MR

Distributed MR

WordCount (Text) WordCount (Random)

Sort

Intelligent Data Placement

• HDFS push

– local node, same rack, random rack

Data placement

Scheduling

Resource Topology Application

Characteristics

/DCi/rackA/nodeX Data expansion factors input->intermediate, a

Intermediate->output, b

select LMR, DMR, GMR

Generalize beyond HDFS/MapReduce!

Mobility Trend: Mobile Cloud

• Mobile users/applications: phones, tablets

– resource limited: power, CPU, memory

– applications are becoming sophisticated

• Improve mobile user experience

– performance, reliability, fidelity

– tap into the cloud dynamically based on current resource state, preferences, interests, privacy

Cloud Mobile Opportunity

• Dynamic outsourcing

– move computation, data to the cloud dynamically

• User context

– exploit user behavior to pre-fetch, pre-compute, cache

• Multi-user sharing

– discover implicit cloud sharing based on interests, social ties

Outsourcing

• Partitioning across (Mobile <->Server <-> Cloud)

– local data capture + cloud processing

– images/video, speech, digital design, aug. reality

Server Server Server Server Server

Proxy Code repository

….

….

Application Profiler

Outsourcing Client

Outsourcing Controller

cloud end (EC-2) mobile end (Android)

Experimental Results -Image Sharpening

• Response time

– both WIFI & 3G

– up to 27× speedup

– 219K, WIFI

• Power consumption

– save up to 9× times

– 219K, WIFI

27

Avg. Time

Avg. Power

Cloud Speculation

• Dynamic user profile

– contains activities in time and space

– “read nytimes.com bet. 9-9:30am on the train; likes technology articles”

– associate with context

• Patterns are relationships between activities

– repetitive, sequential, concurrent, time-bounded

– “user always does X and then does Y”

• Exploiting patterns: pre-fetching, pre-computing, caching in the cloud

Smart cloud

Back-end speculation and optimization

Summary

• Cloud Evolution

– mobile users, big data, privacy/trust, global

• Our vision of the Cloud

– locality of users, data, other clouds/data centers, user-centric behavior