Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
Expanding the Horizons of Cloud Computing Beyond the Data Center
Jon Weissman and Abhishek Chandra
(+ great UMn students)
Department of CS&E
University of Minnesota
Key Trends
• Client technology
– devices: smart phones, ipods, tablets, sensors
• Big data
– located at the network edge
• Privacy/trust
– local clouds
• Multiple DCs/clouds
– global services
Minnesota Cloud Research
• Eye towards cloud evolution
• Projects
– Nebula
– DMapReduce
– Mobile cloud
– Proxy cloud
– Active cloud storage
– Virtualization
Project Clusters
• Power at the edge
– big data, locality
• Cloud-2-Cloud
– multiple clouds/DCs
• Mobile user
– user-centric cloud
Nebula
Mobile cloud
DMapReduce
Big Data Trend
• Big data is distributed
– earth science: weather data, seismic data
– life science: GenBank, NCI BLAST, PubMed
– health science: GoogleEarth + CDC pandemic data
– web 2.0: user multimedia blogs
– “everyone is a sensor”
• Cost in moving data to the cloud
Big Data Trend: Nebulas
• Process data “close by”
– fully and/or on-route to the central cloud
– cost: save time and money
– privacy (think: patient records)
• Close by
– network distance
– trusted peers
Example: Dispersed-Data-Intensive Services
Data is geographically distributed Costly, inefficient to move to central location
Example Instance: Blog Analysis
blog1 blog2
blog3
Nebula: A New Cloud Model
• Make the cloud more “distributed”
– exploit the rich collection of edge computers
– volunteers (P2P, @home), commercial (CDNs)
– enormous computing potential, network diversity
– lower latency: “on demand”, native client sandboxing
Nebula Central
Example: Blog Analysis
blog1 blog2
blog3
Blog Results
0
20000
40000
60000
80000
100000
120000
140000
40 80 120 160 240 320
Tim
e t
ake
n (
sec)
Amazon emulator Nebula testbed
# Blogs
Current Status
• Prototype running on PlanetLab
• Distributed Data-Store Service
• Network Dashboard Service
• Node software stack
– native client
Nebula Going Forward
• Organize Nebulas
– around trusted peers
– social groups
– communities of interest
– local resources
• Nebula + commercial cloud
“use the edge opportunistically”
Big Data Trend: DMapReduce
14
Input
Data
Data Push
Output
Data
Map Reduce
Big Data Trend: Distribution • Big data is distributed
– earth science: weather data, seismic data
– life science: GenBank, NCI BLAST, PubMed
– health science: GoogleEarth + CDC pandemic data
– web 2.0: user multimedia blogs
Wide-Area MapReduce
DFS push
• Data in different data-centers
• Run MapReduce across them
• Data-flow spanning wide-area networks
17
Option 1: Local MapReduce
Data
Source
(US)
Data
Source
(EU)
Data Center (US) Data Center (EU)
MapReduce
Job
Final Result
Data Push
(Fast)
Data Push
(Slow)
18
Option 2: Global MapReduce
Data
Source
(US)
Data
Source
(EU)
Data Center (US) Data Center (EU)
MapReduce Job Final
Result
Data Push
(Fast)
Data Push
(Slow)
Data Push
(Fast)
Data Push
(Slow)
19
Option 3: Distributed MapReduce
Data
Source
(US)
Data
Source
(EU)
Data Center (US) Data Center (EU)
MapReduce
Job
Final Result
Data Push
(Fast)
MapReduce
Job
Combine Results
Data Push
(Fast)
PlanetLab: WordCount (text data)
• Distributed MR works best in presence of data aggregation
20
0
200
400
600
800
1000
1200
1400
1600
1800
Push US Push EU Map Reduce ResultCombine
Total
Tim
e in s
eco
nds
Local MR
Global MR
Distributed MR
PlanetLab: WordCount (random data)
• Local MR works best in presence of data ballooning
21
0
100
200
300
400
500
600
700
800
900
Push US Push EU Map Reduce ResultCombine
Total
Tim
e in s
eco
nds
Local MR
Global MR
Distributed MR
EC-2 Results
22
0100200300400500600700800900
1000
Push US Push EU Map Reduce ResultCombine
Total
Tim
e in s
eco
nds Local MR
Distributed MR
0
100
200
300
400
500
600
700
800
900
Push US Push EU Map Reduce ResultCombine
Total
Tim
e in S
eco
nds Local MR
Distributed MR
0
50
100
150
200
250
300
Push US Push EU Map Reduce ResultCombine
Total
Tim
e in S
eco
nds
Local MR
Distributed MR
WordCount (Text) WordCount (Random)
Sort
Intelligent Data Placement
• HDFS push
– local node, same rack, random rack
Data placement
Scheduling
Resource Topology Application
Characteristics
/DCi/rackA/nodeX Data expansion factors input->intermediate, a
Intermediate->output, b
select LMR, DMR, GMR
Generalize beyond HDFS/MapReduce!
Mobility Trend: Mobile Cloud
• Mobile users/applications: phones, tablets
– resource limited: power, CPU, memory
– applications are becoming sophisticated
• Improve mobile user experience
– performance, reliability, fidelity
– tap into the cloud dynamically based on current resource state, preferences, interests, privacy
Cloud Mobile Opportunity
• Dynamic outsourcing
– move computation, data to the cloud dynamically
• User context
– exploit user behavior to pre-fetch, pre-compute, cache
• Multi-user sharing
– discover implicit cloud sharing based on interests, social ties
Outsourcing
• Partitioning across (Mobile <->Server <-> Cloud)
– local data capture + cloud processing
– images/video, speech, digital design, aug. reality
Server Server Server Server Server
Proxy Code repository
….
….
Application Profiler
Outsourcing Client
Outsourcing Controller
cloud end (EC-2) mobile end (Android)
Experimental Results -Image Sharpening
• Response time
– both WIFI & 3G
– up to 27× speedup
– 219K, WIFI
• Power consumption
– save up to 9× times
– 219K, WIFI
27
Avg. Time
Avg. Power
Cloud Speculation
• Dynamic user profile
– contains activities in time and space
– “read nytimes.com bet. 9-9:30am on the train; likes technology articles”
– associate with context
• Patterns are relationships between activities
– repetitive, sequential, concurrent, time-bounded
– “user always does X and then does Y”
• Exploiting patterns: pre-fetching, pre-computing, caching in the cloud
Smart cloud
Back-end speculation and optimization
Summary
• Cloud Evolution
– mobile users, big data, privacy/trust, global
• Our vision of the Cloud
– locality of users, data, other clouds/data centers, user-centric behavior