Cloud Computing, 9/10/19
WiSe Lab @ WMU Ajay Gupta www.cs.wmich.edu/wise 1
CS5950 / CS6030: Cloud Computing, Big Data
Ajay Gupta, B239, CEAS
Computer Science Department, Western Michigan University

Acknowledgements
• I have liberally borrowed these slides and material from a number of sources, including:
  – Web, AWS Educate
  – MIT, Harvard, UMD, UPenn, UCSD, UW, Clarkson, ...
  – Amazon, Google, IBM, Apache, ManjraSoft, CloudBook, ...
• Thanks to the original authors, including Ives, Dyer, Lin, Dean, Buyya, Ghemawat, Fanelli, Bisciglia, Kimball, Michels-Slettvet, ...
• If I have missed any, it is purely unintentional. My sincere appreciation to those authors and their creative minds.

What is Cloud Computing…

Plan for today
• AWS starter – EC2, VPC, Security Groups, Storage
• Computing at scale
  – The need for scalability; scale of current services
  – Scaling up: From PCs to data centers
  – Problems with 'classical' scaling techniques
• Utility computing and cloud computing
  – What are utility computing and cloud computing?
  – What kinds of clouds exist today?
  – What kinds of applications run on the cloud?
  – Virtualization: How clouds work 'under the hood'
  – Some cloud computing challenges

An Example AWS Compute & Network Architecture

How many users and objects?
• Flickr has >6 billion photos
• Facebook has 1.15 billion active users
• Google is serving >1.2 billion queries/day on more than 27 billion items
• >2 billion videos/day watched on YouTube

How much data?
• Modern applications use massive data:
  – Rendering the movie 'Avatar' required >1 petabyte of storage
  – eBay has >6.5 petabytes of user data
  – CERN's LHC will produce about 15 petabytes of data per year
  – Since 2008, Google has processed 20 petabytes per day
  – The German Climate Computing Center is dimensioned for 60 petabytes of climate data
  – Google is now designing for 1 exabyte of storage
  – The NSA Utah Data Center is said to have 5 zettabytes (!)
• How much is a zettabyte?
  – 1,000,000,000,000,000,000,000 bytes
  – A stack of 1 TB hard disks that is 25,400 km high
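
The 25,400 km figure can be checked with a few lines of arithmetic. This is only a sketch: the per-disk thickness of 25.4 mm (one inch) is an assumption, picked because it reproduces the number on the slide; the slides do not state it.

```python
# Sanity check of the "stack of 1 TB disks" figure, using decimal units.
ZETTABYTE = 10**21          # bytes, as written out on the slide
TERABYTE = 10**12           # bytes held by one 1 TB disk
DISK_THICKNESS_MM = 25.4    # ASSUMPTION: one inch per disk; not from the slides

disks = ZETTABYTE // TERABYTE                      # number of 1 TB disks needed
stack_km = disks * DISK_THICKNESS_MM / 1_000_000   # millimetres -> kilometres

print(f"{disks:,} disks, stack height about {stack_km:,.0f} km")
```

Under that assumption, one zettabyte is a billion 1 TB disks, which is where the height comes from.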

How much computation?
• No single computer can process that much data
  – Need many computers!
• How many computers do modern services need?
  – Facebook is thought to have more than 60,000 servers
  – 1&1 Internet has over 70,000 servers
  – Akamai has 95,000 servers in 71 countries
  – Intel has ~100,000 servers in 97 data centers
  – Microsoft reportedly had at least 200,000 servers in 2008
  – Google is thought to have more than 1 million servers and is planning for 10 million (according to Jeff Dean)

Why should I care?
• Suppose you want to build the next Google
• How do you...
  – ... download and store billions of web pages and images?
  – ... quickly find the pages that contain a given set of terms?
  – ... find the pages that are most relevant to a given search?
  – ... answer 1.2 billion queries of this type every day?
• Suppose you want to build the next Facebook
• How do you...
  – ... store the profiles of over 500 million users?
  – ... avoid losing any of them?
  – ... find out which users might want to be friends?
• Stay tuned!
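
To get a feel for what "1.2 billion queries every day" means per second, here is a rough back-of-the-envelope sketch. It computes only the average rate; the real peak would be higher, and the slides do not say by how much.

```python
# Average query rate implied by the daily figure on the slide.
QUERIES_PER_DAY = 1_200_000_000
SECONDS_PER_DAY = 24 * 60 * 60

avg_qps = QUERIES_PER_DAY / SECONDS_PER_DAY

print(f"average load: about {avg_qps:,.0f} queries per second")
```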

Scaling up
• What if one computer is not enough?
  – Buy a bigger (server-class) computer
• What if the biggest computer is not enough?
  – Buy many computers
[Figure: PC, server, cluster]

Clusters
• Characteristics of a cluster:
  – Many similar machines, close interconnection (same room?)
  – Often special, standardized hardware (racks, blades)
  – Usually owned and used by a single organization
[Figure: a rack containing many nodes/blades (often identical), a network switch (connects nodes with each other and with other racks), and storage device(s)]

Power and cooling
• Clusters need lots of power
  – Example: 140 watts per server
  – Rack with 32 servers: 4.5 kW (needs special power supply!)
  – Most of this power is converted into heat
• Large clusters need massive cooling
  – 4.5 kW is about 3 space heaters
  – And that's just one rack!
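
The rack numbers above follow from simple multiplication. A minimal sketch, assuming a space heater draws about 1,500 W (a typical rating for a portable heater, not a value given on the slide):

```python
# Power drawn by one rack, and the equivalent number of space heaters.
WATTS_PER_SERVER = 140      # from the slide
SERVERS_PER_RACK = 32       # from the slide
SPACE_HEATER_WATTS = 1500   # ASSUMPTION: typical portable heater; not from the slides

rack_watts = WATTS_PER_SERVER * SERVERS_PER_RACK
heaters = rack_watts / SPACE_HEATER_WATTS

print(f"one rack: {rack_watts / 1000:.1f} kW, roughly {heaters:.0f} space heaters")
```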

Scaling up
• What if your cluster is too big (hot, power hungry) to fit into your office building?
  – Build a separate building for the cluster
  – Building can have lots of cooling and power
  – Result: Data center
[Figure: PC, server, cluster, data center]

What does a data center look like?
• A warehouse-sized computer
  – A single data center can easily contain 10,000 racks with 100 cores in each rack (1,000,000 cores total)
[Figure: Google data center in The Dalles, Oregon, showing the data centers (each the size of a football field) and the cooling plant]

What's in a data center?
• Hundreds or thousands of racks
• Massive networking
• Emergency power supplies
• Massive cooling
[Photos of each item; source: 1&1]

Energy matters!
• Data centers consume a lot of energy
  – Makes sense to build them near sources of cheap electricity
  – Example: Price per kWh is 3.6 ct in Idaho (near hydroelectric power), 10 ct in California (long-distance transmission), 18 ct in Hawaii (must ship fuel)
  – Most of this is converted into heat → Cooling is a big issue!

Company      Servers   Electricity       Cost
eBay         16K       ~0.6*10^5 MWh     ~$3.7M/yr
Akamai       40K       ~1.7*10^5 MWh     ~$10M/yr
Rackspace    50K       ~2*10^5 MWh       ~$12M/yr
Microsoft    >200K     >6*10^5 MWh       >$36M/yr
Google       >500K     >6.3*10^5 MWh     >$38M/yr
USA (2006)   10.9M     610*10^5 MWh      $4.5B/yr
Source: Qureshi et al., SIGCOMM 2009
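
The electricity table can be cross-checked by dividing cost by energy, which gives the price per kWh each row implies. This is a sketch only: the "~" and ">" entries are treated as exact point values, which is a simplification.

```python
# Implied electricity price per row: (energy in MWh/yr, cost in USD/yr).
# ASSUMPTION: approximate and lower-bound entries are used as point values.
rows = {
    "eBay": (0.6e5, 3.7e6),
    "Akamai": (1.7e5, 10e6),
    "Rackspace": (2e5, 12e6),
    "Microsoft": (6e5, 36e6),
    "Google": (6.3e5, 38e6),
    "USA (2006)": (610e5, 4.5e9),
}

# USD per MWh divided by 10 gives cents per kWh (1 MWh = 1000 kWh, 1 USD = 100 ct).
cents_per_kwh = {name: cost / mwh / 10 for name, (mwh, cost) in rows.items()}

for name, price in cents_per_kwh.items():
    print(f"{name:<11} about {price:.1f} ct/kWh")
```

Every row works out to roughly 6 to 7 ct/kWh, which sits between the Idaho and California prices quoted above.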

Scaling up
• What if even a data center is not big enough?
  – Build additional data centers
  – Where? How many?
[Figure: PC, server, cluster, data center, network of data centers]

Global distribution
[Figure: Google data center locations (inferred)]
• Data centers are often globally distributed
  – Example above: Google data center locations (inferred)
• Why?
  – Need to be close to users (physics!)
  – Cheaper resources
  – Protection against failures

Trend: Modular data center
• Need more capacity? Just deploy another container!

Problem #1: Difficult to dimension
• Problem: Load can vary considerably
  – Peak load can exceed average load by a factor of 2x-10x [Why?]
  – But: Few users deliberately provision for less than the peak
  – Result: Server utilization in existing data centers is ~5%-20%!!
  – Dilemma: Waste resources or lose customers!
[Figure, two panels: "Provisioning for the peak load" (peak is 2x-10x the average) and "Provisioning below the peak" (jobs cannot be completed; dissatisfied customers leave)]
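
The link between the peak-to-average ratio and utilization can be made concrete. The model below is a deliberately simple sketch: capacity is sized to the peak plus an optional safety margin (`headroom`, a hypothetical parameter), and utilization is average load divided by capacity.

```python
def utilization(peak_to_average: float, headroom: float = 0.0) -> float:
    """Average utilization when capacity = peak load * (1 + headroom).

    ASSUMPTION: a simplified model for illustration; real data centers
    have further sources of idle capacity.
    """
    return 1.0 / (peak_to_average * (1.0 + headroom))


for ratio in (2, 5, 10):
    print(f"peak = {ratio}x average -> utilization {utilization(ratio):.0%}")

# With a 10x peak and 100% extra headroom the model lands at 5%.
print(f"10x peak, 2x headroom -> {utilization(10, headroom=1.0):.0%}")
```

Provisioning exactly for a 2x-10x peak gives 50% down to 10%, so the ~5%-20% quoted on the slide suggests that operators keep additional slack on top.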

Problem #2: Expensive
• Need to invest many $$$ in hardware
  – Even a small cluster can easily cost $100,000
  – Microsoft recently invested $499 million in a single data center
• Need expertise
  – Planning and setting up a large cluster is highly nontrivial
  – Cluster may require special software, etc.
• Need maintenance
  – Someone needs to replace faulty hardware, install software upgrades, maintain user accounts, ...

Problem #3: Difficult to scale
• Scaling up is difficult
  – Need to order new machines, install them, integrate with existing cluster; this can take weeks
  – Large scaling factors may require major redesign, e.g., new storage system, new interconnect, new building (!)
• Scaling down is difficult
  – What to do with superfluous hardware?
  – Server idle power is about 60% of peak → Energy is consumed even when no work is being done
  – Many fixed costs, such as construction
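
The effect of the 60% idle power figure is easier to see with numbers. The sketch below assumes power grows linearly from idle to peak as utilization rises; that linear shape is an assumption made here for illustration, not something the slides state.

```python
IDLE_FRACTION = 0.6  # idle power as a fraction of peak power, from the slide


def relative_power(utilization: float) -> float:
    """Power draw relative to peak. ASSUMPTION: linear between idle and peak."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return IDLE_FRACTION + (1.0 - IDLE_FRACTION) * utilization


def energy_per_unit_work(utilization: float) -> float:
    """Energy per unit of useful work, relative to a fully loaded server."""
    return relative_power(utilization) / utilization


for u in (0.1, 0.2, 0.5, 1.0):
    print(f"utilization {u:.0%}: power {relative_power(u):.0%} of peak, "
          f"{energy_per_unit_work(u):.1f}x energy per unit of work")
```

Under this model, a server at 10% utilization still draws 64% of peak power, so each unit of work costs several times the energy it would on a busy machine.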

Recap: Computing at scale
• Modern applications require huge amounts of processing and data
  – Measured in petabytes, millions of users, billions of objects
  – Need special hardware, algorithms, tools to work at this scale
• Clusters and data centers can provide the resources we need
  – Main difference: Scale (room-sized vs. building-sized)
  – Special hardware; power and cooling are big concerns
• Clusters and data centers are not perfect
  – Difficult to dimension; expensive; difficult to scale

The power plant analogy
• It used to be that everyone had their own power source
  – Challenges are similar to the cluster: Needs large up-front investment, expertise to operate, difficult to scale up/down...
[Figures: steam engine at Stott Park Bobbin Mill; waterwheel at the Neuhausen ob Eck Open-Air Museum]

Scaling the power plant
• Then people started to build large, centralized power plants with very large capacity...

Metered usage model
• Power plants are connected to customers by a network
• Usage is metered, and everyone (basically) pays only for what they actually use
[Figure: power source, network, metering device, customer]

Why is this a good thing?
• Economies of scale
  – Electricity: Cheaper to run one big power plant than many small ones
  – Computing: Cheaper to run one big data center than many small ones
• Statistical multiplexing
  – Electricity: High utilization!
  – Computing: High utilization!
• No up-front commitment
  – Electricity: No investment in generator; pay-as-you-go model
  – Computing: No investment in data center; pay-as-you-go model
• Scalability
  – Electricity: Thousands of kilowatts available on demand; add more within seconds
  – Computing: Thousands of computers available on demand; add more within seconds
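
Statistical multiplexing is the least self-explanatory item in the list above, so here is a small illustration. The three load series are made-up numbers chosen so that the customers peak at different times; nothing here comes from measured data.

```python
# HYPOTHETICAL load per time step for three customers (arbitrary units).
loads = {
    "A": [8, 2, 1, 1],
    "B": [1, 9, 2, 1],
    "C": [2, 1, 1, 7],
}

# Dedicated hardware: each customer provisions for their own peak.
sum_of_peaks = sum(max(series) for series in loads.values())

# Shared hardware: the provider provisions for the peak of the combined load.
combined = [sum(step) for step in zip(*loads.values())]
peak_of_sum = max(combined)

total_work = sum(combined)
steps = len(combined)
dedicated_util = total_work / (sum_of_peaks * steps)
shared_util = total_work / (peak_of_sum * steps)

print(f"capacity needed: dedicated {sum_of_peaks}, shared {peak_of_sum}")
print(f"utilization: dedicated {dedicated_util:.1%}, shared {shared_util:.1%}")
```

Because the peaks do not coincide, the shared pool needs half the capacity and runs at twice the utilization in this example. If all customers peaked at the same moment, the benefit would disappear.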

What is cloud computing?
[Comic strip; source: http://www.dilbert.com/fast/2013-06-29/]

What is cloud computing?
• "The interesting thing about Cloud Computing is that we've redefined Cloud Computing to include everything that we already do.... I don't understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads."
  – Larry Ellison, quoted in the Wall Street Journal, September 26, 2008
• "A lot of people are jumping on the [cloud] bandwagon, but I have not heard two people say the same thing about it. There are multiple definitions out there of 'the cloud'."
  – Andy Isherwood, quoted in ZDnet News, December 11, 2008

So what is it, really?
• According to NIST: "Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."
• Essential characteristics:
  – On-demand self service
  – Broad network access
  – Resource pooling
  – Rapid elasticity
  – Measured service

Other terms you may have heard
• Utility computing
  – The service being sold by a cloud
  – Focuses on the business model (pay-as-you-go), similar to classical utility companies
• The Web
  – The Internet's information sharing model
  – Some web services run on clouds, but not all
• The Internet
  – A network of networks
  – Used by the web; connects (most) clouds to their customers

Everything as a Service
• What kind of service does the cloud provide?
  – Does it offer an entire application, or just resources?
  – If resources, what kind / level of abstraction?
• Three types commonly distinguished:
  – Software as a service (SaaS)
    • Analogy: Restaurant. Prepares and serves the entire meal, does the dishes, ...
  – Platform as a service (PaaS)
    • Analogy: Take-out food. Prepares the meal, but does not serve it.
  – Infrastructure as a service (IaaS)
    • Analogy: Grocery store. Provides raw ingredients.
  – Other XaaS types have been defined, but are less common
    • Desktop, Backend, Communication, Network, Monitoring, ...

Software as a Service (SaaS)
• Cloud provides an entire application
  – Word processor, spreadsheet, CRM software, calendar...
  – Customer pays cloud provider
  – Example: Google Apps, Salesforce.com
[Figure: Hardware / Middleware / Application stack, with cloud provider and user]

Platform as a Service (PaaS)
• Cloud provides middleware/infrastructure
  – For example, Microsoft Common Language Runtime (CLR)
  – Customer pays SaaS provider for the service; SaaS provider pays the cloud for the infrastructure
  – Example: Windows Azure, Google App Engine
[Figure: Hardware / Middleware / Application stack, with cloud provider, SaaS provider, and user]

Infrastructure as a Service (IaaS)
• Cloud provides raw computing resources
  – Virtual machine, blade server, hard disk, ...
  – Customer pays SaaS provider for the service; SaaS provider pays the cloud for the resources
  – Examples: Amazon Web Services, Rackspace Cloud, GoGrid
[Figure: Hardware / Middleware / Application stack, with cloud provider, SaaS provider, and user]
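
The three service models differ in how much of the Hardware / Middleware / Application stack the cloud provider supplies. The mapping below is one reading of the three slides above, written as a small lookup table; the names `LAYERS`, `PROVIDED_BY_CLOUD`, and `customer_responsibility` are illustrative only.

```python
# The three layers shown in the slide figures, bottom to top.
LAYERS = ("Hardware", "Middleware", "Application")

# Which layers the cloud provider supplies under each model
# (an interpretation of the slides, not an official taxonomy).
PROVIDED_BY_CLOUD = {
    "IaaS": ("Hardware",),
    "PaaS": ("Hardware", "Middleware"),
    "SaaS": ("Hardware", "Middleware", "Application"),
}


def customer_responsibility(model: str) -> tuple:
    """Layers that whoever builds on the cloud must still supply themselves."""
    provided = PROVIDED_BY_CLOUD[model]
    return tuple(layer for layer in LAYERS if layer not in provided)


for model in ("IaaS", "PaaS", "SaaS"):
    rest = customer_responsibility(model) or ("nothing",)
    print(f"{model}: cloud provides {', '.join(PROVIDED_BY_CLOUD[model])}; "
          f"left to the customer: {', '.join(rest)}")
```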

Private/hybrid/community clouds
• Who can become a customer of the cloud?
  – Public cloud: Commercial service; open to (almost) anyone. Example: Amazon AWS, Microsoft Azure, Google App Engine
  – Community cloud: Shared by several similar organizations. Example: Google's "Gov Cloud"
  – Private cloud: Shared within a single organization. Example: Internal data center of a large company.
[Figure: public, community (Company A, Company B), and private clouds, annotated "Focus of this class" and "Is this a 'real' cloud?"]

Examples of cloud applications
• Application hosting
• Backup and storage
• Content delivery
• E-commerce
• High-performance computing
• Media hosting
• On-demand workforce
• Search engines
• Web hosting

Case study: Animoto
• Animoto: Lets users create videos from their own photos/music
  – Auto-edits photos and aligns them with the music, so it "looks good"
• Built using Amazon EC2+S3+SQS
• Released a Facebook app in mid-April 2008
  – More than 750,000 people signed up within 3 days
  – EC2 usage went from 50 machines to 3,500 (x70 scalability!)
Source: Jeff Bezos' talk at Stanford on 4/19/08, http://animoto.com/blog/company/amazon-com-ceo-jeff-bezos-on-animoto/

Case study: The Washington Post
• March 19, 2008: Hillary Clinton's official White House schedule released to the public
  – 17,481 pages of non-searchable, low-quality PDF
  – Very interesting to journalists, but would have required hundreds of man-hours to evaluate
  – Peter Harkins, Senior Engineer at The Washington Post: Can we make that data available more quickly, ideally within the same news cycle?
  – Tested various Optical Character Recognition (OCR) programs; estimated required speed
  – Launched 200 EC2 instances; project was completed within nine hours (!) using 1,407 hours of VM time ($144.62)
  – Results available on the web only 26 hours after the release
Source: http://aws.amazon.com/solutions/case-studies/washington-post/
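
The numbers in this case study are worth a quick calculation. The sketch below uses only the figures quoted above; the single-machine estimate assumes the work would scale perfectly linearly, which is an idealization.

```python
# Figures quoted in the case study.
VM_HOURS = 1407
TOTAL_COST_USD = 144.62
INSTANCES = 200

implied_rate = TOTAL_COST_USD / VM_HOURS        # USD per VM hour
avg_hours_per_instance = VM_HOURS / INSTANCES   # average busy time per instance

# ASSUMPTION: perfectly linear scaling, so one VM would need all 1,407 hours.
single_machine_days = VM_HOURS / 24

print(f"implied price: about ${implied_rate:.3f} per VM hour")
print(f"average of {avg_hours_per_instance:.1f} hours per instance")
print(f"one VM alone: about {single_machine_days:.0f} days")
```

The point of the example is the trade: the same total VM time costs the same whether it is spent on one machine over two months or on 200 machines over nine hours.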

Other examples
• DreamWorks is using the Cerelink cloud to render animation movies
  – Cloud was already used to render parts of Shrek Forever After and How to Train Your Dragon
• CERN is working on a "science cloud" to process experimental data
• Virgin Atlantic is hosting their new travel portal on Amazon AWS

Recap: Utility/cloud computing
• Why is cloud computing attractive?
  – Analogy to 'classical' utilities (electricity, water, ...)
  – No up-front investment (pay-as-you-go model)
  – Low price due to economies of scale
  – Elasticity: can quickly scale up/down as demand varies
• Different types of clouds
  – SaaS, PaaS, IaaS; public/private/community clouds
• What runs on the cloud?
  – Many potential applications: Application hosting, backup/storage, scientific computing, content delivery, ...
  – Not yet suitable for certain applications (sensitive data, compliance requirements)

Is the cloud good for everything?
• No.
• Sometimes it is problematic, e.g., because of auditability requirements
• Example: Processing medical records
  – HIPAA (Health Insurance Portability and Accountability Act) privacy and security rule
• Example: Processing financial information
  – Sarbanes-Oxley Act
• Would you put your medical data on the cloud?
  – Why / why not?

Recap: Cloud applications
• Clouds are good for many things...
  – Applications that involve large amounts of computation, storage, bandwidth
  – Especially when lots of resources are needed quickly (Washington Post example) or load varies rapidly (TicketLeap example)
• ... but not for all things

What is virtualization?
• Suppose Alice has a machine with 4 CPUs and 8 GB of memory, and three customers:
  – Bob wants a machine with 1 CPU and 3 GB of memory
  – Charlie wants 2 CPUs and 1 GB of memory
  – Daniel wants 1 CPU and 4 GB of memory
• What should Alice do?
[Figure: Alice's physical machine and her customers Bob, Charlie, and Daniel]

What is virtualization?
• Alice can sell each customer a virtual machine (VM) with the requested resources
  – From each customer's perspective, it appears as if they had a physical machine all by themselves (isolation)
[Figure: Alice's physical machine runs a virtual machine monitor that hosts one virtual machine each for Bob, Charlie, and Daniel]
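
Whether the three requests fit on Alice's machine is a simple sum. The sketch below checks it; the `Request` class and `fits` function are names invented for this example, and the extra customer with 1 CPU and 1 GB is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    customer: str
    cpus: int
    memory_gb: int


# Alice's machine and her three customers, as described on the slides.
HOST_CPUS, HOST_MEMORY_GB = 4, 8
requests = [
    Request("Bob", 1, 3),
    Request("Charlie", 2, 1),
    Request("Daniel", 1, 4),
]


def fits(reqs, cpus, memory_gb):
    """True if all requests can be served without sharing any resource."""
    return (sum(r.cpus for r in reqs) <= cpus
            and sum(r.memory_gb for r in reqs) <= memory_gb)


print("three customers fit:", fits(requests, HOST_CPUS, HOST_MEMORY_GB))

# HYPOTHETICAL fourth customer: no longer fits without time sharing.
more = requests + [Request("Emil", 1, 1)]
print("four customers fit:", fits(more, HOST_CPUS, HOST_MEMORY_GB))
```

The three requests add up to exactly 4 CPUs and 8 GB, so the machine is fully allocated with no room to spare.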

How does it work?
• Resources (CPU, memory, ...) are virtualized
  – VMM ("Hypervisor") has translation tables that map requests for virtual resources to physical resources
  – Example: VM 1 accesses memory cell #323; VMM maps this to memory cell 123.
  – For which resources does this (not) work?
  – How do VMMs differ from OS kernels?
[Figure: a physical machine running a VMM that hosts VM 1 (OS 1) and VM 2 (OS 2), each running applications]

Translation table
VM   Virtual    Physical
1    0-99       0-99
1    300-399    100-199
2    0-99       300-399
2    200-299    500-599
2    600-699    400-499
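
The lookup a VMM performs can be sketched in a few lines. This is a toy model, not how a real hypervisor is implemented (real ones rely on hardware page tables). It assumes VM 1's second virtual range is 300-399, the value for which cell 323 maps to cell 123 as in the example; `Mapping` and `translate` are names made up here.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    vm: int
    virt_start: int
    virt_end: int    # inclusive
    phys_start: int


# Toy translation table. ASSUMPTION: VM 1's second range is 300-399.
TABLE = [
    Mapping(1, 0, 99, 0),
    Mapping(1, 300, 399, 100),
    Mapping(2, 0, 99, 300),
    Mapping(2, 200, 299, 500),
    Mapping(2, 600, 699, 400),
]


def translate(vm: int, virt_addr: int) -> int:
    """Map a VM's virtual memory cell to a physical cell, or raise LookupError."""
    for m in TABLE:
        if m.vm == vm and m.virt_start <= virt_addr <= m.virt_end:
            return m.phys_start + (virt_addr - m.virt_start)
    raise LookupError(f"VM {vm} has no mapping for cell {virt_addr}")


print(translate(1, 323))   # the example from the slide
```

A cell with no table entry raises an error here; that is the toy equivalent of a VM touching memory it was never given, which is what makes isolation between VMs possible.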

Benefit: Migration
• What if the machine needs to be shut down?
  – e.g., for maintenance, consolidation, ...
  – Alice can migrate the VMs to different physical machines without any customers noticing
[Figure: the virtual machines of Bob, Charlie, and Daniel moving between Alice's physical machines, each running a virtual machine monitor]

Benefit: Time sharing
• What if Alice gets another customer?
  – Multiple VMs can time-share the existing resources
  – Result: Alice has more virtual CPUs and virtual memory than physical resources (but not all can be active at the same time)
[Figure: Alice's physical machine and virtual machine monitor now host virtual machines for Bob, Charlie, Daniel, and Emil]

Benefit and challenge: Isolation
• Good: Emil can't access Charlie's data
• Bad: What if the load suddenly increases?
  – Example: Emil's VM shares CPUs with Charlie's VM, and Charlie suddenly starts a large compute job
  – Emil's performance may decrease as a result
  – VMM can move Emil's software to a different CPU, or migrate it to a different machine
[Figure: Alice's physical machine and VMM hosting virtual machines for Bob, Charlie, Daniel, and Emil]

Recap: Virtualization in the cloud
• Gives cloud provider a lot of flexibility
  – Can produce VMs with different capabilities
  – Can migrate VMs if necessary (e.g., for maintenance)
  – Can increase load by overcommitting resources
• Provides security and isolation
  – Programs in one VM cannot influence programs in another
• Convenient for users
  – Complete control over the virtual 'hardware' (can install own operating system, own applications, ...)
• But: Performance may be hard to predict
  – Load changes in other VMs on the same physical machine may affect the performance seen by the customer

10 obstacles and opportunities
1. Availability
  – What happens to my business if there is an outage in the cloud?
2. Data lock-in
  – How do I move my data from one cloud to another?
3. Data confidentiality and auditability
  – How do I make sure that the cloud doesn't leak my confidential data?
  – Can I comply with regulations like HIPAA and Sarbanes-Oxley?

Some cloud outages
Service     Duration   Date
S3          6-8 hrs    7/20/08
AppEngine   5 hrs      6/17/08
Gmail       1.5 hrs    8/11/08
Azure       22 hrs     3/13/09
Intuit      36 hrs     6/16/10
EBS         >3 days    4/21/11
ECC         ~2 hrs     6/30/12
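
Outage durations are often restated as availability percentages. The sketch below converts a few of the durations from the table, assuming (purely for illustration) that the listed outage was the service's only downtime in that year. Entries given as ranges or lower bounds are left out.

```python
HOURS_PER_YEAR = 365 * 24

# Durations in hours, taken from the outage table (exact entries only).
outage_hours = {"AppEngine": 5, "Gmail": 1.5, "Azure": 22, "Intuit": 36}


def availability(downtime_hours: float) -> float:
    """Fraction of a year the service was up.

    ASSUMPTION: the single outage was the only downtime that year.
    """
    return 1.0 - downtime_hours / HOURS_PER_YEAR


for service, hours in outage_hours.items():
    print(f"{service:<10} {hours:>4} h down -> {availability(hours):.3%} available")
```

Even the 36-hour outage still leaves availability above 99.5% for the year, which shows why a single percentage says little about the business impact of one long outage.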

10 obstacles and opportunities
4. Data transfer bottlenecks
  – How do I copy large amounts of data from/to the cloud?
  – Example: 10 TB from UC Berkeley to Amazon in Seattle, WA
  – Motivated Import/Export feature on AWS
5. Performance unpredictability
  – Example: VMs sharing the same disk → I/O interference
  – Example: HPC tasks that require coordinated scheduling

Time to transfer 10 TB [AF10]
Method               Time
Internet (20 Mbps)   45 days
FedEx                1 day

Performance of 75 EC2 instances in benchmarks
Primitive          Mean perf.   Std dev
Memory bandwidth   1.3 GB/s     0.05 GB/s (4%)
Disk bandwidth     55 MB/s      9 MB/s (16%)
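
The transfer time in the table can be reproduced approximately. A minimal sketch, assuming decimal units (10 TB = 10 * 10^12 bytes) and a link that sustains the full 20 Mbps with no protocol overhead:

```python
# Time to push 10 TB through a 20 Mbps link.
# ASSUMPTIONS: decimal units, full sustained throughput, no overhead.
DATA_BYTES = 10 * 10**12
LINK_BITS_PER_SECOND = 20 * 10**6
SECONDS_PER_DAY = 24 * 60 * 60

seconds = DATA_BYTES * 8 / LINK_BITS_PER_SECOND
days = seconds / SECONDS_PER_DAY

print(f"about {days:.1f} days over the network")
```

This gives a bit over 46 days, in line with the roughly 45 days quoted in the table; the small difference depends on the exact units and throughput assumed. Either way, shipping disks overnight is more than an order of magnitude faster.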

10 obstacles and opportunities
6. Scalable storage
  – Cloud model (short-term usage, no up-front cost, infinite capacity on demand) does not fit persistent storage well
7. Bugs in large distributed systems
  – Many errors cannot be reproduced in smaller configs
8. Scaling quickly
  – Problem: Boot time; idle power
  – Fine-grain accounting?

10 obstacles and opportunities
9. Reputation fate sharing
  – One customer's bad behavior can affect the reputation of others using the same cloud
  – Example: Spam blacklisting, FBI raid after criminal activity
10. Software licensing
  – What if licenses are for specific computers?
    • Example: Microsoft Windows
  – How to scale number of licenses up/down?
    • Need pay-as-you-go model as well

Stay tuned
Next you will learn about: Programming at scale; Parallelism, Concurrency and Consistency

Some extra slides with various definitions of cloud, cloud architecture, etc.

What is Cloud Computing
• "Twenty-One Experts Define Cloud Computing": http://cloudcomputing.sys-con.com/read/612375_p.htm
• It is the infrastructural paradigm shift that is sweeping across the Enterprise IT world, but how is it best defined? I refer of course to 'Cloud Computing' - the phenomenon that currently has as many definitions as there are squares on a chess-board. To try and narrow it down we bring here a round-up of some recent attempts to bring welcome precision where there risks being unnecessary vagueness. Enjoy!
• "What is cloud computing all about? Amazon has coined the word 'elasticity' which gives a good idea about the key features: you can scale your infrastructure on demand within minutes or even seconds, instead of days or weeks, thereby avoiding under-utilization (idle servers) and over-utilization (blue screen) of in-house resources. With monitoring and increasing automation of resource provisioning we might one day wake up in a world where we don't have to care about scaling our Web applications because they can do it alone." - Markus Klems
• "For me the simplest explanation for cloud computing is describing it as, 'internet centric software.' This new cloud computing software model is a shift from the traditional single tenant approach to software development to that of a scalable, multi-tenant, multi-platform, multi-network, and global. This could be as simple as your web based email service or as complex as a globally distributed load balanced content delivery environment. I think drawing a distinction on whether its, PaaS, SaaS, HaaS is completely secondary, ultimately all these approaches are attempting to solve the same problems (scale). As software transitions from a traditional desktop deployment model to that of a network & data centric one, 'the cloud' will be the key way in which you develop, deploy and manage applications in this new computing paradigm." - Reuven Cohen

What is Cloud Computing – contd.
• "I view cloud computing as a broad array of web-based services aimed at allowing users to obtain a wide range of functional capabilities on a 'pay-as-you-go' basis that previously required tremendous hardware/software investments and professional skills to acquire. Cloud computing is the realization of the earlier ideals of utility computing without the technical complexities or complicated deployment worries." - Jeff Kaplan
• "People are coming to grips with Virtualization and how it reshapes IT, creates service and software based models, and in many ways changes a lot of the physical layer we are used to. Clouds will be the next transformation over the next several years, building off of the software models that virtualization enabled." - Douglas Gourlay
• "The way I understand it, 'cloud computing' refers to the bigger picture...basically the broad concept of using the internet to allow people to access technology-enabled services. According to Gartner, those services must be 'massively scalable' to qualify as true 'cloud computing'. So according to that definition, every time I log into Facebook, or search for flights online, I am taking advantage of cloud computing." - Praising Gaw
© 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "The “Cloud” concept is finally wrapping peoples’ minds around what is possible when you leverage web-scale infrastructure (application and physical) in an on-demand way. “Managed Services”, “ASP”, “Grid Computing”, “Software as a Service”, “Platform as a Service”, “Anything as a Service”… all terms that couldn’t get it done. Call it a “Cloud” and everyone goes bonkers. Go figure." - Damon Edwards
• "There sure is a lot of confusion when it comes to talking about cloud computing. Yet, it does not need to be so complicated. There really are only three types of services that are cloud based: SaaS, PaaS, and Cloud Computing Platforms. I am not sure being massively scalable is a requirement to fit into any one category." - Brian de Haaff
• "SaaS is one consumer facing usage of cloud computing. While it's something of a semantic discussion it is important for people inside to have an understanding of what it all means. Put simply cloud computing is the infrastructural paradigm shift that enables the ascension of SaaS." - Ben Kepes © 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "The 'cloud' model initially has focused on making the hardware layer consumable as on-demand compute and storage capacity. This is an important first step, but for companies to harness the power of the cloud, complete application infrastructure needs to be easily configured, deployed, dynamically-scaled and managed in these virtualized hardware environments." - Kirill Sheynkman
• "Cloud computing overlaps some of the concepts of distributed, grid and utility computing, however it does have its own meaning if contextually used correctly. Cloud computing really is accessing resources and services needed to perform functions with dynamically changing needs. An application or service developer requests access from the cloud rather than a specific endpoint or named resource. What goes on in the cloud manages multiple infrastructures across multiple organizations and consists of one or more frameworks overlaid on top of the infrastructures tying them together. The cloud is a virtualization of resources that maintains and manages itself."- Kevin Hartig © 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "I was chatting with a customer the other day who was struggling with some of the implications of cloud computing. The analogy that finally made sense to them is what I will call 'cloud dining.' I am the cook in the house and I am tasked with feeding the family. If my 10-year old is lobbying for Italian, I can cook at home or order out. The decision may also vary from day to day. For instance, I might not have all the ingredients and have to order out, or, like this weekend, it may be 103 outside and cooking at home is not all that appealing. Now, the same can be said for supporting a given application in a cloud computing environment.
In a fully implemented Data Center 3.0 environment, you can decide if an app is run locally (cook at home) or in someone else’s data center (take-out), and you can change your mind on the fly in case you are short on data center resources (pantry is empty) or you are having environmental/facilities issues (too hot to cook). In fact, with automation, a lot of this can be done with policy and real-time triggers. For example, during month end processing, you might always shift non-critical apps offsite, or if you pass a certain cooling threshold, you might ship certain processing offsite." - Omar Sultan © 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "Clouds are vast resource pools with on-demand resource allocation. The degree of on-demandness can vary from phone calls to web forms to actual APIs that directly requisition servers. I tend to consider slow forms of requisitioning to be more like traditional datacenters, and the quicker ones to be more cloudy. A public facing API is a must for true clouds.
Clouds are virtualized. On-demand requisitioning implies the ability to dynamically resize resource allocation or moving customers from one physical server to another transparently. This is all difficult or impossible without virtualization.
© 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "Cloud computing is ... the user-friendly version of grid computing." - Trevor Doerksen
• "Most computer savvy folks actually have a pretty good idea of what the term "cloud computing" means: outsourced, pay-as-you-go, on-demand, somewhere in the Internet, etc."- Thorsten von Eicken
• Clouds tend to be priced like utilities (hourly, rather than per-resource), and I think we’ll see this model catching on more and more as computing resources become as cheap and ubiquitous as water, electricity, and gas (well, maybe not gas). However, I think this is a trend, not a requirement. You can certainly have clouds that are priced like pizza, per slice.”- Jan Pritzker
© 2008 SYS-CON Media Inc.
Cloud Computing????
What is Cloud Computing – contd.
• "In order to discuss some of the issues surrounding The Cloud concept, I think it is important to place it in historical context. Looking at the Cloud's forerunners, and the problems they encountered, gives us the reference points to guide us through the challenges it needs to overcome before it is adopted." - Paul Wallis
• "The web fanatics and blogosphere would have you believe that all applications will move to the web. Some will, most will not. Reliability, scalability, security, and a host of other issues will prevent most businesses from moving their mission critical applications to hosted services or cloud based services. The risk of failure is too great.
• Amazon is the leader in cloud based services, but even Amazon has experienced down times for its own business. Cloud services will continue to improve. But my guess is the uptake will take longer than most people predict." - Don Dodge
© 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "I would like to propose a 'Cloud Pyramid' to help differentiate the various Cloud offerings out there. [At the top of the pyramid] users are truly restricted to only what the application is and can do. Some of the notable companies here are the public email providers (Gmail, Hotmail, Quicken Online, etc.). Almost any Software as a Service (SaaS) provider can be lumped into this group.
As you move further down the pyramid, you gain increased flexibility and control but you are still fairly restricted in what you can and cannot do. Within this category things get more complicated to achieve. Products and companies like Google App Engine, Heroku, Mosso, Engine Yard, Joyent or force.com (SalesForce platform) fall into this segment.
At the bottom of the pyramid are the infrastructure providers like Amazon’s EC2, GoGrid, RightScale and Linode. Companies providing infrastructure enable Cloud Platforms and Cloud Applications. Most companies within this segment operate their own infrastructure, allowing them to provide more features, services and control than others within the pyramid." - Michael Sheehan © 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "Today's combination of high-speed networks, sophisticated PC graphics processors, and fast, inexpensive servers and disk storage has tilted engineers toward housing more computing in data centers. In the earlier part of this decade, researchers espoused a similar, centralized approach called "grid computing." But cloud computing projects are more powerful and crash-proof than grid systems developed even in recent years." - Aaron Ricadela
"When virtualizing applications to be used by people who care nothing about computers or technology - as is mostly the case with Clouds - the key thing we want to virtualize or hide from the user is complexity. Most people want to deal with an application or a service, not software. ... The more intelligent we want [computers and computer applications] to be - that is, intuitive, exhibiting common sense and not making us have to constantly take care of them - the more smart software it will take. But with cloud computing, our expectation is that all that software will be virtualized or hidden from us and taken care of by systems and/or professionals that are somewhere else - out there in The Cloud.” Irving Wladawsky Berger
© 2008 SYS-CON Media Inc.
What is Cloud Computing – contd.
• "I view cloud computing as a broad array of web-based services aimed at allowing users to obtain a wide range of functional capabilities on a ‘pay-as-you-go’ basis that previously required tremendous hardware/software investments and professional skills to acquire. Cloud computing is the realization of the earlier ideals of utility computing without the technical complexities or complicated deployment worries." - Jeff Kaplan
• "Cloud computing really comes into focus only when you think about what IT always needs: a way to increase capacity or add capabilities on the fly without investing in new infrastructure, training new personnel, or licensing new software. Cloud computing encompasses any subscription-based or pay-per-use service that, in real time over the Internet, extends IT's existing capabilities." - Bill Martin
© 2008 SYS-CON Media Inc.
Defining Clouds: There are many views of what cloud computing is
• Over 20 definitions:
  – http://cloudcomputing.sys-con.com/read/612375_p.htm
• A compromise definition :-)
  – "A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements established through negotiation between the service provider and consumers."
• Keywords: Virtualization (VMs), Dynamic Provisioning (negotiation and SLAs), and Web 2.0/3.0 access interface
What is Cloud Computing?
1. Web-scale problems
2. Large data centers
3. Different models of computing
4. Highly-interactive Web applications
1. Web-Scale Problems
• Characteristics:
  – Definitely data-intensive
  – May also be processing intensive
• Examples:
  – Crawling, indexing, searching, mining the Web
  – “Post-genomics” life sciences research
  – Other scientific data (physics, astronomy, etc.)
  – Sensor networks
  – Web 2.0 applications
  – …
2. Large Data Centers
• Web-scale problems? Throw more machines at it!
• Clear trend: centralization of computing resources in large data centers
  – Necessary ingredients: fiber, juice, and space
  – What do Oregon, Iceland, and abandoned mines have in common?
• Important Issues:
  – Redundancy
  – Efficiency
  – Utilization
  – Management
• Software as a Service
  – End user applications: scientific applications, office automation, photo editing, CRM, and social networking
  – Examples: Google Documents, Facebook, Flickr, Salesforce
• Platform as a Service
  – Runtime environment for applications; development and data processing platforms
  – Examples: Windows Azure, Hadoop, Google AppEngine, Aneka
• Infrastructure as a Service
  – Virtualized servers, storage and networking
  – Examples: Amazon EC2, S3, Rightscale, vCloud
• Web 2.0 Interfaces
Key Technology: Virtualization
• Traditional Stack (top to bottom): App, App, App → Operating System → Hardware
• Virtualized Stack (top to bottom): App, App, App → OS, OS, OS → Hypervisor → Hardware
Market-oriented Cloud Architecture: QoS negotiation and SLA-based Resource Allocation
• Users/Brokers
• SLA Resource Allocator
  – Service Request Examiner and Admission Control: Customer-driven Service Management, Computational Risk Management, Autonomic Resource Management
  – Pricing, Accounting
  – VM Monitor, Dispatcher, Service Request Monitor
• Virtual Machines (VMs)
• Physical Machines
A (Layered) Cloud Architecture
• User level
  – Cloud applications: Social computing, Enterprise, ISV, Scientific, CDNs, ...
  – User-Level Middleware (Cloud programming: environments and tools): Web 2.0 Interfaces, Mashups, Concurrent and Distributed Programming, Workflows, Libraries, Scripting
• Core Middleware (Apps Hosting Platforms)
  – QoS Negotiation, Admission Control, Pricing, SLA Management, Monitoring, Execution Management, Metering, Accounting, Billing
  – Virtual Machine (VM), VM Management and Deployment
• System level
  – Cloud resources
• Spanning all layers: Adaptive Management; Autonomic / Cloud Economy
3. Different Computing Models
• Utility computing
  – Why buy machines when you can rent cycles?
  – Examples: Amazon’s EC2, GoGrid, AppNexus
• Platform as a Service (PaaS)
  – Give me a nice API and take care of the implementation
  – Example: Google App Engine
• Software as a Service (SaaS)
  – Just run it for me!
  – Example: Gmail
“Why do it yourself if you can pay someone to do it for you?”
4. Web Applications
• A mistake on top of a hack built on sand held together by duct tape?
• What is the nature of software applications?
  – From the desktop to the browser
  – SaaS == Web-based applications
  – Examples: Google Maps, Facebook
• How do we deliver highly-interactive Web-based applications?
  – AJAX (asynchronous JavaScript and XML)
  – For better, or for worse…
Web-Scale Problems?
• Don’t hold your breath:
  – Biocomputing
  – Nanocomputing
  – Quantum computing
  – …
• It all boils down to…
  – Divide-and-conquer
  – Throwing more hardware at the problem
Simple to understand… a lifetime to master…
Divide and Conquer
[Figure: “Work” is partitioned into units w1, w2, w3; each unit goes to a “worker”; the workers produce partial results r1, r2, r3, which are combined into the “Result”.]
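A minimal sketch of the partition / worker / combine flow, assuming the "work" is a list of numbers and each worker computes a sum of squares. The function names (`partition`, `worker`, `combine`, `sum_of_squares`) are illustrative, not from the slides. Threads are used only to keep the sketch short; for CPU-bound work in CPython one would more likely swap in a process pool.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(work, n_parts):
    """Split `work` into at most n_parts contiguous chunks (w1, w2, ...)."""
    size = max(1, -(-len(work) // n_parts))  # ceiling division
    return [work[i:i + size] for i in range(0, len(work), size)]


def worker(chunk):
    """Compute the partial result (r1, r2, ...) for one chunk."""
    return sum(x * x for x in chunk)


def combine(partials):
    """Merge the partial results into the final result."""
    return sum(partials)


def sum_of_squares(work, n_workers=3):
    chunks = partition(work, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker, chunks))  # one worker call per chunk
    return combine(partials)
```

For example, `sum_of_squares(list(range(10)))` splits the input into three chunks, squares and sums each chunk independently, and adds the three partial sums.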
Different Workers
• Different threads in the same core
• Different cores in the same CPU
• Different CPUs in a multi-processor system
• Different machines in a distributed system
Choices, Choices, Choices
• Commodity vs. “exotic” hardware
• Number of machines vs. processors vs. cores
• Bandwidth of memory vs. disk vs. network
• Different programming models
Flynn’s Taxonomy
                     Single instruction (SI)          Multiple instruction (MI)
Single data (SD)     SISD: Single-threaded process    MISD: Pipeline architecture
Multiple data (MD)   SIMD: Vector processing          MIMD: Multi-threaded programming
SISD
D D D D D D D
Processor
Instructions
SIMD
[Figure: one processor with a single instruction stream operating on multiple data streams, each holding elements D0, D1, D2, D3, D4, … Dn]
MIMD
D D D D D D D
Processor
Instructions
D D D D D D D
Processor
Instructions
Memory Typology: Shared
Memory
Processor
Processor Processor
Processor
Memory Typology: Distributed
[Figure: four nodes, each a processor with its own memory, connected by a network]
Memory Typology: Hybrid
[Figure: four nodes connected by a network; within each node, two processors share one memory]
Parallelization Problems
• How do we assign work units to workers?
• What if we have more work units than workers?
• What if workers need to share partial results?
• How do we aggregate partial results?
• How do we know all the workers have finished?
• What if workers die?
What is the common theme of all of these problems?
General Theme?
• Parallelization problems arise from:
  – Communication between workers
  – Access to shared resources (e.g., data)
• Thus, we need a synchronization system!
• This is tricky:
  – Finding bugs is hard
  – Solving bugs is even harder
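To make "access to shared resources" concrete, here is a small sketch in which several worker threads update one shared counter. The `Counter` class and `run_workers` function are hypothetical names chosen for illustration. The lock makes each read-modify-write step exclusive; without it, two workers could read the same old value and one update would be lost.

```python
import threading


class Counter:
    """A shared resource that several workers update."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one worker at a time may run this read-modify-write step.
        with self._lock:
            self.value += 1


def run_workers(n_workers=4, n_increments=10_000):
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished
    return counter.value
```

With the lock in place the final value is always `n_workers * n_increments`. A lost update caused by removing the lock may show up only occasionally, which is one reason such bugs are hard to find.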
Managing Multiple Workers
• Difficult because
  – (Often) don’t know the order in which workers run
  – (Often) don’t know where the workers are running
  – (Often) don’t know when workers interrupt each other
• Thus, we need:
  – Semaphores (lock, unlock)
  – Condition variables (wait, notify, broadcast)
  – Barriers
• Still, lots of problems:
  – Deadlock, livelock, race conditions, ...
• Moral of the story: be careful!
  – Even trickier if the workers are on different machines
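Of the primitives listed, a barrier is perhaps the easiest to show in isolation. The sketch below assumes a made-up two-phase job (the `two_phase` function is illustrative only): each worker writes its own slot in phase 1, and no worker starts phase 2 until every slot has been written.

```python
import threading


def two_phase(n_workers=3):
    partial = [0] * n_workers   # phase-1 output, one slot per worker
    totals = [0] * n_workers    # phase-2 output, one slot per worker
    barrier = threading.Barrier(n_workers)

    def work(i):
        partial[i] = (i + 1) * 10  # phase 1: write only this worker's slot
        barrier.wait()             # block until all workers reach this point
        totals[i] = sum(partial)   # phase 2: every slot is now safe to read

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

Without the `barrier.wait()` call, a fast worker could sum `partial` before a slow worker had written its slot, so the totals would depend on the order in which the workers happened to run.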
Patterns for Parallelism
• Parallel computing has been around for decades
• Here are some “design patterns” …
Master/Slaves
slaves
master
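One way this pattern might look in code: a master hands tasks to a pool of workers, gathers the results, and hands a task out again if its worker fails. The function `run_master`, its parameters, and the retry policy are assumptions made for this sketch rather than anything prescribed by the slides.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_master(tasks, worker_fn, n_workers=3, max_attempts=2):
    """Hand each task to a pool of workers and gather the results.

    tasks: dict mapping a task id to its input payload.
    A task whose worker raises is resubmitted until max_attempts is reached.
    """
    results = {}
    attempts = dict.fromkeys(tasks, 0)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        pending = {}  # future -> task id

        def assign(task_id):
            attempts[task_id] += 1
            pending[pool.submit(worker_fn, tasks[task_id])] = task_id

        for task_id in tasks:
            assign(task_id)

        while pending:
            for future in as_completed(list(pending)):
                task_id = pending.pop(future)
                try:
                    results[task_id] = future.result()
                except Exception:
                    if attempts[task_id] >= max_attempts:
                        raise  # give up: the task failed too many times
                    assign(task_id)  # hand the task out again
    return results
```

For example, `run_master({"a": 1, "b": 2}, lambda x: x * x)` returns `{"a": 1, "b": 4}`. The master is the only place that knows which tasks are outstanding, which keeps the workers simple but also makes the master a single point of failure.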
Producer/Consumer Flow
[Figure: two diagrams of producers (P) passing work to consumers (C)]
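A small sketch of a producer/consumer flow, assuming a three-stage layout chosen for illustration: a producer, a middle stage that consumes and then produces, and a final consumer. The stage names, the `_DONE` sentinel, and the squaring step are all hypothetical. Bounded queues connect the stages, so a fast producer blocks instead of running arbitrarily far ahead of a slow consumer.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def producer(items, out_q):
    for item in items:
        out_q.put(item)
    out_q.put(_DONE)


def transformer(in_q, out_q):
    """Middle stage: consumes from in_q and produces into out_q."""
    while True:
        item = in_q.get()
        if item is _DONE:
            out_q.put(_DONE)  # pass the end-of-stream marker downstream
            return
        out_q.put(item * item)


def consumer(in_q, results):
    while True:
        item = in_q.get()
        if item is _DONE:
            return
        results.append(item)


def run_pipeline(items):
    q1, q2 = queue.Queue(maxsize=4), queue.Queue(maxsize=4)
    results = []
    stages = [
        threading.Thread(target=producer, args=(items, q1)),
        threading.Thread(target=transformer, args=(q1, q2)),
        threading.Thread(target=consumer, args=(q2, results)),
    ]
    for s in stages:
        s.start()
    for s in stages:
        s.join()
    return results
```

Because each stage here is a single thread and the queues are FIFO, the output order matches the input order; with several threads per stage that guarantee would no longer hold.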
Work Queues
[Figure: producers (P) put work units (W) on a shared queue; consumers (C) take work units off the queue]
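The shared-queue idea can be sketched with several producers and several consumers around one `queue.Queue`. The function `run_work_queue`, the doubling step, and the use of `None` as a stop marker are assumptions for illustration (so real work units must not be `None`). Consumers pull whatever unit is next, so faster consumers naturally take on more of the work.

```python
import queue
import threading


def run_work_queue(batches, n_consumers=2):
    work_q = queue.Queue()          # the shared queue of work units
    results = []
    results_lock = threading.Lock()

    def producer(batch):
        for unit in batch:
            work_q.put(unit)

    def consumer():
        while True:
            unit = work_q.get()
            if unit is None:        # stop marker: no more work
                return
            with results_lock:      # results is shared between consumers
                results.append(unit * 2)

    producers = [threading.Thread(target=producer, args=(b,)) for b in batches]
    consumers = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for t in producers + consumers:
        t.start()
    for t in producers:
        t.join()                    # all work units are now on the queue
    for _ in consumers:
        work_q.put(None)            # one stop marker per consumer
    for t in consumers:
        t.join()
    return sorted(results)          # completion order is not deterministic
```

The results are sorted before returning because the order in which consumers finish units varies from run to run.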
Rubber Meets Road
• From patterns to implementation:
  – pthreads, OpenMP for multi-threaded programming
  – MPI for cluster computing
  – …
• The reality:
  – Lots of one-off solutions, custom code
  – Write your own dedicated library, then program with it
  – Burden on the programmer to explicitly manage everything
• MapReduce to the rescue!
  – (for next time)