Applied Architectures
Eunyoung Hwang
Objectives
• How principles have been used to solve challenging problems
• How architecture can be used to explain and analyze common commercial systems
Outline
• Distributed Network-based Applications
– Limitations
– REST
– Commercial Internet-Scale Applications: Google
• Decentralized Architecture
– Grid Computing
– Cloud Computing
– Peer-to-Peer: Napster, Gnutella, Skype
Fallacies of Distributed Systems Viewpoint
• The network is reliable
• Latency is zero
• Bandwidth is infinite
• The network is secure
• Topology does not change
• There is one administrator
• Transport cost is zero
• The network is homogeneous
WWW Architecture
• The World Wide Web is a distributed, decentralized hypermedia application.
Representational State Transfer (REST) Style
[Figure: a user agent and an origin server exchange requests and responses through intermediaries (proxies and gateways).]
• A set of constraints based on WWW architectural style
Process View of a REST-based Architecture
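[Figure: process view of a REST-based architecture. Components: user agent, proxies, gateway, DNS, origin servers. Legend: client connector, client+cache, server connector, server+cache ("$" marks a cache). Protocols shown: http, wais, orb. Interactions are labeled a, b, and c.]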
Derivation of REST
REST (cont.)
• Constraints
– Client-server
– Context-free (stateless)
– Cache
– Code on demand
– Layered
– Uniform interface
• Benefits
– Efficiency
– Scalability
– User-perceived performance
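The constraints above can be made concrete with a small in-process sketch. It is not a real HTTP stack: `Request`, `Response`, `origin_server`, `CachingProxy`, and the `/slides/1` resource are hypothetical names chosen for illustration. Each request carries everything the server needs (stateless), every resource is reached through the same method-plus-URI interface (uniform interface), and an intermediary answers repeated cacheable GETs without contacting the origin (cache, layered).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    method: str
    uri: str


@dataclass(frozen=True)
class Response:
    status: int
    body: str
    cacheable: bool = False


# Hypothetical resources held by the origin server.
RESOURCES = {"/slides/1": "Applied Architectures"}


def origin_server(request: Request) -> Response:
    """Stateless: the reply depends only on the request and the resources."""
    if request.method != "GET":
        return Response(405, "method not allowed")
    body = RESOURCES.get(request.uri)
    if body is None:
        return Response(404, "not found")
    return Response(200, body, cacheable=True)


class CachingProxy:
    """Intermediary layer that exposes the same call interface as the origin."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}
        self.upstream_calls = 0

    def __call__(self, request: Request) -> Response:
        key = (request.method, request.uri)
        if key in self.cache:
            return self.cache[key]  # served without contacting the origin
        self.upstream_calls += 1
        response = self.upstream(request)
        if response.cacheable:
            self.cache[key] = response
        return response


proxy = CachingProxy(origin_server)
first = proxy(Request("GET", "/slides/1"))
second = proxy(Request("GET", "/slides/1"))  # answered from the proxy cache
```

Because the proxy and the origin share one interface, the client cannot tell which of them answered; that is the property the layered constraint relies on.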
Commercial Internet-Scale Applications
• Google strategy
– Simpler storage system offering fewer features
– Data storage and manipulation
– A highly fault-tolerant platform
– Delivered in a cost-effective manner
• Google design
– Google File System (GFS)
– MapReduce
Google File System Architecture
MapReduce
• Large-scale data processing
• Map
– Takes an input key/value pair and generates a set of intermediate key/value pairs
• Reduce
– Merges all intermediate values associated with the same intermediate key
map (k1, v1) -> list(k2, v2)
reduce (k2, list(v2)) -> list(v2)
• E.g., word frequency
– map (URL, contents) -> set of (word, 1)
– reduce (word, list of 1s) -> (word, sum)
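The word-frequency example can be sketched as a single-process program. This is a minimal illustration of the programming model only, not Google's implementation: `map_fn`, `reduce_fn`, `run_mapreduce`, and the example URLs are hypothetical names, and the grouping step that a real system distributes across machines is done here with an in-memory dictionary.

```python
from collections import defaultdict


def map_fn(url, contents):
    """map (URL, contents) -> list of (word, 1)"""
    return [(word.lower(), 1) for word in contents.split()]


def reduce_fn(word, counts):
    """reduce (word, list of counts) -> (word, sum)"""
    return (word, sum(counts))


def run_mapreduce(documents, map_fn, reduce_fn):
    # Map phase: emit intermediate pairs and group them by intermediate key.
    intermediate = defaultdict(list)
    for key, value in documents.items():
        for k2, v2 in map_fn(key, value):
            intermediate[k2].append(v2)
    # Reduce phase: merge all values that share the same intermediate key.
    return dict(reduce_fn(k2, values) for k2, values in sorted(intermediate.items()))


docs = {
    "http://a.example": "the web is the web",
    "http://b.example": "the grid",
}
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

The user supplies only `map_fn` and `reduce_fn`; everything in `run_mapreduce` stands in for what the framework does on the user's behalf.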
MapReduce Execution Flow
MapReduce (cont.)
• Architecture provides
– Automatic parallelization & distribution
– Fault tolerance
  • Worker failure
  • Master failure
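Handling a worker failure can be pictured as the master handing the failed task to another worker. The sketch below is a toy model under that assumption: `WorkerFailure`, `healthy_worker`, `failed_worker`, and `run_tasks` are hypothetical names, workers are plain functions, and master failure is not modeled at all.

```python
class WorkerFailure(Exception):
    """Raised by a worker that cannot complete its task."""


def healthy_worker(payload):
    return payload * payload


def failed_worker(payload):
    raise WorkerFailure("worker unreachable")


def run_tasks(payloads, workers):
    """Master loop: try each task on successive workers until one succeeds."""
    results = []
    reassignments = 0
    for index, payload in enumerate(payloads):
        for attempt in range(len(workers)):
            worker = workers[(index + attempt) % len(workers)]
            try:
                results.append(worker(payload))
                break
            except WorkerFailure:
                reassignments += 1  # hand the same task to the next worker
        else:
            raise RuntimeError(f"task {index} failed on every worker")
    return results, reassignments


results, reassignments = run_tasks([1, 2, 3, 4], [healthy_worker, failed_worker])
```

Re-running a task is safe here because each task is a pure function of its input, which is the same property that makes re-execution a workable recovery strategy for map and reduce tasks.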
Decentralized Architecture
• Networked applications where there are multiple authorities
• Not a new idea
– E.g., web sites, international postal mail
• Design challenges
Grid Computing
• Coordinated resource sharing and computation in a decentralized environment
• Technologies that allow consumers to obtain computing power on demand
• Starts with large-scale federated resources
• Issues
– Interoperability
– Security
Grid Architecture
Globus Grid Architecture
Cloud Computing
• A key computing platform for sharing resources
• A specialized distributed computing paradigm
– Massively scalable
– Different levels of service
– Driven by economies of scale
– Dynamically configured and delivered on demand
• What makes Cloud Computing interesting now?
• Not a new concept!
Cloud Computing
Three Levels of Cloud Computing Services
• Infrastructure as a Service (IaaS) – Amazon EC2
• Platform as a Service (PaaS) – Google App Engine
• Software as a Service (SaaS) – Salesforce
Grid vs Cloud Computing
• General
– Grid Computing: 10 years ago; large-scale federated systems
– Cloud Computing: now; new needs to analyze massive data; commercial large-scale systems
– Both: the cost of computing, reliability & flexibility, with the same issues
• Business Model
– Grid Computing: pay for units spent
– Cloud Computing: pay on a consumption basis
• Architecture
– Grid Computing: 5 layers
– Cloud Computing: 4 layers, 3 levels of service
• Compute Model
– Grid Computing: batch-scheduled queuing system
– Cloud Computing: shared by all users at the same time
• Data Model
– Grid Computing: virtual data
– Cloud Computing: triangle model
• Security
– Grid Computing: strict; single sign-on
– Cloud Computing: simpler and less secure; accounts are easy to manage
Peer-to-Peer
Napster
• Hybrid client-server/P2P
Gnutella
• Pure decentralized P2P
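The difference between the hybrid style (Napster) and the pure decentralized style (Gnutella) shows up in how a file is located. The sketch below contrasts a lookup in one central index with a query flooded from neighbor to neighbor under a hop limit. The peer names, file names, overlay links, and the functions `build_index`, `central_index_lookup`, and `flood_query` are all hypothetical; the real protocols' message formats are not modeled.

```python
from collections import deque

# Hypothetical peers and the files each one shares.
FILES = {"a": {"song1"}, "b": set(), "c": {"song2"}, "d": {"song3"}}
# Hypothetical overlay links between peers (undirected chain a-b-c-d).
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}


def build_index(files):
    """Central server state: file name -> peers that share it."""
    index = {}
    for peer, names in files.items():
        for name in names:
            index.setdefault(name, []).append(peer)
    return index


def central_index_lookup(index, filename):
    """Hybrid style: one server answers; the transfer itself is peer-to-peer."""
    return sorted(index.get(filename, []))


def flood_query(start, filename, ttl):
    """Pure decentralized style: forward the query to neighbors up to ttl hops."""
    hits, seen = [], {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if filename in FILES[peer]:
            hits.append(peer)
        if hops_left == 0:
            continue  # hop limit reached; do not forward further
        for neighbor in NEIGHBORS[peer]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops_left - 1))
    return sorted(hits)


index = build_index(FILES)
```

With these links, `flood_query("a", "song3", ttl=2)` finds nothing because peer `d` is three hops away, while the central index finds it in one lookup. The trade-off is that the index is a single point of failure and control.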
Skype
• Overlay P2P
Review
• Distributed Network-based Applications
– REST
– Google: GFS, MapReduce
• Decentralized Architecture
– Grid Computing
– Cloud Computing
– Peer-to-Peer: Napster, Gnutella, Skype
References
• Principled Design of the Modern Web Architecture.
– Fielding, Roy T., and Richard N. Taylor.
• Cloud Computing and Grid Computing 360-Degree Compared.
– Foster, Ian, Yong Zhao, Ioan Raicu, and Shiyong Lu.
• MapReduce: Simplified Data Processing on Large Clusters.
– Dean, Jeffrey, and Sanjay Ghemawat.
• The Google File System.
– Ghemawat, Sanjay, Howard Gobioff, and Shun-Tak Leung.