139
CPSC 426/526 Cloud Computing Ennan Zhai Computer Science Department Yale University

Cloud Computing - Zoo Computing Ennan Zhai ... • What techniques are used to support cloud? -Internet-Smart and cheap personal devices ... Zoo? •--•---•--•-

  • Upload
    doannhu

  • View
    217

  • Download
    2

Embed Size (px)

Citation preview

CPSC 426/526Cloud Computing

Ennan ZhaiComputer Science Department

Yale University

Recall: Lec-7

• In the lec-7, I talked about:- P2P vs Enterprise control- Firewall - NATs- Software defined network

Lecture Roadmap

• Cloud Computing Overview• Challenges in the Clouds• Distributed File Systems: GFS• Data Process & Analysis: MapReduce• Database: BigTable

What’s the Cloud Computing

What’s the Cloud Computing

Cloud computing is a business model for enabling convenient network access to a shared pool of configurable resources which can be rapidly provisioned and released with minimal management effort or service provider interaction.

--- according to NIST(National Institute of Standards and Technology)

Have You Used the Cloud?

Have You Used the Cloud?

Have You Used the Cloud?

Have You Used the Cloud?

Why We Like It?

• Why users like it? - Do not care where it is, it is “just there”- Access from “any” platform

Why We Like It?

• Why users like it? - Do not care where it is, it is “just there”- Access from “any” platform

Why We Like It?

Cloud Services v.s. Traditional Distributed Systems

Why We Like It?

• Why users like it? - Do not care where it is, it is “just there”- Access from “any” platform

• Why CS researchers like it? - High-performance computation with less money- Lots of hard and interesting new challenges

Building Blocks• What techniques are used to support cloud?

- Internet- Smart and cheap personal devices- Robust and scalable software systems- Virtualization- ... ...

Types of Cloud Services

• Three types of services:- Software as a Service (SaaS)

- Analogy: Restaurant. Prepares&serves entire meal, does the dishes, etc

- Platform as a Service (PaaS)- Analogy: Take-out food. Prepares meal but does not serve it.

- Infrastructure as a Service (IaaS)- Analogy: Grocery store. Provides raw ingredients.

Types of Cloud Services

• Three types of services:- Software as a Service (SaaS)

- Analogy: Restaurant. Prepares&serves entire meal, does the dishes, etc

- Platform as a Service (PaaS)- Analogy: Take-out food. Prepares meal but does not serve it.

- Infrastructure as a Service (IaaS)- Analogy: Grocery store. Provides raw ingredients.

Types of Cloud Services

• Three types of services:- Software as a Service (SaaS)

- Analogy: Restaurant. Prepares&serves entire meal, does the dishes, etc

- Platform as a Service (PaaS)- Analogy: Take-out food. Prepares meal but does not serve it.

- Infrastructure as a Service (IaaS)- Analogy: Grocery store. Provides raw ingredients.

Types of Cloud Services

• Three types of services:- Software as a Service (SaaS)

- Analogy: Restaurant. Prepares&serves entire meal, does the dishes, etc

- Platform as a Service (PaaS)- Analogy: Take-out food. Prepares meal but does not serve it.

- Infrastructure as a Service (IaaS)- Analogy: Grocery store. Provides raw ingredients.

Software as a Service (SaaS)

Hardware

Middleware

Application

Cloud Provider (i.e., SaaS Provider)

Software as a Service (SaaS)

Hardware

Middleware

Application

Cloud Provider (i.e., SaaS Provider)

• SaaS provider offers an entire application- Word processor, spreadsheet, CRM software, etc.- Customer pays cloud provider- Example: Google Apps, Salesforce.com, etc.

Software as a Service (SaaS)

Hardware

Middleware

Application

Cloud Provider (i.e., SaaS Provider)

• SaaS provider offers an entire application- Word processor, spreadsheet, CRM software, etc.- Customer pays cloud provider- Example: Google Apps, Salesforce.com, etc.

Software as a Service (SaaS)

Hardware

Middleware

Application

Customer

Cloud Provider (i.e., SaaS Provider)

• SaaS provider offers an entire application- Word processor, spreadsheet, CRM software, etc.- Customer pays cloud provider and uses the service- Example: Google Apps, Salesforce.com, etc.

Software as a Service (SaaS)

Hardware

Middleware

Application

Customer

Cloud Provider (i.e., SaaS Provider)

• SaaS provider offers an entire application- Word processor, spreadsheet, CRM software, etc.- Customer pays cloud provider and uses the service- Example: Google Apps, Salesforce.com, etc.

Software as a Service (SaaS)

Hardware

Middleware

Application

Customer

Cloud Provider (i.e., SaaS Provider)

• SaaS provider offers an entire application- Word processor, spreadsheet, CRM software, etc.- Customer pays cloud provider and uses the service- Example: Google Apps, Salesforce.com, etc.

Software as a Service (SaaS)

SaaS Example: Gmail

Hardware

Middleware

Application

Gmail Provider

SaaS Example: Gmail

Hardware

Middleware

Application

Gmail Provider

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

SaaS Example: Gmail

Hardware

Middleware

Application

Gmail Provider

BigTable

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

SaaS Example: Gmail

Hardware

Middleware

Application

Gmail Provider

BigTable

BigTable APIs

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

SaaS Example: Gmail

Hardware

Middleware

Application

Gmail Provider

Gmail

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

BigTable

BigTable APIs

SaaS Example: Gmail

Hardware

Middleware

Application

Customer

Gmail Provider

Gmail

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

BigTable

BigTable APIs

SaaS Example: Gmail

Hardware

Middleware

Application

Customer

Gmail Provider

Gmail

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

BigTable

BigTable APIs

SaaS Example: Gmail

Hardware

Middleware

Application

Customer

Gmail Provider

Gmail

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

BigTable

BigTable APIs

SaaS Example: Gmail

Hardware

Middleware

Application

Customer

Gmail Provider

Gmail

• Outsourcing your e-mail software: - Distributed, replicated message store in BigTable- Weak consistency model for some operations (e.g., msg read)- Stronger consistency for others (e.g., send msg)

BigTable

BigTable APIs

SaaS Example: Gmail

Platform as a Service (PaaS)

Hardware

Middleware

Application

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- Customer pays SaaS provider for the service- SaaS provider pays the cloud for the platform- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

Hardware

Middleware

Application

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- Customer pays SaaS provider for the service- SaaS provider pays the cloud for the platform- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

Hardware

Middleware

Application

App Provider

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- App provider pays the cloud for the platform- Customer pays App provider for the service- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

Hardware

Middleware

Application

App Provider

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- App provider pays the cloud for the platform- Customer pays App provider for the service- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

Hardware

Middleware

Application

CustomerApp Provider

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- App provider pays the cloud for the platform- Customer pays app provider for the service- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

Hardware

Middleware

Application

CustomerApp Provider

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- App provider pays the cloud for the platform- Customer pays app provider for the service- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

Hardware

Middleware

Application

CustomerApp Provider

• Cloud provides middleware/infrastructure- For example, Microsoft Common Language Runtime (CLR)- App provider pays the cloud for the platform- Customer pays app provider for the service- Example: Windows Azure, Google App Engine, etc.

Cloud Provider (i.e., PaaS Provider)

Application

Platform as a Service (PaaS)

PaaS Example: Facebook

Hardware

Middleware

Application

Facebook Provider

PaaS Example: Facebook

Hardware

Middleware

Application

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- Third-party game applications- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook Provider

PaaS Example: Facebook

Hardware

Middleware

Application

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- Third-party game applications- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook APIs

Facebook Clusters

Facebook Provider

PaaS Example: Facebook

Hardware

Middleware

Application

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- Third-party game applications- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook APIs

Facebook Clusters

Facebook Provider

PaaS Example: Facebook

Hardware

Middleware

Application

App Provider

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- App providers adopt their services (e.g., game) onto Facebook- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook Game

Facebook APIs

Facebook Clusters

Facebook Provider

PaaS Example: Facebook

Hardware

Middleware

Application

App Provider

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- App providers adopt their services (e.g., game) onto Facebook- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook APIs

Facebook Clusters

Facebook Provider

Facebook Game

PaaS Example: Facebook

Hardware

Middleware

Application

CustomerApp Provider

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- App providers adopt their services (e.g., game) onto Facebook- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook Game

Facebook APIs

Facebook Clusters

Facebook Provider

Facebook Game

PaaS Example: Facebook

Hardware

Middleware

Application

CustomerApp Provider

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- App providers adopt their services (e.g., game) onto Facebook- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook Game

Facebook APIs

Facebook Clusters

Facebook Provider

Facebook Game

PaaS Example: Facebook

Hardware

Middleware

Application

CustomerApp Provider

• Facebook offers PaaS capabilities to App provider- Facebook APIs allow access to social network properties- App providers adopt their services (e.g., game) onto Facebook- Facebook itself also uses PaaS provided by its company, e.g., log

analysis for recommendations

Facebook Game

Facebook APIs

Facebook Clusters

Facebook Provider

Facebook Game

PaaS Example: Facebook

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Middleware

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- Customer pays SaaS provider for the service- SaaS provider pays the cloud for the resources- Example: Amazon Web Services, Rackspace Cloud, etc.

Cloud Provider (i.e., IaaS Provider)

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Middleware

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- Customer pays SaaS provider for the service- SaaS provider pays the cloud for the resources- Example: Amazon Web Services, Rackspace Cloud, etc.

Cloud Provider (i.e., IaaS Provider)

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Middleware

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- App provider pays the cloud for the resources- Customer pays App provider for the service- Example: Amazon Web Services, Rackspace Cloud, etc.

App Provider

Cloud Provider (i.e., IaaS Provider)

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Middleware

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- App provider pays the cloud for the resources- Customer pays App provider for the service- Example: Amazon Web Services, Rackspace Cloud, etc.

App Provider

Cloud Provider (i.e., IaaS Provider)

Middleware

Application

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Customer

Middleware

App Provider

Cloud Provider (i.e., IaaS Provider)

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- App provider pays the cloud for the resources- Customer pays App provider for the service- Example: Amazon Web Services, Rackspace Cloud, etc.

Middleware

Application

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Customer

Middleware

App Provider

Cloud Provider (i.e., IaaS Provider)

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- App provider pays the cloud for the resources- Customer pays App provider for the service- Example: Amazon Web Services, Rackspace Cloud, etc.

Middleware

Application

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Customer

Middleware

App Provider

Cloud Provider (i.e., IaaS Provider)

• Cloud provides raw computing resources- Virtual machines, blade servers, hard disk, etc.- App provider pays the cloud for the resources- Customer pays App provider for the service- Example: Amazon Web Services, Rackspace Cloud, etc.

Middleware

Application

Infrastructure as a Service (IaaS)

Hardware

Middleware

Application Application

Middleware

Amazon

IaaS Example: EC2 and S3

Hardware

Middleware

Application Application

Middleware

Amazon

EC2 S3

IaaS Example: EC2 and S3

Hardware

Middleware

Application Application

Middleware

Amazon

EC2 S3

Netflix Provider

• Netflix (app) heavily depends on Amazon AWS: - Media files are stored in S3- Transcoding to target devices (e.g., iPad) using EC2- Analysis of streaming sessions based on Elastic MapReduce

IaaS Example: EC2 and S3

Hardware

Middleware

Application

Middleware

Amazon

EC2 S3

Netflix Provider

• Netflix (app) heavily depends on Amazon AWS: - Media files are stored in S3- Transcoding to target devices (e.g., iPad) using EC2- Analysis of streaming sessions based on Elastic MapReduce

IaaS Example: EC2 and S3

Hardware

Middleware

Application

Middleware

Amazon

EC2 S3

Netflix Provider

• Netflix (app) heavily depends on Amazon AWS: - Media files are stored in S3- Transcoding to target devices (e.g., iPad) using EC2- Analysis of streaming sessions based on Elastic MapReduce

Netflix

IaaS Example: EC2 and S3

Hardware

Middleware

Application

Middleware

Amazon

EC2 S3

Netflix Provider

• Netflix (app) heavily depends on Amazon AWS: - Media files are stored in S3- Transcoding to target devices (e.g., iPad) using EC2- Analysis of streaming sessions based on Elastic MapReduce

Netflix

IaaS Example: EC2 and S3

Hardware

Middleware

Application Application

Middleware

Amazon

EC2 S3

CustomerNetflix Provider

• Netflix (app) heavily depends on Amazon AWS: - Media files are stored in S3- Transcoding to target devices (e.g., iPad) using EC2- Analysis of streaming sessions based on Elastic MapReduce

Netflix

IaaS Example: EC2 and S3

Hardware

Middleware

Application Application

Middleware

Amazon

EC2 S3

CustomerNetflix Provider

• Netflix (app) heavily depends on Amazon AWS: - Media files are stored in S3- Transcoding to target devices (e.g., iPad) using EC2- Analysis of streaming sessions based on Elastic MapReduce

Netflix

IaaS Example: EC2 and S3

Types of Cloud Services

• Three types of services:- Software as a Service (SaaS)

- Analogy: Restaurant. Prepares&serves entire meal, does the dishes, etc

- Platform as a Service (PaaS)- Analogy: Take-out food. Prepares meal but does not serve it.

- Infrastructure as a Service (IaaS)- Analogy: Grocery store. Provides raw ingredients.

Types of Cloud Services

• Three types of services:- Software as a Service (SaaS)

- Analogy: Restaurant. Prepares&serves entire meal, does the dishes, etc

- Platform as a Service (PaaS)- Analogy: Take-out food. Prepares meal but does not serve it.

- Infrastructure as a Service (IaaS)- Analogy: Grocery store. Provides raw ingredients.

Zoo?

The Major Cloud Providers• Amazon is the big player:

- Infrastructure as a service (e.g., EC2)- Storage as a service (e.g., S3)

• But there are many others:- Microsoft Azure: It has similar services to Amazon, with an

emphasis on .Net programming model- Google App Engine: It offers programming interface, Hadoop,

also software as a service, e.g., Gmail and Google Docs- IBM, HP, Yahoo!: They seem to focus on enterprise scale cloud

apps

The Major Cloud Providers• Amazon is the big player:

- Infrastructure as a service (e.g., EC2)- Storage as a service (e.g., S3)

• But there are many others:- Microsoft Azure: It has similar services to Amazon, with an

emphasis on .Net programming model- Google App Engine: It offers programming interface, Hadoop,

also software as a service, e.g., Gmail and Google Docs- IBM, HP, Yahoo!: They seem to focus on enterprise scale cloud

apps

Lecture Roadmap

• Cloud Computing Overview• Challenges in the Clouds• Distributed File Systems: GFS• Data Process & Analysis: MapReduce• Database: BigTable

Challenges?

In the cloud, we have much more data and users than before

PC

Data! Users! Traffic!

• What if one computer is not enough? - Buy a bigger (server-class) computer

PC

Data! Users! Traffic!

• What if one computer is not enough? - Buy a bigger (server-class) computer

PC Server

Data! Users! Traffic!

• What if one computer is not enough? - Buy a bigger (server-class) computer

PC Server

• What if the biggest computer is not enough? - Buy many computers

Data! Users! Traffic!

• What if one computer is not enough? - Buy a bigger (server-class) computer

PC Server Cluster

• What if the biggest computer is not enough? - Buy many computers

Data! Users! Traffic!

Data! Users! Traffic!

Rack Data! Users! Traffic!

Network switches (connects nodes with each other and with other racks)

Rack Data! Users! Traffic!

Network switches (connects nodes with each other and with other racks)

Many nodes/blades (often identical)

Rack Data! Users! Traffic!

Network switches (connects nodes with each other and with other racks)

Many nodes/blades (often identical)

Storage device(s)

Rack Data! Users! Traffic!

• What if cluster is too big to fit into machine room? - Build a separate building for the cluster- Building can have lots of cooling and power- Result: Data center

PC Server Cluster

Data! Users! Traffic!

• What if cluster is too big to fit into machine room? - Build a separate building for the cluster- Building can have lots of cooling and power- Result: Data center

PC Server Cluster

Data! Users! Traffic!

• What if cluster is too big to fit into machine room? - Build a separate building for the cluster- Building can have lots of cooling and power- Result: Data center

PC Server Cluster Data center

Data! Users! Traffic!

Google’s Datacenter in Oregon

Data centers (size of a football field)

Google’s Datacenter in Oregon

• A warehouse-sized computer - A single data center can easily contain 10,000 racks with

100 cores in each rack (1,000,000 cores total)

Data centers (size of a football field)

Google’s Datacenter in Oregon

Google’s Datacenter Locations

Google’s Datacenter Locations

Google’s Datacenter Locations

Challenges?• How to manage a huge group of data?

- How to store the data? - How to process and extract something from the data?- How to handle multiple availability and consistency?- How to preserve the data privacy?

Example: Google• How to manage a huge group of data?

- How to store the data? - How to process and extract something from the data?- How to handle multiple availability and consistency?- How to preserve the data privacy?

Google File System & BigTable

MapReducePaxos

Example: Google

Example: Google

Google File System (GFS) - Lec 8

Example: Google

Google File System (GFS) - Lec 8

MapReduce - Lec 9 BigTable - Lec 9

Example: Google

Google File System (GFS) - Lec 8

Google Applications, e.g., Gmail and Google Map

MapReduce - Lec 9 BigTable - Lec 9

Another OpenSource Example: Apache

Hadoop Distributed File System (HDFS)

Apache MapReduce Apache HBase

Apache Applications

Lecture Roadmap

• Cloud Computing Overview• Challenges in the Clouds• Distributed File Systems: GFS• Data Process & Analysis: MapReduce• Database: BigTable

The Google File System [SOSP’03]

• GFS aims to offer file services for google applications:- Scalable distributed file system- Designed for large data-intensive applications- Fault-tolerant; runs on commodity hardware- Delivers high performance to a large number of clients

NFS vs GFS

• Single machine makes part of its file system available to other machines

• Sequential or random access• PRO: Simplicity, generality,

transparency

• CON: Storage capacity and throughput limited by single server

• Single virtual file system spread over many machines

• Optimized for sequential read and local accesses

• PRO: High throughput, high capacity

• CON: Specialized for particular types of applications

NFS vs GFS

• Single machine makes part of its file system available to other machines

• Sequential or random access• PRO: Simplicity, generality,

transparency

• CON: Storage capacity and throughput limited by single server

• Single virtual file system spread over many machines

• Optimized for sequential read and local accesses

• PRO: High throughput, high capacity

• CON: Specialized for particular types of applications

Design Assumptions

• Assumptions for conventional file systems do not work- e.g., “most files are small” and “short lifetimes”

• Component failures are norm:- File system = thousands of storage machines- Much more complex cases in practice

• Files are huge. n-GB/TB files are norm- I/O operations and block size choices are affected

Design Assumptions• Most files are appended, not overwritten

- Random writes in a file are quite rare- Once created, files are mostly read; often seq

• Workload:- Large streaming reads- Large appends- Hundreds of processes append to a file concurrently

Design Assumptions• Most files are appended, not overwritten

- Random writes in a file are quite rare- Once created, files are mostly read; often seq

• Workload:- Large streaming reads- Large appends- Hundreds of processes append to a file concurrently

GFS was originally built for web indexing.

File System Interfaces

• Operations- Basic: create/delete/open/close/read/write- Additional: snapshot/append

File System Interfaces

• Operations- Basic: create/delete/open/close/read/write- Additional: snapshot/append

Allow multi-clients to append atomically without locking

GFS Servers

GFS cluster: N Chunkservers + 1 Master

chunk server

Master

chunk server

chunk server

chunk server

chunk server

Data Storage: fixed-size chunks Chunks replicated on several system

File system metadata Maps files to chunks

. . .

GFS Master & Chunkservers

GFS Servers

GFS cluster: N Chunkservers + 1 Master

chunk server

Master

chunk server

chunk server

chunk server

chunk server

Data Storage: fixed-size chunks Chunks replicated on several system

File system metadata Maps files to chunks

. . .

GFS Master & Chunkservers

GFS Servers

GFS cluster: N Chunkservers + 1 Master

chunk server

Master

chunk server

chunk server

chunk server

chunk server

Data Storage: fixed-size chunks Chunks replicated on several system

File system metadata Maps files to chunks

. . .

GFS Master & Chunkservers

GFS Files

checkpoint image

operation log

In-memory FS metadata

chunkserver chunkserver chunkserver

A File is made of 64MB chunks

That are replicated for fault-tolerance

Chunks live on chunkservers

The master manages the file system namespace master

Files in GFS

GFS Files

checkpoint image

operation log

In-memory FS metadata

chunkserver chunkserver chunkserver

A File is made of 64MB chunks

That are replicated for fault-tolerance

Chunks live on chunkservers

The master manages the file system namespace master

Files in GFS

GFS Files

checkpoint image

operation log

In-memory FS metadata

chunkserver chunkserver chunkserver

A File is made of 64MB chunks

That are replicated for fault-tolerance

Chunks live on chunkservers

The master manages the file system namespace master

Files in GFS

GFS Files

checkpoint image

operation log

In-memory FS metadata

chunkserver chunkserver chunkserver

A File is made of 64MB chunks

That are replicated for fault-tolerance

Chunks live on chunkservers

The master manages the file system namespace master

Files in GFS

Chunks and Chunkservers• Chunk size = 64 MB (default)

- 32-bit checksum with each chunk

• Chunk handler- Globally unique 64-bit number- Assigned by the master when creation

• Where are chunks stored?- Local disk as Linux files

• Each chunk is replicated across multiple nodes- Three replicas (default)- More replicas for popular files to avoid hotspots

Master• Maintains all file system metadata

- Namespace, access control info, filename to chunk mapping, current locations of chunks

• Manages- Chunk leases (locks), garbage collection, chunk

migration

• Periodically communicates with all nodes- Via heartbeat messages- To get state and send commands

Client Interaction Model

• GFS client code linked into each application- No OS-level API

- Interacts with mater for metadata operations- Interacts directly with chunkservers for file data

• Clients cache metadata- E.g., location of a file’s chunks

Reading Files

• 1. Contact the master node• 2. Get file’s metadata: list chunk handlers• 3. Get the location of each of the chunk handles

- Multiple replicated chunkservers per chunk

• 4. Contact any available chunkserver for chunk data

Metadata:/usr/data/foo

-> 1, 2, 4/usr/data/bar

-> 3, 5

Master: Stores metadata only

Chunkserver: Store Chunks

1 2 1 5 2 3

5

1 4

35

2 4 3 4

Reading Files

• Master holds metadata for the files• Chunkservers hold the actual chunks

- Each chunk is replicated three times

• When a client wants to read file

Metadata:/usr/data/foo

-> 1, 2, 4/usr/data/bar

-> 3, 5

Master: Stores metadata only

1 2 1 5 2 3

5

1 4

35

2 4 3 4

Reading Files

Clientbar?

Chunkserver: Store Chunks

• Master holds metadata for the files• Chunkservers hold the actual chunks

- Each chunk is replicated three times

• When a client wants to read file

Metadata:/usr/data/foo

-> 1, 2, 4/usr/data/bar

-> 3, 5

Master: Stores metadata only

1 2 1 5 2 3

5

1 4

35

2 4 3 4

Reading Files

Clientbar?

< Chunkserver IPs, 3, 5 >

Chunkserver: Store Chunks

• Master holds metadata for the files• Chunkservers hold the actual chunks

- Each chunk is replicated three times

• When a client wants to read file

Metadata:/usr/data/foo

-> 1, 2, 4/usr/data/bar

-> 3, 5

Master: Stores metadata only

1 2 1 5 2 3

5

1 4

35

2 4 3 4

• Master holds metadata for the files• Chunkservers hold the actual chunks

- Each chunk is replicated three times

• When a client wants to read file

Reading Files

Client

< Chu

nkser

ver IP

s, 3, 5

>

Chunkserver: Store Chunks

Writing to Files• Less frequent than reading• Master grants a chunk lease to one of the replicas

- This replica will be the primary replica chunkserver

- Primary can request lease extensions, if needed- Master increases the chunk version number and informs

replicas

Writing to Files• Phase 1: Send data (Deliver data but do not write to the file)

• A client is given a list of replicas- Identifying the primary and secondaries

• Client writes to the closest replica- Pipeline forwarding

• Chunkservers store this data in a cache

Client Chunkserver1

Chunkserver2

Chunkserver3

Writing to Files• Phase 2: Write data (Commit it to the file)

• Client waits for replicas to ack. receiving data• Send a write request to the primary• The primary is responsible for serialization of writes

(applying then forwarding)• Once all ack. have been received, the primary ack. the client

Client PrimaryChunkserver

SecondaryChunkserver 2

SecondaryChunkserver 3

Writing to Files• Note:

- Data Flow (phase 1) is different from Control Flow (phase 2)

• Data Flow:- Client to chunkserver to chunkserver to chunkserver ...- Ordering does not matter

• Control Flow (commit):- Client to primary to all secondaries

- Ordering maintained

Writing to Files• Note:

- Data Flow (phase 1) is different from Control Flow (phase 2)

• Data Flow:- Client to chunkserver to chunkserver to chunkserver ...- Ordering does not matter

• Control Flow (commit):- Client to primary to all secondaries

- Ordering maintained

Chunk version numbers are used to detect if any replica has stale data (was not updated because it was down)

Files in GFS• Google cluster environment

- Core services: GFS + cluster scheduling system

- Typically 100s to 1000s of active jobs

- 200+ clusters, many with 1000s of machines- Pools of 1000s of clients

Hadoop Distributed File System (HDFS)• A project based on GFS

• An open-source distributed file system - Distributed file storage- The same assumption as GFS

NameNode DataNode2DataNode1 DataNode3 DataNode4 DataNode5

Hadoop Distributed File System (HDFS)

HDFS

NameNode DataNode2DataNode1 DataNode3 DataNode4 DataNode5

Hadoop Distributed File System (HDFS)

HDFS

NameNode DataNode2DataNode1 DataNode3 DataNode4 DataNode5

Data

Hadoop Distributed File System (HDFS)

HDFS

NameNode DataNode2DataNode1 DataNode3 DataNode4 DataNode5

Metadata 1 2 3

Hadoop Distributed File System (HDFS)

HDFS

NameNode DataNode2DataNode1 DataNode3 DataNode4 DataNode5

1 2 3Metadata

Data

Hadoop Distributed File System (HDFS)

HDFS

NameNode DataNode2DataNode1 DataNode3 DataNode4 DataNode5

1 2 1 3 1 2 3 2 3Metadata

Data

Hadoop Distributed File System (HDFS)

Lecture Roadmap

• Cloud Computing Overview• Challenges in the Clouds• Distributed File Systems: GFS• Data Process & Analysis: MapReduce• Database: BigTable