Lecture 6 – Cloud Application Development, using Google App Engine as an example
922EU3870 – Cloud Computing and Mobile Platforms, Autumn 2009 (2009/10/19)
http://code.google.com/appengine/
Ping Yeh (葉平), Google, Inc.




Numbers real world engineers should know

L1 cache reference 0.5 ns

Branch mispredict 5 ns

L2 cache reference 7 ns

Mutex lock/unlock 100 ns

Main memory reference 100 ns

Compress 1 KB with Zippy 10,000 ns

Send 2 KB through 1 Gbps network 20,000 ns

Read 1 MB sequentially from memory 250,000 ns

Round trip within the same data center 500,000 ns

Disk seek 10,000,000 ns

Read 1 MB sequentially from network 10,000,000 ns

Read 1 MB sequentially from disk 30,000,000 ns

Round trip between California and Netherlands 150,000,000 ns
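To make the table concrete, here is a small arithmetic sketch in plain Python. The dictionary keys and the helper name `total_ms` are made up for illustration; the values are the nanosecond figures from the slide.

```python
# Latency figures from the slide, in nanoseconds.
LATENCY_NS = {
    "l1_cache_reference": 0.5,
    "main_memory_reference": 100,
    "datacenter_round_trip": 500000,
    "disk_seek": 10000000,
    "read_1mb_from_memory": 250000,
    "read_1mb_from_network": 10000000,
    "read_1mb_from_disk": 30000000,
    "california_netherlands_round_trip": 150000000,
}


def total_ms(steps):
    """Sum the cost of sequential steps, given as (operation, count) pairs."""
    total_ns = sum(LATENCY_NS[op] * count for op, count in steps)
    return total_ns / 1e6


# A request that makes 10 sequential round trips inside one data center.
in_datacenter = total_ms([("datacenter_round_trip", 10)])
# A request that needs a single intercontinental round trip.
cross_ocean = total_ms([("california_netherlands_round_trip", 1)])
# How many main memory references fit into one disk seek.
seek_vs_memory = LATENCY_NS["disk_seek"] / LATENCY_NS["main_memory_reference"]
```

With these numbers, ten round trips inside a data center cost about 5 ms, one California to Netherlands round trip costs about 150 ms, and one disk seek takes as long as 100,000 main memory references.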


Web Applications and APIs

• A web application = a piece of software with

– input: taken from HTTP and URL

– output: contained in HTML, sent via HTTP

• contained stuff: HTML, javascript, CSS, flash, etc

• A web API = a piece of software with

– input: taken from HTTP and URL

– output: data in a format like XML or JSON sent via HTTP

• The boundary between the two is not clear-cut
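A minimal sketch of the distinction, in plain Python with no web framework. The function name `handle`, the `/api/` path prefix, and the `name` query parameter are all hypothetical; the point is only that the same input (a URL) can yield HTML for a browser or JSON for a program.

```python
import json
from html import escape
from urllib.parse import urlparse, parse_qs


def handle(url):
    """Return (content_type, body) for a request URL."""
    parsed = urlparse(url)
    name = parse_qs(parsed.query).get("name", ["world"])[0]
    if parsed.path.startswith("/api/"):
        # Web API: machine-readable data.
        return "application/json", json.dumps({"greeting": "Hello, " + name})
    # Web application: HTML meant for a browser (user input is escaped).
    body = "<html><body><h1>Hello, %s</h1></body></html>" % escape(name)
    return "text/html", body


page = handle("/hello?name=Ann")
data = handle("/api/hello?name=Ann")
```

Both branches share the input handling, which is one reason the line between a web application and a web API is blurry in practice.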


Layers of a Web Site

Diagram: a request (HTTP, URL) comes in and a response (HTTP, HTML) goes out.

Layers shown in the diagram:
• Hardware
• (optional) Virtual Machine
• Operating System
• Libraries
• Application Software
• Frontend / Routing
• Reverse Proxy

Other labels in the diagram:
• Instance
• Management: provisioning, accounting, monitoring, start/stop servers, ...
• Elasticity

Service models:
• Software as a Service (SaaS): an instance running on hardware; may or may not have elasticity.
• Hosting service: allocation of hardware or virtual machines, no elasticity.
• Infrastructure as a Service (IaaS): hardware + Virtual Machine + Frontend/Routing + Management, with elasticity.
• Platform as a Service (PaaS): IaaS + OS + Libraries + Reverse Proxy.


Designing a Web Application

• Function: What does it do?
  – Your ideas here
• User Experience: How to use it?
  – Flow, simplicity, interactivity, etc.
• Security: Is my data safe?
  – Encryption, password strength, data replication, XSRF, buffer overflow, privacy measures, etc.
• Availability: Is it available when I need it?
  – Scalability, DDoS defense, fault tolerance, routing, etc.
• Performance: How long do I need to wait?
  – Caching, data organization, static content, etc.
• Analysis & Admin
  – How to know user's behavior? Logs, fast analysis
  – How to manage the app? Admin UI, data {im|ex}port, internal process

Cloud services can help


Cloud Platforms

• IaaS: Amazon EC2 / S3 / SimpleDB, GoGrid, Sun, RackSpace Cloud, and others.

• Infrastructure software: Eucalyptus

• PaaS

– Engine Yard and Heroku: Ruby on Rails on EC2

– Force.com: for applications that run on salesforce.com

– Google App Engine: Python and Java on GFS / BigTable / Google's frontend

– Microsoft Azure: .NET on Microsoft's infrastructure

• Platform software: AppScale, Vertebra


Google App Engine

http://code.google.com/appengine/


Google App Engine at a Glance


Design Motivations

• Build on Existing Google Technology

– Frontend, BigTable, GFS

• Provide an Integrated Environment

• Encourage Small Per-Request Footprints

• Encourage Fast, Efficient Requests

• Maintain Isolation Between Applications

• Encourage Statelessness and Specialization

• Require Partitioned Data Model (from BigTable)


Life of an App Engine Request


Routing

• Routed to the nearest Google datacenter

• Travels over Google's network

– Same infrastructure other Google products use

– Lots of advantages for free


Request for Static Content

Routing at the Front End
• Google App Engine Front Ends
  – Load balancing
  – Routing
  – Front Ends route static requests to specialized serving infrastructure

Static Content Servers
• Google Static Content Serving
  – Built on shared Google infrastructure
  – Static files are physically separate from code files


Request For Static Content

– Back to the Front End and out to the user

• Front End handles connection to the user

• Frees up Static Content server

– Specialized infrastructure

• App Server runtimes don't serve static content

– Flexible cache expiration settings

Response to the user


Request for Static Content: defining static content

Java Runtime: appengine-web.xml

    ...
    <static-files>
      <include path="/**.png" />
      <exclude path="/data/**.png" />
    </static-files>
    ...

Python Runtime: app.yaml

    ...
    - url: /images
      static_dir: static/images
    ...

  OR

    ...
    - url: /images/(.*)
      static_files: static/images/\1
      upload: static/images/(.*)
    ...


Request for Dynamic Content

Front Ends

– Route dynamic requests to App Servers

App Servers

– Serve dynamic requests

– Where your code runs

App Master

– Schedules applications

– Informs Front Ends


Request for Dynamic Content

• Checks for cached runtime
  – If it exists, no initialization
• Executes the request
• Caches the runtime

Consequences / Opportunities
  – Slow first request, faster subsequent requests
  – Optimistically cache data in your runtime!


App Servers - What do they do?

– Many applications

– Many concurrent requests

• Smaller footprint + faster requests = more apps

– Enforce Isolation

• Keeps apps safe from each other

– Enforce statelessness

• Allows for scheduling flexibility

– Service API requests

• Provides access to other services


App Servers

1. App issues API call

2. App Server accepts

3. App Server blocks runtime

4. App Server issues call

5. Returns the response

• Use APIs to do things you don't want to do in your runtime, such as...

Requests accessing APIs


APIs


Memcacheg

A more persistent in-memory cache

– Distributed in-memory cache
– Very fast
– memcacheg
  • Like memcached
  • Also written by Brad Fitzpatrick
  • adds: set_multi, get_multi, add_multi
– Optimistic caching
– Very stable, robust and specialized
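A sketch of optimistic caching with batched calls, in plain Python. `FakeCache` is an in-process stand-in written for this example and is not the App Engine memcache API; it only borrows the method names `get_multi` and `set_multi` from the slide. `fetch_many` and `load_from_store` are hypothetical helpers.

```python
class FakeCache(object):
    """In-process stand-in for a distributed cache. Not the App Engine API."""

    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        # Return only the keys that are present; misses are simply absent.
        return dict((k, self._data[k]) for k in keys if k in self._data)

    def set_multi(self, mapping):
        self._data.update(mapping)


def fetch_many(cache, keys, load_from_store):
    """Batch read-through: one cache lookup, one store lookup for the misses."""
    found = cache.get_multi(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        loaded = load_from_store(missing)
        cache.set_multi(loaded)
        found.update(loaded)
    return found


store_calls = []


def load_from_store(keys):
    """Hypothetical slow lookup, e.g. a datastore query."""
    store_calls.append(list(keys))
    return dict((k, k.upper()) for k in keys)


cache = FakeCache()
first = fetch_many(cache, ["a", "b"], load_from_store)
second = fetch_many(cache, ["a", "b", "c"], load_from_store)
```

The second call goes to the slow store only for `"c"`. Because a cache entry can disappear at any time, the code treats every hit as a bonus and always knows how to rebuild a miss.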


The App Engine Datastore

Persistent storage

http://labs.google.com/papers/bigtable.html


The App Engine Datastore

– Your data is already partitioned on day one

• Use Entity Groups

– Explicit Indexes make for fast reads

• But slower writes

– Replicated and fault tolerant

• On commit: ≥3 machines

• Geographically distributed (shortly thereafter)

– Bonus: Keep globally unique IDs for free

Persistent storage

(more about it later)


Other APIs

• GMail
• Google Accounts
• Picasaweb
• Gadget API
• Task Queue
• Google Talk


The App Engine Datastore


The Datastore Is...

• Transactional

• Natively Partitioned

• Hierarchical

• Schema-less

• Based on Bigtable

• Not a relational database

• Not a SQL engine


Simplifying Storage

• Simplify development of apps

• Simplify management of apps

• App Engine services build on Google’s strengths

• Scale always matters

– Request volume

– Data volume


Datastore Storage Model

• Basic unit of storage is an Entity consisting of

– Kind (compare: table)

– Key (compare: primary key)

– Entity Group (compare: partition)

– 0..N typed Properties (compare: columns)

Example entity:

    Kind           Person
    Entity Group   /Person:Ethel
    Key            /Person:Ethel
    Age            Int64: 30
    Best Friend    Key:/Person:Sally

Python:

    from google.appengine.ext import db

    class Person(db.Model):
        Age = db.IntegerProperty()
        BestFriend = db.ReferenceProperty()

    sally = Person()
    sally.put()
    # Refer to the instance created above (lowercase "sally").
    ethel = Person(Age=30, BestFriend=sally)
    ethel.put()


Noteworthy Datastore Features

• Ancestor

• Heterogeneous property types

• Multi-value properties

• Variable properties

Example entities (both in the same entity group, /Person:Ethel):

    Kind           Person
    Entity Group   /Person:Ethel
    Key            /Person:Ethel
    Age            Int64: 30
    Best Friend    Key:/Person:Sally

    Kind           Person
    Entity Group   /Person:Ethel
    Key            /Person:Ethel/Person:Jane
    Age            Double: 8.5
    Best Friend    Key:/Person:Eloise, Key:/Person:Patty
    Grade          Int64: 3

Python:

    class Person(db.Expando):
        BestFriend = db.ListProperty(db.Key)

    # sally, eloise and patty are assumed to be db.Key values
    # of Person entities that already exist.
    ethel = Person(BestFriend=[sally])
    ethel.Age = 30
    ethel.put()

    # parent=ethel places jane in ethel's entity group.
    jane = Person(BestFriend=[eloise, patty], parent=ethel)
    jane.Age = 8.5
    jane.Grade = 3
    jane.put()
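The key notation on this slide (`/Person:Ethel/Person:Jane`) can be read as a path whose first element names the entity group. The sketch below works on that textual notation only; it is plain Python and not the datastore's key API, and the function names are made up for illustration.

```python
def parse_key_path(key_path):
    """Split '/Person:Ethel/Person:Jane' into [('Person', 'Ethel'), ('Person', 'Jane')]."""
    elements = [part for part in key_path.split("/") if part]
    if not elements:
        raise ValueError("empty key path")
    return [tuple(element.split(":", 1)) for element in elements]


def entity_group(key_path):
    """The entity group is identified by the root (first) element of the path."""
    kind, name = parse_key_path(key_path)[0]
    return "/%s:%s" % (kind, name)


def parent(key_path):
    """Return the parent key path, or None for a root entity."""
    elements = parse_key_path(key_path)[:-1]
    if not elements:
        return None
    return "".join("/%s:%s" % element for element in elements)


jane = "/Person:Ethel/Person:Jane"
ethel = "/Person:Ethel"
```

Here `entity_group(jane)` and `entity_group(ethel)` both give `/Person:Ethel`, which is what "same entity group" means for the two example entities.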


Datastore Transactions

• Transactions apply to a single Entity Group

– Watch out for contention!

– Global transactions are feasible

• get(), put(), delete() are transactional

• Queries cannot participate in transactions (yet)

/Person:Ethel/Person:Jane

/Person:Ethel

/Person:Max

Transaction
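The single-entity-group rule can be illustrated with a toy in-memory store. Everything here is hypothetical (`ToyStore`, `CrossGroupError`, `root_of`); it mimics only two properties from the slide: a transaction may touch one entity group, and its changes are applied together or not at all. It says nothing about how the real datastore implements this.

```python
class CrossGroupError(Exception):
    """Raised when a transaction touches more than one entity group."""


def root_of(key_path):
    return "/" + [part for part in key_path.split("/") if part][0]


class ToyStore(object):
    """In-memory toy, not the datastore: entities are dicts keyed by key path."""

    def __init__(self):
        self.entities = {}

    def run_in_transaction(self, key_paths, update):
        roots = set(root_of(k) for k in key_paths)
        if len(roots) != 1:
            raise CrossGroupError("keys span entity groups: %s" % sorted(roots))
        # Work on copies so a failing update leaves the store untouched.
        staged = dict((k, dict(self.entities.get(k, {}))) for k in key_paths)
        update(staged)
        self.entities.update(staged)  # commit all changes together


store = ToyStore()


def birthday(staged):
    staged["/Person:Ethel"]["Age"] = 31
    staged["/Person:Ethel/Person:Jane"]["Age"] = 9.5


store.run_in_transaction(["/Person:Ethel", "/Person:Ethel/Person:Jane"], birthday)
```

Calling `run_in_transaction` with `/Person:Ethel` and `/Person:Max` raises `CrossGroupError`, since the two keys have different roots.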


Transparent Entity Group Management

• Entity Group layout is important (* probably the most crucial design decision of your application *)

– Write throughput (1 – 10 writes/second in an entity group)

– Atomicity of updates

• Object relationships can be described as “owned” or “unowned”

– owned: an entity doesn't make sense without another entity, e.g., chapter and book.

– unowned: otherwise, e.g., car and person.

• We let ownership imply co-location within an Entity Group


Under the covers of the

App Engine Datastore

Slides are at

http://snarfed.org/space/datastore_talk.html


Porting: check your Transactions

• Identify “roots” in your data model

– User is often a good choice for online services

• Identify operations that transact on multiple roots

• Analyze the impact of partial success and then either

– refactor

– disable the transaction

– disable the transaction and write compensating logic
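A sketch of the third option, compensating logic, for an operation that spans two roots. The account example, the function names, and the plain dict standing in for storage are all assumptions made for illustration; each of `debit` and `credit` stands in for a separate single-root transaction.

```python
class InsufficientFunds(Exception):
    pass


def debit(accounts, user, amount):
    """Stands in for a transaction on one root (the user's entity group)."""
    if accounts[user] < amount:
        raise InsufficientFunds(user)
    accounts[user] -= amount


def credit(accounts, user, amount):
    """Stands in for a second, independent transaction on another root."""
    if user not in accounts:
        raise KeyError(user)
    accounts[user] += amount


def transfer(accounts, src, dst, amount):
    debit(accounts, src, amount)       # step 1 commits on its own
    try:
        credit(accounts, dst, amount)  # step 2 may fail after step 1 succeeded
    except Exception:
        credit(accounts, src, amount)  # compensate: undo step 1
        raise


accounts = {"ann": 100, "bob": 20}
transfer(accounts, "ann", "bob", 30)
```

Between step 1 and the compensation, other readers can observe the partial state, and the compensation itself can fail. That is the impact of partial success the slide asks you to analyze.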


Porting: check your queries

• Shift processing from reads to writes

• Identify joins

– Denormalize or rewrite as multiple queries

• Identify unsupported filter operations (distinct, toUpper())

– Rewrite as multiple queries

– Filter in-memory

Relational Database

    select * from PERSON p, ADDRESS a
    where a.person_id = p.id and p.age > 25 and a.country = "US"

App Engine Datastore

    select from com.example.Person where age > 25 and country = "US"
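The move from the join to the single-kind query can be sketched with plain Python lists standing in for stored records. The sample data and function names are invented; the idea shown is denormalizing at write time (copying `country` onto each person) and applying an unsupported filter in memory.

```python
people = [
    {"id": 1, "name": "Ann", "age": 31},
    {"id": 2, "name": "Bob", "age": 22},
    {"id": 3, "name": "Cho", "age": 40},
]
addresses = [
    {"person_id": 1, "country": "US"},
    {"person_id": 2, "country": "US"},
    {"person_id": 3, "country": "tw"},
]


def denormalize(people, addresses):
    """Done once at write time: copy the country onto each person record."""
    country_of = dict((a["person_id"], a["country"]) for a in addresses)
    return [dict(p, country=country_of.get(p["id"])) for p in people]


def older_than_in_country(persons, min_age, country):
    """The join-free read: two filters on a single kind."""
    return [p["name"] for p in persons
            if p["age"] > min_age and p["country"] == country]


def in_country_ignoring_case(persons, country):
    """An unsupported filter such as toUpper(), applied in memory instead."""
    return [p["name"] for p in persons
            if p["country"] and p["country"].upper() == country.upper()]


persons = denormalize(people, addresses)
us_over_25 = older_than_in_country(persons, 25, "US")
in_tw = in_country_ignoring_case(persons, "TW")
```

The cost moves to the write path: every address change must also update the copied `country`, which is the "shift processing from reads to writes" trade-off.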


Open For Questions

– The White House's "Open For Questions" application accepted 100K questions and 3.6M votes in under 48 hours


Homework #2

• Implement the paxos algorithm

– Assume that the disk never fails, but processes may die (the TA kills them) and restart a random time later, and messages may randomly be duplicated (the TA sends them again) or lost (OS glitch).

• Start: $ paxos -p port config

– Config file lists ports of all processes (find N from it).

• Interfaces:

– “MASTER”: master location query, return “MASTER port” or “UNKNOWN”.

– “WRITE name value”:

• to master: make value the consensus for name, return “WRITE OK”, “WRITE FAILED”, or “TRY AGAIN”.

• to non-master: return “MASTER port”

– “QUERY name”: return “VALUE name value”, or “MASTER port” if the process died and missed the round for name.
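A possible starting point for the text interface, in plain Python. It covers parsing only and contains no Paxos logic. Two things are assumptions beyond the spec: that `name` contains no spaces, and that `value` is the remainder of the line. The function names are hypothetical.

```python
def parse_command(line):
    """Parse one request line into a tuple: (verb, *arguments)."""
    parts = line.strip().split(" ", 2)
    verb = parts[0]
    if verb == "MASTER" and len(parts) == 1:
        return ("MASTER",)
    if verb == "QUERY" and len(parts) == 2:
        return ("QUERY", parts[1])
    if verb == "WRITE" and len(parts) == 3:
        # Assumption: the value is everything after the name.
        return ("WRITE", parts[1], parts[2])
    raise ValueError("unrecognized command: %r" % line)


def redirect_if_not_master(command, master_port):
    """A non-master answers WRITE by pointing the client at the master."""
    if command[0] == "WRITE":
        return "MASTER %d" % master_port
    return None


write = parse_command("WRITE color deep blue")
query = parse_command("QUERY color")
master = parse_command("MASTER\n")
redirect = redirect_if_not_master(write, 8001)
```

Keeping parsing separate from the consensus code makes it easier to test the protocol state machine with tuples instead of raw strings.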


Homework #2 (cont.)

• Functional specs

– Elect a master among processes before the master can take external values to propose.

– An acceptor waits randomly 0.5 - 1 second before replying to a message.

– Message timeout: configurable with config file, default 10s.

– Every process keeps at least 2 log files for crash recovery: proposer.{port}.log, acceptor.{port}.log, and a value.{port}.log for consensus of names and values.
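Two of the functional specs translate directly into small helpers. This is a sketch in plain Python with hypothetical names. The uniform distribution for the acceptor delay is an assumption (the spec only says "randomly 0.5 - 1 second"), and the config file format is not specified here, so only the default timeout is shown.

```python
import random

# Default from the spec; the real value would come from the config file.
DEFAULT_MESSAGE_TIMEOUT_SECONDS = 10.0


def log_file_names(port):
    """File names required by the spec for the process listening on `port`."""
    return {
        "proposer": "proposer.%d.log" % port,
        "acceptor": "acceptor.%d.log" % port,
        "value": "value.%d.log" % port,
    }


def acceptor_reply_delay(rng=random):
    """Seconds an acceptor waits before replying (assumed uniform in [0.5, 1.0])."""
    return rng.uniform(0.5, 1.0)


names = log_file_names(8001)
delay = acceptor_reply_delay(random.Random(1))
```

Passing a seeded `random.Random` instance makes the delay reproducible in tests, while the default uses the module-level generator.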