Slides from an introduction to Google App Engine presented on the 31st of May 2008 at BarCamp London 4.
I’ve (probably) been using
Google App Engine for a week longer than you have
Simon Willison - http://simonwillison.net/
BarCamp London 4, 31st May 2008
Except you have to re-write your whole application
If you totally rethink the way you use a database
What it can do
• Serve static files
• Serve dynamic requests
• Store data
• Call web services (sort of)
• Authenticate against Google’s user database
• Send e-mail, process images, use memcache
The dev environment
• Download the (open source) SDK
• a full simulation of the App Engine environment
• dev_appserver.py myapp for a local webserver
• appcfg.py update myapp to deploy to the cloud
is really, really nice
Options
• You have to use Python
• You can choose how you use it:
• CGI-style scripts
• WSGI applications
• Google’s webapp framework
• Django (0.96 provided, or install your own)
Hello World
# helloworld.py
print "Content-Type: text/html"
print
print "Hello, world!"

# app.yaml
application: simonwillison-helloworld
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: helloworld.py
With webapp and WSGI
import wsgiref.handlers
from google.appengine.ext import webapp

class MainPage(webapp.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/html'
        self.response.out.write('Hello, webapp World!')

def main():
    application = webapp.WSGIApplication(
        [('/', MainPage)], debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()
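For anyone who has not met WSGI before, the interface underneath is small. The following is a minimal sketch in plain Python 3 using only the standard library, not the App Engine SDK; `hello_app` and `call_app` are illustrative names, not part of any framework.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A bare WSGI application: a callable taking environ and start_response."""
    body = b"Hello, WSGI world!"
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(hello_app))
```

The object returned by webapp.WSGIApplication follows this same calling convention, which is why any WSGI-compatible framework can be used in its place.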
With Django
from django.conf.urls.defaults import *
from django.http import HttpResponse

def hello(request):
    return HttpResponse("Hello, World!")

urlpatterns = patterns('',
    ('^$', hello),
)
(And django_dispatch.py for boilerplate)
• Don't use CGI: it requires reloading for every hit
• Why use Django over webapp?
• Django has easy cookies and custom 500 errors
• Django is less verbose
• Django middleware is really handy
• You can use other WSGI frameworks if you like
Static files
# in app.yaml
handlers:
- url: /css
  static_dir: css
- url: /img
  static_dir: img
- url: /favicon.ico
  static_files: img/favicon.ico
  upload: img/favicon.ico
  mime_type: image/x-icon
The Datastore API
“Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of
data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google
Earth, and Google Finance.”
The App Engine datastore
• Apparently based on Bigtable
• Absolutely not a relational database
• No joins (they do have “reference fields”)
• No aggregate queries - not even count()!
• Hierarchy affects sharding and transactions
• All queries must run against an existing index
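With no count(), one workaround is to keep the count yourself and update it on every write. This is a rough in-memory sketch of that idea in plain Python 3; `FakeDatastore` is a hypothetical stand-in and not the App Engine API.

```python
class FakeDatastore:
    """In-memory stand-in for a key/value datastore (not the App Engine API)."""

    def __init__(self):
        self.entities = {}   # (kind, key_name) -> dict of properties
        self.counters = {}   # kind -> number of stored entities

    def put(self, kind, key_name, **properties):
        key = (kind, key_name)
        if key not in self.entities:
            # Only new entities change the count; overwrites do not.
            self.counters[kind] = self.counters.get(kind, 0) + 1
        self.entities[key] = properties

    def delete(self, kind, key_name):
        # Decrement only if the entity actually existed.
        if self.entities.pop((kind, key_name), None) is not None:
            self.counters[kind] -= 1

    def count(self, kind):
        # A single lookup instead of a scan over every entity.
        return self.counters.get(kind, 0)
```

On the real datastore the counter would presumably need to be updated in the same transaction as the write, which is one reason the entity hierarchy matters.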
Models and entities
• Data is stored as entities
• Entities have properties - key/value pairs
• An entity has a unique key
• Entities live in a hierarchy, and siblings exist in the same entity group - these are actually really important for transactions and performance
• A model is kind of like a class; it lets you define a type of entity
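One way to picture the hierarchy: treat a key as a path from the root entity down to the entity itself, so the entity group is simply whatever sits at the root of that path. The sketch below is a plain Python 3 model of that idea, not the SDK's key class; `make_key`, `entity_group` and `same_group` are made-up helper names.

```python
def make_key(kind, name, parent=None):
    """Build a key as a path of (kind, name) pairs, root first."""
    return (parent or ()) + ((kind, name),)


def entity_group(key):
    """The entity group is identified by the root of the key's path."""
    return key[0]


def same_group(*keys):
    """True if all keys share a root, i.e. belong to one entity group."""
    return len({entity_group(k) for k in keys}) == 1


if __name__ == "__main__":
    account = make_key("Account", "simon")
    firefox = make_key("Browser", "firefox", parent=account)
    print(firefox, same_group(account, firefox))
```

Two Browser entities created with the same Account as parent share a group under this model; two separate Account entities do not.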
AppEngine Models
from google.appengine.ext import db

class Account(db.Model):
    slug = db.StringProperty(required=True)
    owner = db.UserProperty()
    onlyme = db.BooleanProperty()
    referrers = db.StringListProperty()

(There is a ReferenceProperty, but I haven’t used it yet)
Inserting data
account = Account(
    key_name = slug,
    slug = slug,
    referrers = ['...', '...'],
    onlyme = False,
    owner = users.get_current_user()
)
db.put(account) # Or account.put()

Browser.get_or_insert(key_name, parent = account, slug = browser_slug)
Running queries
Browser.all().ancestor(account)
Account.gql("WHERE slug = :1", slug)
Story.all().filter( 'title =', 'Foo').order('-date')
BUT...
• All queries must run against an existing index
• Filtering or sorting on a property requires that the property exists
• Inequality filters are allowed on one property only
• Properties in inequality filters must be sorted before other sort orders
• ... and various other rules
• Thankfully the dev server creates most indexes for you automatically based on usage
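Two of these rules are mechanical enough to check in code. Here is a small sketch in plain Python 3 that validates a query description against the "one inequality property" and "inequality property sorts first" rules. The tuple format, the `check_query` name and the set of operators treated as inequalities are assumptions made for illustration; the real datastore does its own validation.

```python
INEQUALITY_OPS = {"<", "<=", ">", ">="}  # assumed set of inequality operators


def check_query(filters, orders):
    """Return a list of rule violations for a query.

    filters: list of (property, operator, value) tuples
    orders:  list of property names, with a leading '-' for descending
    """
    problems = []
    inequality_props = {p for p, op, _ in filters if op in INEQUALITY_OPS}
    if len(inequality_props) > 1:
        problems.append("inequality filters on more than one property: %s"
                        % ", ".join(sorted(inequality_props)))
    if len(inequality_props) == 1 and orders:
        prop = next(iter(inequality_props))
        # The inequality property has to come before any other sort order.
        if orders[0].lstrip("-") != prop:
            problems.append("%s must be the first sort order" % prop)
    return problems


if __name__ == "__main__":
    print(check_query([("date", ">", 1)], ["title", "-date"]))
```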
How indexes are used
1. The datastore identifies the index that corresponds with the query’s kind, filter properties, filter operators, and sort orders.
2. The datastore starts scanning the index at the first entity that meets all of the filter conditions using the query’s filter values.
3. The datastore continues to scan the index, returning each entity, until it finds the next entity that does not meet the filter conditions, or until it reaches the end of the index.
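The scan described above can be mimicked with a sorted list and a binary search. This is a toy model in plain Python 3 covering a single equality filter on one property; `build_index` and `scan_equal` are illustrative names, and the real index layout is certainly more involved.

```python
import bisect


def build_index(entities, prop):
    """Index rows are (property value, key) pairs kept in sorted order."""
    return sorted((e[prop], key) for key, e in entities.items())


def scan_equal(index, value):
    """Jump to the first matching row, then read until a row stops matching."""
    # (value,) sorts before every (value, key) row, so this finds the first
    # row whose property value is >= the one we are looking for.
    start = bisect.bisect_left(index, (value,))
    results = []
    for i in range(start, len(index)):
        row_value, key = index[i]
        if row_value != value:
            break  # rows are sorted, so nothing later can match
        results.append(key)
    return results


if __name__ == "__main__":
    stories = {"k1": {"title": "Foo"}, "k2": {"title": "Bar"},
               "k3": {"title": "Foo"}, "k4": {"title": "Zed"}}
    print(scan_equal(build_index(stories, "title"), "Foo"))
```

Because matching rows are always contiguous in the index, the cost of the query depends on the number of results rather than the total number of entities, which also explains why a query without a suitable index cannot run at all.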
Further limitations
• If you create a new index and push it live, you have to wait for it to be rebuilt
• This can take hours, and apparently can go wrong
• You can’t safely grab more than about 500 records at once - App Engine times out
• You can’t delete in bulk
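The usual way around the fetch limit and the lack of bulk delete is to work in small batches. Below is a generic sketch in plain Python 3; `process_in_batches` and its callback arguments are hypothetical, and the batch size of 100 is an arbitrary choice well under the roughly 500-record ceiling mentioned above.

```python
def process_in_batches(fetch_batch, handle_batch, batch_size=100):
    """Repeatedly fetch up to batch_size items and hand them to handle_batch.

    fetch_batch(limit) must return a list of at most `limit` items that have
    not been handled yet; an empty list means there is nothing left.
    Returns the total number of items handled.
    """
    total = 0
    while True:
        batch = fetch_batch(batch_size)
        if not batch:
            return total
        handle_batch(batch)
        total += len(batch)


if __name__ == "__main__":
    records = list(range(250))  # stand-in for stored entities

    def fetch(limit):
        return records[:limit]

    def delete(batch):
        del records[:len(batch)]

    print(process_in_batches(fetch, delete))
```

On App Engine itself the whole loop would probably not fit inside one request before the timeout, so each batch may need its own request.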
Other random notes
• You have to use the URL Fetch API to do HTTP requests (e.g. for web services) - and it times out aggressively at about 5 seconds
• The Google accounts Users API is ridiculously easy to use, but...
• no permanent unique identifier; if the user changes their e-mail address you’re screwed
• The new image and memcache APIs are neat
Final thoughts
• It’s really nice not to have to worry about hosting
• But... the lack of aggregate queries and ad-hoc queries really hurts
• Perfect for small projects you don’t want to worry about and big things which you’re sure will have to scale
• Pricing is comparable to S3 - i.e. stupidly cheap
Pricing
• $0.10 - $0.12 per CPU core-hour
• $0.15 - $0.18 per GB-month of storage
• $0.11 - $0.13 per GB outgoing bandwidth
• $0.09 - $0.11 per GB incoming bandwidth
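To get a feel for what those ranges add up to, here is a small calculator in plain Python 3. The prices are the ranges listed above; the dictionary keys, the `estimate_cost` name and the usage figures in the example are made up for illustration.

```python
# Price ranges in US dollars, copied from the list above: (low, high).
PRICES = {
    "cpu_core_hours": (0.10, 0.12),
    "storage_gb_months": (0.15, 0.18),
    "outgoing_gb": (0.11, 0.13),
    "incoming_gb": (0.09, 0.11),
}


def estimate_cost(usage):
    """Return (low, high) cost in dollars for a dict of resource usage."""
    low = sum(PRICES[name][0] * amount for name, amount in usage.items())
    high = sum(PRICES[name][1] * amount for name, amount in usage.items())
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    # Hypothetical month: 100 core-hours, 10 GB stored, 50 GB out, 5 GB in.
    print(estimate_cost({
        "cpu_core_hours": 100,
        "storage_gb_months": 10,
        "outgoing_gb": 50,
        "incoming_gb": 5,
    }))
```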
Thank you!