A Quick Introduction to Google's Cloud Technologies
Chris Schalk, Developer Advocate, @cschalk

Quick Intro to Google Cloud Technologies

DESCRIPTION

This is the "Lightning Presentation" given at DreamForce 2011 on Google's Cloud Technologies. It covers App Engine, Google Storage, the Prediction API and BigQuery. #df11

Page 1: Quick Intro to Google Cloud Technologies

A Quick Introduction to Google's Cloud Technologies
Chris Schalk
Developer Advocate

@cschalk

Page 2: Quick Intro to Google Cloud Technologies

Agenda

● Introduction

● Introduction to Google's Cloud Technologies

● App Engine Review

● Google's new Cloud Technologies
  ○ Google Storage
  ○ Prediction API
  ○ BigQuery

● Summary Q&A

Page 3: Quick Intro to Google Cloud Technologies

Google's Cloud Technologies

Google BigQuery

Google Prediction API

Google Storage

Google App Engine

Page 4: Quick Intro to Google Cloud Technologies

A quick look at App Engine...

Google App Engine

Page 5: Quick Intro to Google Cloud Technologies

Cloud Development in a Box

● Downloadable SDK
● Application runtimes
  ○ Java, Python
● Local development tools
  ○ Eclipse plugin, AppEngine Launcher
● Specialized application services
● Cloud-based dashboard
● Ready to scale
● Built-in fault tolerance, load balancing

Page 6: Quick Intro to Google Cloud Technologies

Quick App Engine Demo!

Page 7: Quick Intro to Google Cloud Technologies

Specialized Services

● Blobstore
● Images
● Mail
● XMPP
● Task Queue
● Memcache
● Datastore
● URL Fetch
● User Service

But, is that it?

Page 8: Quick Intro to Google Cloud Technologies

No!

Now App Engine has access to all the new Google Cloud Services...

Page 9: Quick Intro to Google Cloud Technologies

Google's new Cloud Technologies

Page 10: Quick Intro to Google Cloud Technologies

New Google Cloud Technologies

● Google Storage
  ○ Store your data in Google's cloud
● Prediction API
  ○ Google's machine learning tech in an API
● BigQuery
  ○ High-speed data analysis on massive scale

● ...

Page 11: Quick Intro to Google Cloud Technologies

Google Storage for Developers
Store your data in Google's cloud

Page 12: Quick Intro to Google Cloud Technologies

What Is Google Storage?

● Store your data in Google's cloud
  ○ any format, any amount, any time
● You control access to your data
  ○ private, shared, or public

● Access via Google APIs or 3rd party tools/libraries

Page 13: Quick Intro to Google Cloud Technologies

Sample Use Cases

● Static content hosting: e.g. static HTML, images, music, video
● Backup and recovery: e.g. personal data, business records
● Sharing: e.g. share data with your customers
● Data storage for applications: e.g. used as storage backend for Android, App Engine, cloud-based apps
● Storage for computation: e.g. BigQuery, Prediction API

Page 14: Quick Intro to Google Cloud Technologies

Google Storage Benefits

● High Performance and Scalability: backed by Google infrastructure
● Strong Security and Privacy: control access to your data
● Easy to Use: get started fast with Google & 3rd party tools

Page 15: Quick Intro to Google Cloud Technologies

Google Storage Technical Details

● RESTful API
  ○ Verbs: GET, PUT, POST, HEAD, DELETE
  ○ Resources: identified by URI
  ○ Compatible with S3
● Buckets
  ○ Flat containers
● Objects
  ○ Any type
  ○ Size: 100 GB / object
● Access Control for Google Accounts
  ○ For individuals and groups
● Two Ways to Authenticate Requests
  ○ Sign request using access keys
  ○ Web browser login
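To make "flat containers" concrete, here is a small in-memory sketch. It is not the Google Storage API: the `Bucket` class, its methods, and the reading of 100 GB as 100 * 2**30 bytes are all assumptions made for illustration. It only shows that a bucket maps object names directly to data, and that "folders" are nothing more than a shared name prefix.

```python
class Bucket:
    """Toy in-memory stand-in for a flat bucket (hypothetical, not the real API)."""

    # Per-object limit quoted on the slide, assumed here to mean 100 * 2**30 bytes.
    MAX_OBJECT_BYTES = 100 * 2**30

    def __init__(self, name):
        self.name = name
        self._objects = {}  # object name -> bytes; there are no nested directories

    def put(self, object_name, data):
        if len(data) > self.MAX_OBJECT_BYTES:
            raise ValueError("object exceeds the per-object size limit")
        self._objects[object_name] = data

    def get(self, object_name):
        return self._objects[object_name]

    def list(self, prefix=""):
        # "Folders" are only a naming convention: listing filters on a name prefix.
        return sorted(n for n in self._objects if n.startswith(prefix))

    def uri(self, object_name):
        # gs://bucket/object is the form used by the gsutil example in this deck.
        return "gs://%s/%s" % (self.name, object_name)


bucket = Bucket("yourbucket")
bucket.put("photos/2011/a.jpg", b"\xff\xd8")
bucket.put("photos/2011/b.jpg", b"\xff\xd8")
bucket.put("notes.txt", b"hello")
print(bucket.list("photos/"))
print(bucket.uri("notes.txt"))
```

Listing with the prefix "photos/" returns both image names even though no "photos" directory was ever created, which is the practical meaning of a flat namespace.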

Page 16: Quick Intro to Google Cloud Technologies

Demo

● Tools:
  ○ GSUtil
  ○ GS Manager

● Upload / Download

Page 17: Quick Intro to Google Cloud Technologies

Google Storage usage within Google

● Haiti Relief Imagery
● USPTO data
● Partner Reporting
● Google BigQuery
● Google Prediction API

Page 18: Quick Intro to Google Cloud Technologies

Some Early Google Storage Adopters

Page 19: Quick Intro to Google Cloud Technologies

Google Storage - Pricing

○ Free trial quota until Dec 31, 2011
  ■ For first project
  ■ 5 GB of storage
  ■ 25 GB download/upload data
    ■ 20 GB to Americas/EMEA, 5 GB APAC
  ■ 25K GET, HEAD requests
  ■ 2.5K PUT, POST, LIST* requests

○ Production Storage
  ■ $0.17/GB/Month (Location US, EU)
  ■ Upload - $0.10/GB
  ■ Download
    ■ $0.15/GB Americas / EMEA
    ■ $0.30/GB APAC
  ■ Requests
    ■ PUT, POST, LIST - $0.01 / 1000 Requests
    ■ GET, HEAD - $0.01 / 10,000 Requests
  ■ Up to 99.9% SLA
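As a worked example of the production prices above, the following sketch estimates one month's bill. The prices are the ones on the slide; the function name, the region keys, and the sample workload (100 GB stored, 10 GB uploaded, 50 GB downloaded to Americas/EMEA, 20,000 write requests, 500,000 read requests) are made up for illustration, and the free trial quota is ignored.

```python
# Production prices as listed on the slide (USD, 2011).
STORAGE_PER_GB_MONTH = 0.17
UPLOAD_PER_GB = 0.10
DOWNLOAD_PER_GB = {"americas_emea": 0.15, "apac": 0.30}
WRITE_REQ_PER_1000 = 0.01   # PUT, POST, LIST
READ_REQ_PER_10000 = 0.01   # GET, HEAD


def monthly_cost(stored_gb, upload_gb, download_gb, region,
                 write_requests, read_requests):
    """Sum the five charge components for one month (hypothetical helper)."""
    return (
        stored_gb * STORAGE_PER_GB_MONTH
        + upload_gb * UPLOAD_PER_GB
        + download_gb * DOWNLOAD_PER_GB[region]
        + write_requests / 1000 * WRITE_REQ_PER_1000
        + read_requests / 10000 * READ_REQ_PER_10000
    )


estimate = monthly_cost(100, 10, 50, "americas_emea", 20000, 500000)
print("$%.2f" % estimate)
```

For this workload the components are 17.00 + 1.00 + 7.50 + 0.20 + 0.50, so the estimate comes to $26.20.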

Page 20: Quick Intro to Google Cloud Technologies

Google Storage Summary

● Store any kind of data using Google's cloud infrastructure

● Easy to Use APIs

● Many available tools and libraries
  ○ gsutil, GS Manager
  ○ 3rd party:
    ■ Boto, CloudBerry, CyberDuck, JetS3t, and more

Page 21: Quick Intro to Google Cloud Technologies

Google Prediction API
Google's prediction engine in the cloud

Page 22: Quick Intro to Google Cloud Technologies

Google Prediction API as a simple example

Predicts outcomes based on 'learned' patterns

Page 23: Quick Intro to Google Cloud Technologies

How does it work?

"english" The quick brown fox jumped over the lazy dog.

"english" To err is human, but to really foul things up you need a computer.

"spanish" No hay mal que por bien no venga.

"spanish" La tercera es la vencida.

? To be or not to be, that is the question.

? La fe mueve montañas.

The Prediction API finds relevant features in the sample data during training.

The Prediction API later searches for those features during prediction.
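The train-then-predict idea can be illustrated with a deliberately naive classifier. This is not the algorithm the Prediction API uses (the deck does not describe it); it is a word-overlap toy, with hypothetical function names, that treats each word seen in training as a "feature" and picks the label whose features appear most often in the new text.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercased word tokens; \w also matches accented letters in Python 3.
    return re.findall(r"\w+", text.lower())


def train(examples):
    """Collect the set of words ("features") seen for each label."""
    vocab = defaultdict(set)
    for label, text in examples:
        vocab[label].update(tokenize(text))
    return vocab


def predict(vocab, text):
    """Return the label whose training words occur most often in the text."""
    tokens = tokenize(text)
    scores = {label: sum(t in words for t in tokens)
              for label, words in vocab.items()}
    return max(scores, key=scores.get)


training_data = [
    ("english", "The quick brown fox jumped over the lazy dog."),
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
    ("spanish", "La tercera es la vencida."),
]
model = train(training_data)
print(predict(model, "To be or not to be, that is the question."))  # english
print(predict(model, "La fe mueve montañas."))                      # spanish
```

The first query shares "to", "is" and "the" with the English samples and nothing with the Spanish ones; the second shares only "la" with the Spanish samples.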

Page 24: Quick Intro to Google Cloud Technologies

A virtually endless number of applications...

● Customer Sentiment
● Transaction Risk
● Species Identification
● Message Routing
● Legal Docket Classification
● Suspicious Activity
● Work Roster Assignment
● Recommend Products
● Political Bias
● Uplift Marketing
● Email Filtering
● Diagnostics
● Inappropriate Content
● Career Counselling
● Churn Prediction

... and many more ...

Page 25: Quick Intro to Google Cloud Technologies

Using the Prediction API

A simple three step process...

1. Upload
  Upload your training data to Google Storage

2. Train
  Build a model from your data

3. Predict
  Make new predictions

Page 26: Quick Intro to Google Cloud Technologies

Step 1: Upload
Upload your training data to Google Storage

● Training data: outputs and input features
● Data format: comma separated value format (CSV)

"english","To err is human, but to really ..."
"spanish","No hay mal que por bien no venga."
...

Upload to Google Storage:
gsutil cp ${data} gs://yourbucket/${data}
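A training file in the format shown above (label first, then the text feature, every field quoted) can be produced with Python's standard `csv` module. This is only a sketch: the two rows are the sample sentences from this deck, and the variable names are placeholders.

```python
import csv
import io

rows = [
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
]

# QUOTE_ALL matches the slide's format, where both the label and the text are quoted.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)
training_csv = buffer.getvalue()
print(training_csv)
```

Writing `training_csv` to a local file gives something that could then be copied to a bucket with the `gsutil cp` command shown on the slide.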

Page 27: Quick Intro to Google Cloud Technologies

Step 2: Train
Create a new model by training on data

To train a model:

POST prediction/v1.3/training
{"id":"mybucket/mydata"}

Training runs asynchronously. To see if it has finished:

GET prediction/v1.3/training/mybucket%2Fmydata

{"kind": "prediction#training",...,"training status": "DONE"}
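The two requests above differ in how the model id is written: as plain `bucket/file` in the POST body, and with the slash percent-encoded (`%2F`) in the GET path. A small sketch of building both and of checking a status reply follows; the helper names are hypothetical, the status key is copied from the slide as written, and no network call is made.

```python
import json
from urllib.parse import quote

API_PREFIX = "prediction/v1.3/training"  # path prefix as shown on the slide


def training_request(bucket, data_file):
    """Return the POST body and the status path for a model (hypothetical helper)."""
    model_id = "%s/%s" % (bucket, data_file)
    body = json.dumps({"id": model_id})
    # safe="" forces the "/" inside the model id to be encoded as %2F.
    status_path = "%s/%s" % (API_PREFIX, quote(model_id, safe=""))
    return body, status_path


def is_done(status_reply):
    # Key name taken verbatim from the slide's sample reply.
    return json.loads(status_reply).get("training status") == "DONE"


body, status_path = training_request("mybucket", "mydata")
print(body)         # {"id": "mybucket/mydata"}
print(status_path)  # prediction/v1.3/training/mybucket%2Fmydata
```

Because training is asynchronous, a client would poll the status path and call something like `is_done` on each reply until it returns true.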

Page 28: Quick Intro to Google Cloud Technologies

Step 3: Predict
Apply the trained model to make predictions on new data

POST prediction/v1.3/training/mybucket%2Fmydata/predict
{ "data": { "input": { "text": [ "J'aime X! C'est le meilleur" ] } } }

Page 29: Quick Intro to Google Cloud Technologies

Step 3: Predict
Apply the trained model to make predictions on new data

POST prediction/v1.3/training/bucket%2Fdata/predict

{ "data": { "input": { "text": [ "J'aime X! C'est le meilleur" ] } } }

{ "data": {
    "kind": "prediction#output",
    "outputLabel": "French",
    "outputMulti": [
      {"label": "French", "score": x.xx},
      {"label": "English", "score": x.xx},
      {"label": "Spanish", "score": x.xx}
    ]
} }
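A reply of the shape shown above can be turned into a ranked list of labels with a few lines of Python. The slide leaves the scores as `x.xx`, so the numbers below are invented placeholders, and `ranked_labels` is a hypothetical helper rather than part of any client library.

```python
import json

# Same structure as the slide's reply; the scores are made-up placeholders.
reply = json.loads("""
{"data": {"kind": "prediction#output",
          "outputLabel": "French",
          "outputMulti": [{"label": "English", "score": 0.15},
                          {"label": "French", "score": 0.80},
                          {"label": "Spanish", "score": 0.05}]}}
""")


def ranked_labels(reply):
    """Return (label, score) pairs sorted from highest to lowest score."""
    multi = reply["data"]["outputMulti"]
    ordered = sorted(multi, key=lambda entry: entry["score"], reverse=True)
    return [(entry["label"], entry["score"]) for entry in ordered]


best_label, best_score = ranked_labels(reply)[0]
print(best_label, best_score)
```

With these placeholder scores the top-ranked label agrees with `outputLabel`, which is the single best guess the reply already carries.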

Page 30: Quick Intro to Google Cloud Technologies

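Step 3: Predict
Apply the trained model to make predictions on new data

import http.client
import json

header = {"Content-Type": "application/json"}
# Put the new data in JSON format in the params variable,
# e.g. the request body from the previous slide.
params = json.dumps({"data": {"input": {"text": ["J'aime X! C'est le meilleur"]}}})
conn = http.client.HTTPConnection("www.googleapis.com")
conn.request("POST", "/prediction/v1.3/training/bucket%2Fdata/predict", params, header)
# Read the response body rather than printing the response object itself.
print(conn.getresponse().read())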

Page 31: Quick Intro to Google Cloud Technologies

Google Prediction API Demo!

● Command line Demos
  ○ Training a model
  ○ Checking training status
  ○ Making predictions

● A complete Web application using the JavaScript API for Prediction

Page 32: Quick Intro to Google Cloud Technologies

Prediction API Capabilities

Data
● Input Features: numeric or unstructured text
● Output: up to hundreds of discrete categories

Training
● Many machine learning techniques
● Automatically selected
● Performed asynchronously

Access from many platforms:
● Web app from Google App Engine
● Apps Script (e.g. from Google Spreadsheet)
● Desktop app

Page 33: Quick Intro to Google Cloud Technologies

Prediction API - key features

● Multi-category prediction
  ○ Tag entry with multiple labels
● Continuous Output
  ○ Finer grained prediction rankings based on multiple labels
● Mixed Inputs
  ○ Both numeric and text inputs are now supported

Can combine continuous output with mixed inputs

Page 34: Quick Intro to Google Cloud Technologies

Google BigQuery
Interactive analysis of large datasets in Google's cloud

Page 35: Quick Intro to Google Cloud Technologies

Introducing Google BigQuery

● Google's large data ad hoc analysis technology
  ○ Analyze massive amounts of data in seconds
● Simple SQL-like query language
● Flexible access
  ○ REST APIs, JSON-RPC, Google Apps Script

Page 36: Quick Intro to Google Cloud Technologies

Why BigQuery?

Working with large data is a challenge

Page 37: Quick Intro to Google Cloud Technologies

Many Use Cases ...

Spam

Trends Detection

Web Dashboards

Network Optimization

Interactive Tools

Page 38: Quick Intro to Google Cloud Technologies

Key Capabilities of BigQuery

● Scalable: Billions of rows

● Fast: Response in seconds

● Simple: Queries in SQL

● Web Service
  ○ REST
  ○ JSON-RPC
  ○ Google Apps Script

Page 39: Quick Intro to Google Cloud Technologies

Using BigQuery

Another simple three step process...

1. Upload
  Upload your raw data to Google Storage

2. Import
  Import raw data into BigQuery table

3. Query
  Perform SQL queries on table

Page 40: Quick Intro to Google Cloud Technologies

Writing Queries

Compact subset of SQL
  ○ SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...;

Common functions
  ○ Math, String, Time, ...

Statistical approximations
  ○ TOP
  ○ COUNT DISTINCT
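To show what a query built from those clauses looks like, the sketch below runs one against an in-memory SQLite database. SQLite is only a stand-in so the example can run anywhere; it is not BigQuery, and the `revisions` table, its columns, and its rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for a BigQuery table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (title TEXT, contributor TEXT, num_chars INTEGER)")
conn.executemany("INSERT INTO revisions VALUES (?, ?, ?)", [
    ("Python", "alice", 1200), ("Python", "bob", 300), ("Python", "alice", 50),
    ("SQL", "carol", 900), ("SQL", "dave", 100), ("SQL", "alice", 20),
    ("Go", "bob", 700),
])

# One query using every clause from the slide's SQL subset.
query = """
    SELECT title, COUNT(*) AS edits
    FROM revisions
    WHERE num_chars > 40
    GROUP BY title
    ORDER BY edits DESC
    LIMIT 2;
"""
top_titles = conn.execute(query).fetchall()
print(top_titles)  # [('Python', 3), ('SQL', 2)]
```

The WHERE clause drops the one 20-character revision, GROUP BY counts the remaining revisions per title, and ORDER BY with LIMIT keeps the two most-edited titles.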

Page 41: Quick Intro to Google Cloud Technologies

BigQuery via REST

GET /bigquery/v1/tables/{table name}

GET /bigquery/v1/query?q={query}

Sample JSON Reply:

{
  "results": {
    "fields": [ {"id":"COUNT(*)","type":"uint64"}, ... ],
    "rows": [
      {"f":[{"v":"2949"}, ...]},
      {"f":[{"v":"5387"}, ...]},
      ...
    ]
  }
}

Also supports JSON-RPC
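The query endpoint takes the SQL text in the `q` parameter, so it has to be URL-encoded, and the reply nests each value under `f` and `v`. The sketch below builds such a path and flattens a reply into dictionaries. It assumes `fields` is a plain JSON list of field descriptors, fills the slide's elided entries with a single column, and uses hypothetical helper and table names; no request is actually sent.

```python
import json
from urllib.parse import quote, unquote


def query_path(sql):
    """Build the GET path for a query, percent-encoding the SQL text."""
    return "/bigquery/v1/query?q=" + quote(sql, safe="")


def rows_from_reply(reply_text):
    """Flatten the f/v cell structure into one dict per row, keyed by field id."""
    results = json.loads(reply_text)["results"]
    names = [field["id"] for field in results["fields"]]
    return [dict(zip(names, (cell["v"] for cell in row["f"])))
            for row in results["rows"]]


# Single-column reply modeled on the slide's sample (assumed shape).
sample_reply = json.dumps({
    "results": {
        "fields": [{"id": "COUNT(*)", "type": "uint64"}],
        "rows": [{"f": [{"v": "2949"}]}, {"f": [{"v": "5387"}]}],
    }
})

path = query_path("SELECT COUNT(*) FROM mytable;")
rows = rows_from_reply(sample_reply)
print(path)
print(rows)
```

Note that the values arrive as strings ("2949") even for a `uint64` field, so a client would convert them using the `type` given in `fields`.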

Page 42: Quick Intro to Google Cloud Technologies

Security and Privacy

Standard Google Authentication
● Client Login
● OAuth
● AuthSub

HTTPS support
● protects your credentials
● protects your data

Relies on Google Storage to manage access

Page 43: Quick Intro to Google Cloud Technologies

Large Data Analysis Example

Wikimedia Revision history data from: http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z

Wikimedia Revision History

Page 44: Quick Intro to Google Cloud Technologies

Using BigQuery Shell
Python DB API 2.0 + B. Clapper's sqlcmd
http://www.clapper.org/software/python/sqlcmd/

Page 45: Quick Intro to Google Cloud Technologies

BigQuery from a Spreadsheet

Page 46: Quick Intro to Google Cloud Technologies

BigQuery from a Spreadsheet

Page 47: Quick Intro to Google Cloud Technologies

Recap

● Google App Engine
  ○ Application development platform for the cloud
● Google Storage
  ○ High speed cloud data storage on Google's infrastructure
● Prediction API
  ○ Google's machine learning technology able to predict outcomes based on sample data
● BigQuery
  ○ Interactive analysis of very large data sets
  ○ Simple SQL query language access

Page 48: Quick Intro to Google Cloud Technologies

Further info available at:

● Google App Engine
  ○ http://code.google.com/appengine
● Google Storage for Developers
  ○ http://code.google.com/apis/storage
● Prediction API
  ○ http://code.google.com/apis/predict
● BigQuery
  ○ http://code.google.com/apis/bigquery

Page 49: Quick Intro to Google Cloud Technologies

Thank you!

Questions?
● @cschalk