Getting Started on Google Cloud Platform Aaron Taylor @ataylor0123


Page 1: Getting Started on Google Cloud Platform

Getting Started on Google Cloud Platform

Aaron Taylor

@ataylor0123

Page 2: Getting Started on Google Cloud Platform

access any file in seconds, wherever it is.

www.meta.sc

Page 3: Getting Started on Google Cloud Platform
Page 4: Getting Started on Google Cloud Platform

Folders are outdated

Page 5: Getting Started on Google Cloud Platform

Files are scattered

Page 6: Getting Started on Google Cloud Platform
Page 7: Getting Started on Google Cloud Platform

Talk Roadmap

• What problems we face at Meta

• How we are solving them using GCP

• How you can get started on GCP

Page 8: Getting Started on Google Cloud Platform

Building a product

• No baggage, free to choose whatever stack we want

• Take advantage of latest technologies

• but not quite bleeding edge

Page 9: Getting Started on Google Cloud Platform

Engineering Goals

• This will be a complex product, so it needs to be comprehensible to everyone on our team

• Keep the team as lean as possible

• Focus on product, not sysadmin and dev ops

Page 10: Getting Started on Google Cloud Platform

Language Choices

• Go chosen as our primary language

• Python for NLP and data analysis

• enables easy experimentation, comfortable for data scientists and developers

• Java/Scala interacting with Dataflow, Apache Tika, etc.

Page 11: Getting Started on Google Cloud Platform

Our Hard Problems

• User onboarding load

• Heterogeneous (changing) data sources

• Unpredictable traffic from web hooks

• Compute loads for file content analysis

• Processing streaming data

Page 12: Getting Started on Google Cloud Platform

User Onboarding

• Crawl multiple cloud accounts at once

• Parallel computation

• In-process using Go

• Distributed using tasks

• App Engine Taskqueues
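The "in-process using Go" bullet could look roughly like the sketch below: one goroutine per connected account, joined with a sync.WaitGroup. The Account type and the crawl function are hypothetical placeholders, not Meta's actual code, and the distributed Taskqueues half is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// Account is a hypothetical stand-in for one connected cloud account.
type Account struct {
	Provider string
	Files    []string
}

// crawl is a placeholder for a provider API call; here it just returns the
// files already stored on the Account value.
func crawl(a Account) []string {
	return a.Files
}

// crawlAll crawls every account in its own goroutine and returns the number
// of files found per provider.
func crawlAll(accounts []Account) map[string]int {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		counts = make(map[string]int)
	)
	for _, a := range accounts {
		wg.Add(1)
		go func(a Account) {
			defer wg.Done()
			n := len(crawl(a))
			mu.Lock() // the map is shared between goroutines
			counts[a.Provider] += n
			mu.Unlock()
		}(a)
	}
	wg.Wait()
	return counts
}

func main() {
	counts := crawlAll([]Account{
		{Provider: "drive", Files: []string{"a.doc", "b.pdf"}},
		{Provider: "dropbox", Files: []string{"c.png"}},
	})
	fmt.Println(counts["drive"], counts["dropbox"])
}
```

A slow provider only delays its own goroutine; the others finish independently and the caller waits once at the end.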

Page 13: Getting Started on Google Cloud Platform

Heterogeneous Data

• Remove complexity of third-party services

• Detect changes/breakages in APIs

• Distributed by nature

• Continuous Deployment

• Datastore

• BigQuery

Page 14: Getting Started on Google Cloud Platform

Unpredictable Traffic

• Changes are pushed to us through web hooks

• Dropping changes generally unacceptable

• One user should not negatively impact others

• App Engine autoscaling

• Asynchronous task queues
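A minimal sketch of the "acknowledge fast, process later" idea behind asynchronous task queues. It assumes a buffered channel as an in-process stand-in for a real task queue, and the Change type, the form field names, and the /hook path are made up for illustration. Per-user isolation is not covered here.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// Change is a hypothetical record of one pushed change.
type Change struct {
	UserID string
	FileID string
}

// webhookHandler acknowledges a web hook as soon as the change is queued.
// If the queue is full it answers 503 so the sender can retry, instead of
// the change being silently dropped.
func webhookHandler(queue chan<- Change) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		c := Change{UserID: r.FormValue("user"), FileID: r.FormValue("file")}
		select {
		case queue <- c:
			w.WriteHeader(http.StatusAccepted)
		default:
			http.Error(w, "queue full, retry later", http.StatusServiceUnavailable)
		}
	}
}

// post sends one form-encoded web hook to h in memory and returns the
// response status code.
func post(h http.Handler, user, file string) int {
	form := url.Values{"user": {user}, "file": {file}}
	req := httptest.NewRequest("POST", "/hook", strings.NewReader(form.Encode()))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	queue := make(chan Change, 1) // tiny buffer to make the full case visible
	h := webhookHandler(queue)
	fmt.Println(post(h, "u1", "f1")) // fits in the buffer
	fmt.Println(post(h, "u2", "f2")) // buffer is full, sender is told to retry
	fmt.Println(<-queue)
}
```

The handler never does the heavy work itself, so a burst of hooks costs only a queue insert per request.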

Page 15: Getting Started on Google Cloud Platform

Compute loads

• Rich file content analysis

• Parallel computation

• App Engine Flexible Runtimes

• CPU-based autoscaling

Page 16: Getting Started on Google Cloud Platform

Stream Processing

• Efficient handling of high-volume changes

• Collate events in succession, from multiple users

• Google Cloud Pub/Sub

• Google Cloud Dataflow
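One way to read "collate events in succession" is to keep only the newest change per user and file before doing any expensive work. The sketch below does this in plain Go on an in-memory batch; the Event type and its Seq field are assumptions for illustration, and this is not how Pub/Sub or Dataflow are actually wired up.

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a hypothetical change notification for one file.
type Event struct {
	UserID string
	FileID string
	Seq    int // position in the incoming stream; higher is newer
}

// collate keeps only the newest event per (user, file) pair and returns the
// survivors ordered by Seq.
func collate(events []Event) []Event {
	type key struct{ user, file string }
	latest := make(map[key]Event)
	for _, e := range events {
		k := key{e.UserID, e.FileID}
		if cur, ok := latest[k]; !ok || e.Seq > cur.Seq {
			latest[k] = e
		}
	}
	out := make([]Event, 0, len(latest))
	for _, e := range latest {
		out = append(out, e)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Seq < out[j].Seq })
	return out
}

func main() {
	out := collate([]Event{
		{"alice", "report.doc", 1},
		{"bob", "notes.txt", 2},
		{"alice", "report.doc", 3}, // supersedes alice's first event
	})
	fmt.Println(out)
}
```

Three incoming events become two units of work, which is the saving that matters when a user edits the same file many times in a row.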

Page 17: Getting Started on Google Cloud Platform

How we started off

• App Engine is our entry point

• Service Oriented Architecture

• Currently ~37 different services

• Cloud Datastore is our persistence layer

• BigQuery as a data warehouse

Page 18: Getting Started on Google Cloud Platform

Documentation

• Lots of information for getting started

• Quality resources for our growing team

• Onboarding new developers without GCP experience has been a breeze

• Google is devoting lots of resources to this area

Page 19: Getting Started on Google Cloud Platform

App Engine

• Don’t worry about servers

• Cache, task queues, cron, database, logging, monitoring, and more all built in

• Powerful, configurable autoscaling

• Heavy compute on App Engine Flexible Runtimes

Page 20: Getting Started on Google Cloud Platform

Development Process

• Build, run, and test services locally

• Continuous deployment to a development project

• Incremental releases go to production project

• Logging and monitoring easy to set up
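Testing a service locally can be as small as the sketch below, which runs an HTTP handler in memory with net/http/httptest from the Go standard library. The handler is a trivial hello-world one and the get helper is a name invented here; a real service would be tested the same way, just with more cases.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// handler is a trivial example handler that writes a fixed greeting.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "Hello, world!")
}

// get runs one GET request through h in memory, without opening a port,
// and returns the status code and response body.
func get(h http.HandlerFunc, path string) (int, string) {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := get(handler, "/")
	fmt.Println(code, body)
}
```

Because nothing here touches the network, checks like this can run on a laptop or in a continuous deployment pipeline before anything reaches the development project.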

Page 21: Getting Started on Google Cloud Platform

Problems we faced

• Mantra of “don’t worry about scalability” didn’t take us very far

• Users have lots and lots of files

• Datastore use optimizations

• Cost issues with App Engine

• Trimming auto-scaling parameters

• Migrated heavy compute to Flexible Runtimes

Page 22: Getting Started on Google Cloud Platform

Outside GCP

• Algolia

• Hosts infrastructure for our search indices

• Pusher

• realtime socket connections

• Postmark/Mailchimp

• transactional and campaign-based email

Page 23: Getting Started on Google Cloud Platform

Growth of the platform

• Rapid changes and improvements taking place

• Flexible Runtimes

• Container Engine

• Dataflow

• Investing in a documentation overhaul soon

• Support is generally quite responsive

Page 24: Getting Started on Google Cloud Platform

Recent Developments

• Introduction of Pub/Sub to our system for all event processing

• Experimenting with Kubernetes/Container Engine

• Dataflow stream processing jobs

• Splitting functionality into multiple projects

Page 25: Getting Started on Google Cloud Platform

Quickstart Documentation for Go

How you can start off

Page 26: Getting Started on Google Cloud Platform

Hello World in Go

https://cloud.google.com/appengine/docs/go/quickstart

Page 27: Getting Started on Google Cloud Platform

Server

package hello

import (
	"fmt"
	"net/http"
)

func init() {
	http.HandleFunc("/", handler)
}

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "Hello, world!")
}

hello.go

Page 28: Getting Started on Google Cloud Platform

Configuration

runtime: go
api_version: go1

handlers:
- url: /.*
  script: _go_app

app.yaml

Page 29: Getting Started on Google Cloud Platform

Deploy

appcfg.py update .

Page 30: Getting Started on Google Cloud Platform

Add a Guestbook

https://cloud.google.com/appengine/docs/go/gettingstarted/creating-guestbook

Page 31: Getting Started on Google Cloud Platform

Datastore

type Greeting struct {
	Author  string
	Content string
	Date    time.Time
}

// guestbookKey returns the key used for all guestbook entries.
func guestbookKey(c appengine.Context) *datastore.Key {
	// The string "default_guestbook" here could be varied to have multiple guestbooks.
	return datastore.NewKey(c, "Guestbook", "default_guestbook", 0, nil)
}

func root(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)

	// Ancestor queries, as shown here, are strongly consistent with the High
	// Replication Datastore. Queries that span entity groups are eventually
	// consistent. If we omitted the .Ancestor from this query there would be
	// a slight chance that Greeting that had just been written would not
	// show up in a query.
	q := datastore.NewQuery("Greeting").Ancestor(guestbookKey(c)).Order("-Date").Limit(10)

	greetings := make([]Greeting, 0, 10)
	if _, err := q.GetAll(c, &greetings); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	if err := guestbookTemplate.Execute(w, greetings); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

Page 32: Getting Started on Google Cloud Platform

Templates

var guestbookTemplate = template.Must(template.New("book").Parse(`
<html>
  <head>
    <title>Go Guestbook</title>
  </head>
  <body>
    {{range .}}
      {{with .Author}}
        <p><b>{{.}}</b> wrote:</p>
      {{else}}
        <p>An anonymous person wrote:</p>
      {{end}}
      <pre>{{.Content}}</pre>
    {{end}}
    <form action="/sign" method="post">
      <div><textarea name="content" rows="3" cols="60"></textarea></div>
      <div><input type="submit" value="Sign Guestbook"></div>
    </form>
  </body>
</html>
`))

Page 33: Getting Started on Google Cloud Platform

Forms

func sign(w http.ResponseWriter, r *http.Request) {
	c := appengine.NewContext(r)
	g := Greeting{
		Content: r.FormValue("content"),
		Date:    time.Now(),
	}

	if u := user.Current(c); u != nil {
		g.Author = u.String()
	}
	// We set the same parent key on every Greeting entity to ensure each Greeting
	// is in the same entity group. Queries across the single entity group
	// will be consistent. However, the write rate to a single entity group
	// should be limited to ~1/second.
	key := datastore.NewIncompleteKey(c, "Greeting", guestbookKey(c))
	_, err := datastore.Put(c, key, &g)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	http.Redirect(w, r, "/", http.StatusFound)
}

Page 34: Getting Started on Google Cloud Platform

Conclusions

• Google Cloud Platform has allowed us to build out Meta in ways that wouldn’t otherwise be feasible

• Simplicity of App Engine allows us to focus on product

• Scalability and availability are built into the platform

Page 35: Getting Started on Google Cloud Platform

access any file in seconds, wherever it is.

www.meta.sc/careers

[email protected]