
MongoDB Days UK: Run MongoDB on Google Cloud Platform


Page 1: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Google confidential │ Do not distribute

MongoDB on Google Cloud Platform

Tom Grey, EMEA Head of Cloud Platform Solutions Engineering at Google

Jorge Salamero, Chief Evangelist at Server Density

A Technical Overview

Page 2: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Everybody is talking about cloud...

Google Confidential & Proprietary

Page 3: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Making Google look easy is hard...

Page 4: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Powering google.com...

Google is the fourth largest server manufacturer in the world after Dell, HP, and IBM, according to Martin Reynolds of Gartner Group

Page 5: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Powering google.com...

If Google were an ISP it would be the second largest ISP by traffic on the planet according to Arbor Networks

Page 6: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Powering google.com...

● Customised hardware built from cheap commodity parts
● Software resilience and easy repair, not hardware resilience
● Horizontal layers, not vertical stacks
● Vast numbers of homogeneous servers managed at scale

Page 7: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Externalising Google

Omega

Application Runtimes & Services
● Iterate & deploy fast
● Scale to global demand
● Standards compliant

Data Services
● Data Intelligence
● Designed for Big Data
● High Performance
Technologies: MapReduce, Dremel, Pregel, Percolator

Data Storage and Distribution
● Global Resilient Architecture
● Global Edge Distribution
● Huge Secure Capacity
Technologies: GFS, Colossus, Spanner, BigTable, F1

Global Data Centre & Networks
● Highly Resilient, Efficient & Performant
● 3rd Largest Server Manufacturer
● 2nd Largest Global Data Network

Google Research Publications referenced are available here: http://research.google.com/pubs/papers.html

Google Products etc...
● App Engine, Cloud Endpoints
● BigQuery
● Cloud Storage, Cloud SQL, Cloud Datastore
● Compute Engine

Your Applications

Page 8: MongoDB Days UK: Run MongoDB on Google Cloud Platform

The Google Cloud Platform

● Storage
● Compute
● Ops
● Big Data
● Network

Page 13: MongoDB Days UK: Run MongoDB on Google Cloud Platform

The Google Cloud Platform

● Storage: Google Cloud Storage, Google Cloud SQL, Google Cloud Datastore, Google Cloud Bigtable
● Compute: Google Compute Engine, Google App Engine, Google Container Engine
● Big Data: Google BigQuery, Google Cloud Dataflow, Google Cloud Datalab, Google Cloud Pub/Sub, Google Cloud Dataproc
● Network: Google Cloud Networking
● Ops: Google Cloud Monitoring, Google Cloud Logging

Page 20: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Anatomy of a Compute Engine project

[Diagram labels: Internet; Private Network; CLI, UI, and Code calling the API; Persistent Disk; Cloud Storage; three VMs; all within a Project.]

Page 21: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Region

Regions:
● Geographic location of resources
● Inter-Region Latency > Zone Group Latency

Zones:
● Independent of other Zones
● Isolated within a Region
● Distribute instances across Zones to protect against single zone system failure.
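
As a minimal sketch of the zone advice above, the following assigns instances to zones round-robin and checks that a strict majority of members would remain if any single zone were lost (the property a MongoDB replica set needs to keep a primary). The instance and zone names are illustrative assumptions, not taken from the talk.

```python
from collections import Counter
from itertools import cycle


def spread_across_zones(instances, zones):
    """Assign each instance to a zone, round-robin."""
    return dict(zip(instances, cycle(zones)))


def survives_single_zone_failure(placement):
    """True if a strict majority of members remains after losing any one zone."""
    total = len(placement)
    per_zone = Counter(placement.values())
    return all(total - lost > total / 2 for lost in per_zone.values())


# Illustrative names only (assumptions, not real resources).
members = ["mongo-1", "mongo-2", "mongo-3"]
three_zones = spread_across_zones(members, ["zone-a", "zone-b", "zone-c"])
two_zones = spread_across_zones(members, ["zone-a", "zone-b"])
```

With three members in three zones, losing any zone leaves two of three members; with only two zones, the zone holding two members becomes a single point of failure.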

Page 22: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Machine Types

Page 23: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Google Confidential and Proprietary

MongoDB on Google Compute Engine

+ our experience @ Server Density

Page 25: MongoDB Days UK: Run MongoDB on Google Cloud Platform

cloud infrastructure monitoring

Monitoring as-a-Service

servers & services, 30+ integrations

custom plugins + API

dashboard + alerts

Page 26: MongoDB Days UK: Run MongoDB on Google Cloud Platform

How we use MongoDB?

● Time series database
● Since 2009
● 250+ TB/month
● 6000 writes/sec
● 500,000,000 new documents per day

● Ubuntu 12.04 LTS
● Bare metal servers
● SSD disks
● Puppet Forge MongoDB module

Page 27: MongoDB Days UK: Run MongoDB on Google Cloud Platform

MongoDB on Cloud Servers

Traditional issues with MongoDB on VMs:
● CPU steal from other guests (MongoDB has no high CPU requirements itself)
● Disk IO

Google Compute:
● intelligent throttling
● no more noisy neighbours
● provisioned IOPS
● predefined instance types

Page 28: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Google Compute Engine Disks

● Standard Persistent Disk (storage backed by hard disk drives)
● SSD PD (storage backed by solid state drives)
● Local SSD (not persistent, obviously)

● Standard: sustained performance increases with size + burst for peaks
  100 GB: 30 random read IOPS or 150 random write IOPS (12 MB/s for reads and 9 MB/s for writes)
  10 TB: 3000 random read IOPS or 15000 random write IOPS (180 MB/s for reads and 120 MB/s for writes)

● SSD: IOPS increase faster, throughput the same
  max 10000 random read IOPS at 333 GB
  max 15000 random write IOPS at 500 GB

● Network egress cap: redundancy 3.3 x IOPS (2Gbps / CPU)
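
One reading of the egress-cap bullet above is that each persistent disk write is multiplied by 3.3 for redundancy and counted against the VM's network egress cap of 2 Gbps per vCPU. Under that assumption (the slide does not spell the model out, and other network traffic is ignored here), a rough upper bound on write throughput can be sketched as:

```python
EGRESS_CAP_MBIT_PER_VCPU = 2000  # 2 Gbps per vCPU, figure from the slide
WRITE_REDUNDANCY_FACTOR = 3.3    # figure from the slide


def max_pd_write_mb_per_s(vcpus):
    """Upper bound on persistent disk write throughput in MB/s (assumed model)."""
    egress_mb_per_s = vcpus * EGRESS_CAP_MBIT_PER_VCPU / 8  # Mbit/s -> MB/s
    return egress_mb_per_s / WRITE_REDUNDANCY_FACTOR
```

For a 2-vCPU instance this gives roughly 150 MB/s, so on small instances the network cap, not the disk, may be the first write limit reached.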

Page 29: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Google Compute Engine Disks: example

SATA2 7200RPM: ~75 IOPS / 120 MB/s

IO pattern          Standard PD size required
75 random reads     250 GB
75 random writes    50 GB
120 MB/s reads      1000 GB
120 MB/s writes     1333 GB
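
The sizes in this example can be reproduced from the per-GB rates implied by the 100 GB Standard PD figures quoted earlier (30 read IOPS, 150 write IOPS, 12 MB/s reads, 9 MB/s writes). The sketch below is an illustration under that assumption; the function and dictionary names are hypothetical, and per-volume ceilings are ignored.

```python
from fractions import Fraction
from math import ceil

# Standard PD rates per GB, derived from the 100 GB figures in the talk.
STANDARD_PD_PER_GB = {
    "read_iops": Fraction(30, 100),
    "write_iops": Fraction(150, 100),
    "read_mb_s": Fraction(12, 100),
    "write_mb_s": Fraction(9, 100),
}


def size_for(metric, target):
    """Smallest whole-GB volume that reaches `target` for `metric`."""
    return ceil(Fraction(target) / STANDARD_PD_PER_GB[metric])
```

Rounding up gives 1334 GB for 120 MB/s of writes, where the slide rounds 1333.3 down to 1333 GB.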

Page 30: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Google Compute Engine Disks: local SSD

● 375 GB, up to 4 per VM
● many limitations: no redundancy, no snapshots, create-time only
● NVMe or SCSI

                          Standard PD   SSD PD    Local SSD (NVMe)
Read IOPS per GB          0.3           30        453.3
Write IOPS per GB         1.5           30        240
Read IOPS per instance    3,000         10,000    680,000
Write IOPS per instance   15,000        15,000    360,000

Page 31: MongoDB Days UK: Run MongoDB on Google Cloud Platform

So, remember

In Google Compute, IOPS scale linearly with volume size
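
A minimal sketch of that rule, using the per-GB rates and per-instance ceilings quoted on the earlier disk slides: IOPS grow linearly with volume size until they reach the ceiling. The names below are hypothetical, and the figures are those from the talk, which may not match current Google documentation.

```python
# (IOPS per GB, per-instance ceiling), figures from the talk's disk slides.
DISK_LIMITS = {
    "standard": {"read": (0.3, 3_000), "write": (1.5, 15_000)},
    "ssd": {"read": (30, 10_000), "write": (30, 15_000)},
}


def expected_iops(disk_type, op, size_gb):
    """IOPS scale linearly with size until they hit the ceiling."""
    per_gb, ceiling = DISK_LIMITS[disk_type][op]
    return min(size_gb * per_gb, ceiling)
```

For example, a 200 GB SSD PD works out to 6000 write IOPS, while a 1000 GB SSD PD is held at the 10,000 read IOPS ceiling.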

Page 32: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Dimensioning and tuning

● FS: DISCARD/TRIM, lazy init (Google takes care)

● Low readahead (blockdev)

● IO queue depth:
  1 for each 400-800 IOPS
  max 64
  ↑ depth gives ↑ IOPS but ↑ latency

● 1 CPU for each 2000 read IOPS / 2500 write IOPS
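
These rules of thumb can be turned into a rough calculator. The sketch below is only an illustration: the function names are hypothetical, and treating read and write CPU cost as additive is an assumption the slide does not state.

```python
from fractions import Fraction
from math import ceil

MAX_QUEUE_DEPTH = 64  # maximum quoted on the slide


def queue_depth_range(target_iops):
    """(low, high) queue depth: one unit per 800 or per 400 IOPS, capped at 64."""
    low = max(1, ceil(Fraction(target_iops, 800)))
    high = max(1, ceil(Fraction(target_iops, 400)))
    return min(low, MAX_QUEUE_DEPTH), min(high, MAX_QUEUE_DEPTH)


def vcpus_needed(read_iops, write_iops):
    """vCPUs at 2000 read IOPS or 2500 write IOPS each (assumed additive)."""
    return max(1, ceil(Fraction(read_iops, 2000) + Fraction(write_iops, 2500)))
```

For a target of 6000 write IOPS this suggests a queue depth between 8 and 15 and about 3 vCPUs; treat the result as a starting point for testing rather than a guarantee.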

Page 33: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Recommended configuration

● dbpath on a separate volume

● journal on a different volume
  likely to be a big volume for IOPS, with low usage
  at least 200 GB for ~6000 write IOPS
  cannot use snapshots for backups (fsync lock or shutdown required)

● directoryperdb for each db (locking is managed at database level)
  flexibility, independent performance

Page 34: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Our tests

3 different performance scenarios:

1. no extra disks (default 10GB volume)
2. dedicated dbpath (200GB volume)
3. dedicated dbpath (200GB volume) + dedicated journal (200GB volume)

2 different nodes:

1. n1-standard-2 (2 vCPUs and 7.5GB RAM)
2. n1-highmem-8 (8 vCPUs and 52GB RAM)

Page 35: MongoDB Days UK: Run MongoDB on Google Cloud Platform

The results on n1-standard-2

Page 36: MongoDB Days UK: Run MongoDB on Google Cloud Platform

The results on n1-highmem-8

Page 37: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Conclusions

● Not much difference until you start to acknowledge writes
● Difference between instances is small, real-life test needed

Validated the recommendations, again:
● separate your dbpath
● separate your journal

Full details: https://blog.serverdensity.com/mongodb-on-google-compute-engine-tips-and-benchmarks/

Page 38: MongoDB Days UK: Run MongoDB on Google Cloud Platform

How we use MongoDB on Google Compute?

MongoDB Cloud Manager with backup verification in GC

● real-time offsite backups (only a few seconds behind)
● replica node for each replica set
● copy of every write operation
● sustained traffic 42Mbps
● point-in-time restores

Page 39: MongoDB Days UK: Run MongoDB on Google Cloud Platform

How we use MongoDB on Google Compute?

Off-site backups into GC

● API to trigger restore job: get a tarball
● Downloaded to Google Cloud Storage
● versioning included
● regional buckets for redundancy

USA + EU

Page 40: MongoDB Days UK: Run MongoDB on Google Cloud Platform

How we use MongoDB on Google Compute?

Restore on MongoDB GC

● Launch instance with SSD PD
● gsutil to download the tarball
● untar the backup
● install MongoDB
● COMPARE with PRODUCTION
● whole process: ~10 min, twice a day
● Python, Buildbot, notifications on HipChat
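
The talk does not show the verification script itself. As a hedged sketch of what the download, untar, and compare steps could look like in Python: the bucket and object names are hypothetical, and comparing per-collection document counts with a small tolerance is an assumption about what "compare with production" means, not the speakers' stated method.

```python
import subprocess
import tarfile
from pathlib import Path


def gsutil_download_cmd(bucket, obj, dest):
    """Build the `gsutil cp` command that fetches the backup tarball."""
    return ["gsutil", "cp", f"gs://{bucket}/{obj}", dest]


def untar(tarball, dbpath):
    """Unpack the backup into the directory mongod will use as its dbpath."""
    Path(dbpath).mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball) as tf:
        tf.extractall(dbpath)


def compare_counts(restored, production, tolerance=0.01):
    """Return the collections whose restored count drifts beyond the tolerance."""
    mismatched = []
    for name, prod_count in production.items():
        drift = abs(prod_count - restored.get(name, 0))
        if drift > tolerance * max(prod_count, 1):
            mismatched.append(name)
    return mismatched


def download_and_unpack(bucket, obj, workdir):
    """Run the download and untar steps; requires gsutil on the PATH."""
    tarball = str(Path(workdir) / Path(obj).name)
    subprocess.run(gsutil_download_cmd(bucket, obj, tarball), check=True)
    untar(tarball, str(Path(workdir) / "db"))
```

A tolerance is used because the backup is described as a few seconds behind production, so exact equality of counts would be too strict.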

Page 41: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Thanks!

Page 42: MongoDB Days UK: Run MongoDB on Google Cloud Platform

Q&A

Confidential | Do not distribute