Painless Data Storage with MongoDB & Go

This presentation gives developers an introduction to, and practical experience with, using MongoDB with the Go language. MongoDB Chief Developer Advocate & Gopher Steve Francia presents plainly what you need to know about using MongoDB with Go. As an emerging language, Go is able to start fresh without years of relational database dependencies. Application and library developers can build applications using the excellent mgo MongoDB driver and Go's reliable database/sql package for relational databases. Find out why some people claim Go and MongoDB are a “pair made in heaven” and call mgo “the best database driver they’ve ever used” in this talk by Gustavo Niemeyer, the author of the mgo driver, and Steve Francia, the drivers team lead at MongoDB Inc.

We will cover:

• Connecting to MongoDB in various configurations

• Performing basic operations in mgo

• Marshaling data to and from MongoDB

• Asynchronous & concurrent operations

• Pre-fetching batches for seamless performance

• Using GridFS

• How MongoDB uses mgo internally

Page 2: Painless Data Storage with MongoDB & Go

• Author of Hugo, Cobra, Viper & More

• Chief Developer Advocate for MongoDB

• Gopher

@spf13

Page 6: Painless Data Storage with MongoDB & Go

Go is Friendly

• Feels like a dynamic language in many ways

• Very small core language, easy to remember all of it

• Single binary installation, no dependencies

• Extensive Tooling & StdLib

Page 7: Painless Data Storage with MongoDB & Go

Go is Concurrent

• Concurrency is part of the language

• Any function can become a goroutine

• Goroutines run concurrently, communicate through channels

• Select waits for communication on any of a set of channels
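The bullets above can be illustrated with a minimal sketch using only the standard library. The function names `square`, `sumSquares`, and `firstResult` are hypothetical and chosen for this illustration; they do not come from the talk.

```go
package main

import (
	"fmt"
	"time"
)

// square sends n*n on out. Prefixing a call with `go` runs it as a goroutine.
func square(n int, out chan<- int) {
	out <- n * n
}

// sumSquares starts one goroutine per input and collects the results
// from a shared channel.
func sumSquares(nums []int) int {
	out := make(chan int)
	for _, n := range nums {
		go square(n, out)
	}
	total := 0
	for range nums {
		total += <-out
	}
	return total
}

// firstResult uses select to wait on whichever channel is ready first,
// giving up after the timeout.
func firstResult(a, b <-chan int, timeout time.Duration) (int, bool) {
	select {
	case v := <-a:
		return v, true
	case v := <-b:
		return v, true
	case <-time.After(timeout):
		return 0, false
	}
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3})) // 14

	a := make(chan int, 1)
	b := make(chan int, 1)
	a <- 7
	v, ok := firstResult(a, b, time.Second)
	fmt.Println(v, ok) // 7 true
}
```

The order in which the goroutines send their results is not deterministic, which is why the sketch sums the values rather than printing them one by one.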

Page 9: Painless Data Storage with MongoDB & Go

Why Another Database?

• Databases are slow

• Relational structures don’t fit well with modern programming (ORMs)

• Databases don’t scale well

Page 10: Painless Data Storage with MongoDB & Go

MongoDB is Fast

• Written in C++

• Extensive use of memory-mapped files i.e. read-through write-through memory caching.

• Runs nearly everywhere

• Data serialized as BSON (fast parsing)

• Full support for primary & secondary indexes

• Document model = less work

Page 11: Painless Data Storage with MongoDB & Go

MongoDB is Friendly

• Ad hoc queries

• Real time aggregation

• Rich query capabilities

• Traditionally consistent

• Geospatial features

• Support for most programming languages

• Flexible schema

Page 12: Painless Data Storage with MongoDB & Go

MongoDB is “Web Scale”

• Built-in sharding support distributes data across many nodes

• The mongos router intelligently routes to the correct nodes

• Aggregation done in parallel across nodes

Page 13: Painless Data Storage with MongoDB & Go

Document Database

• Not for .PDF & .DOC files

• A document is essentially an associative array

• Document == JSON object

• Document == PHP Array

• Document == Python Dict

• Document == Ruby Hash

• etc
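In Go, the closest everyday equivalent of these associative arrays is a string-keyed map. The sketch below uses the standard `encoding/json` package (not BSON) to show such a map rendered as a JSON object; the helper name `toJSON` is hypothetical, and the sample values are borrowed from the nesting example later in the deck.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toJSON renders a document (a string-keyed map) as a JSON object.
// encoding/json writes map keys in sorted order.
func toJSON(doc map[string]interface{}) (string, error) {
	b, err := json.Marshal(doc)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	doc := map[string]interface{}{
		"name": "godep",
		"tags": []string{"tool", "dependency"},
	}
	s, err := toJSON(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"name":"godep","tags":["tool","dependency"]}
}
```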

Page 14: Painless Data Storage with MongoDB & Go

Data Serialization

• Applications need persistent data

• The process of translating data structures into a format that can be stored

• Ideal format accessible from many languages

Page 15: Painless Data Storage with MongoDB & Go

BSON

• Inspired by JSON

• Cross language binary serialization format

• Optimized for scanning

• Support for richer types

Page 23: Painless Data Storage with MongoDB & Go

Serializing with BSON

    bob := &Person{
        Name:     "Bob",
        Birthday: time.Now(),
    }

    data, err := bson.Marshal(bob)
    if err != nil {
        return err
    }
    fmt.Printf("Data: %q\n", data)

    var person Person
    err = bson.Unmarshal(data, &person)
    if err != nil {
        return err
    }
    fmt.Printf("Person: %v\n", person)

Output:

    Data: "%\x00\x00\x00\x02name\x00\x04\x00\x00\x00Bob\x00\tbirthday\x00\x80\r\x97|^\x00\x00\x00\x00"

    Person: {Bob 2014-07-21 18:00:00 -0500 EST}
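To make the byte string in that output easier to read, here is a hand-rolled sketch of the BSON layout for this one document shape, written with the standard library only. It is not how the bson package is implemented; `encodePerson` is a hypothetical helper that handles exactly two fields, a string `name` and a datetime `birthday` given in milliseconds since the Unix epoch.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodePerson hand-encodes {name: <string>, birthday: <datetime>} as BSON.
// Writes to a bytes.Buffer cannot fail, so binary.Write errors are ignored.
func encodePerson(name string, birthdayMillis int64) []byte {
	var body bytes.Buffer

	// Element 1: type 0x02 (string), key as a NUL-terminated string,
	// int32 length of the value including its NUL, then the value and NUL.
	body.WriteByte(0x02)
	body.WriteString("name\x00")
	binary.Write(&body, binary.LittleEndian, int32(len(name)+1))
	body.WriteString(name)
	body.WriteByte(0x00)

	// Element 2: type 0x09 (UTC datetime), key, int64 milliseconds.
	body.WriteByte(0x09)
	body.WriteString("birthday\x00")
	binary.Write(&body, binary.LittleEndian, birthdayMillis)

	// Document: int32 total size (counting itself), elements, trailing NUL.
	var doc bytes.Buffer
	binary.Write(&doc, binary.LittleEndian, int32(4+body.Len()+1))
	doc.Write(body.Bytes())
	doc.WriteByte(0x00)
	return doc.Bytes()
}

func main() {
	data := encodePerson("Bob", 0)
	// 37 bytes in total; 37 is 0x25, the ASCII code of the leading "%".
	fmt.Printf("%d bytes: %q\n", len(data), data)
}
```

Read this way, the leading `%` in the slide's output is the document length (37), `\x02` and `\t` (0x09) are the type tags for string and datetime, and the final `\x00` terminates the document.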

Page 24: Painless Data Storage with MongoDB & Go

Equal After Marshaling

Struct

    type Project struct {
        Name       string `bson:"name"`
        ImportPath string `bson:"importPath"`
    }
    project := Project{name, path}

Custom Map

    project := map[string]string{"name": name, "importPath": path}

Document Slice

    project := bson.D{{"name", name}, {"importPath", path}}

Page 25: Painless Data Storage with MongoDB & Go

mgo (mango)

• Pure Go

• Created in late 2010 ("Where do I put my Go data?")

• Adopted by Canonical and MongoDB Inc. itself

• Sponsored by MongoDB Inc. from late 2011

Page 27: Painless Data Storage with MongoDB & Go

Connecting

• Same interface for server, replica set, or shard

• Driver discovers and maintains topology

• Server added/removed, failovers, response times, etc

    session, err := mgo.Dial("localhost")
    if err != nil {
        return err
    }

Page 28: Painless Data Storage with MongoDB & Go

Sessions

• Sessions are lightweight

• Sessions are copied (settings preserved)

• Single management goroutine for all copied sessions

    func (s *Server) handle(w http.ResponseWriter, r *http.Request) {
        session := s.session.Copy()
        defer session.Close()
        // ... handle request ...
    }

Page 29: Painless Data Storage with MongoDB & Go

Convenient Access

• Saves typing

• Uses the same session over and over

    projects := session.DB("OSCON").C("projects")

Page 31: Painless Data Storage with MongoDB & Go

Defining Our Own Type

    type Project struct {
        Name       string `bson:"name,omitempty"`
        ImportPath string `bson:"importPath,omitempty"`
    }

Page 32: Painless Data Storage with MongoDB & Go

Insert

    var projectList = []Project{
        {"gocheck", "gopkg.in/check.v1"},
        {"qml", "gopkg.in/qml.v0"},
        {"pipe", "gopkg.in/pipe.v2"},
        {"yaml", "gopkg.in/yaml.v1"},
    }

    for _, project := range projectList {
        err := projects.Insert(project)
        if err != nil {
            return err
        }
    }
    fmt.Println("Okay!")

Output:

    Okay!

Page 33: Painless Data Storage with MongoDB & Go

Update

    type M map[string]interface{}

    change := M{"$set": Project{ImportPath: "gopkg.in/qml.v1"}}

    err = projects.Update(Project{Name: "qml"}, change)
    if err != nil {
        return err
    }

    fmt.Println("Done!")

Output:

    Done!

Page 35: Painless Data Storage with MongoDB & Go

Find

    var project Project

    err = projects.Find(Project{Name: "qml"}).One(&project)
    if err != nil {
        return err
    }

    fmt.Printf("Project: %v\n", project)

Output:

    Project: {qml gopkg.in/qml.v0}

Page 36: Painless Data Storage with MongoDB & Go

Iterate

    iter := projects.Find(nil).Iter()

    var project Project
    for iter.Next(&project) {
        fmt.Printf("Project: %v\n", project)
    }

    return iter.Err()

Output:

    Project: {gocheck gopkg.in/check.v1}
    Project: {qml gopkg.in/qml.v0}
    Project: {pipe gopkg.in/pipe.v2}
    Project: {yaml gopkg.in/yaml.v1}

Page 37: Painless Data Storage with MongoDB & Go

Nesting

    m := map[string]interface{}{
        "name": "godep",
        "tags": []string{"tool", "dependency"},
        "contact": bson.M{
            "name":  "Keith Rarick",
            "email": "[email protected]",
        },
    }

    err = projects.Insert(m)
    if err != nil {
        return err
    }
    fmt.Println("Okay!")

Output:

    Okay!

Page 38: Painless Data Storage with MongoDB & Go

Nesting II

    type Contact struct {
        Name  string
        Email string
    }

    type Project struct {
        Name    string
        Tags    []string `bson:",omitempty"`
        Contact Contact  `bson:",omitempty"`
    }

    err = projects.Find(Project{Name: "godep"}).One(&project)
    if err != nil {
        return err
    }

    pretty.Println("Project:", project)

Output:

    Project: main.Project{
        Name:    "godep",
        Tags:    {"tool", "dependency"},
        Contact: {Name:"Keith Rarick", Email:"[email protected]"},
    }

Page 39: Painless Data Storage with MongoDB & Go

Indexing

• Compound

• List indexing (think tag lists)

• Geospatial

• Dense or sparse

• Full-text searching

    // Root field
    err = projects.EnsureIndexKey("name")
    ...

    // Nested field
    err = projects.EnsureIndexKey("author.email")
    ...

Page 48: Painless Data Storage with MongoDB & Go

Concurrent

    func f(projects *mgo.Collection, name string, done chan error) {
        var project Project
        err := projects.Find(Project{Name: name}).One(&project)
        if err == nil {
            fmt.Printf("Project: %v\n", project)
        }
        done <- err
    }

    done := make(chan error)

    go f(projects, "qml", done)
    go f(projects, "gocheck", done)

    if err = firstError(2, done); err != nil {
        return err
    }

Output:

    Project: {qml gopkg.in/qml.v1}
    Project: {gocheck gopkg.in/check.v1}
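The slide calls `firstError` without showing it. One plausible shape, offered only as a sketch and not as the speaker's actual helper, is to receive exactly n results and keep the first non-nil error. The `lookup` function below is a hypothetical stand-in for `f` so the example runs without a database.

```go
package main

import (
	"errors"
	"fmt"
)

// firstError receives exactly n results from done and returns the first
// non-nil error it sees. It always drains all n results so that no
// sending goroutine is left blocked on the unbuffered channel.
func firstError(n int, done <-chan error) error {
	var first error
	for i := 0; i < n; i++ {
		if err := <-done; err != nil && first == nil {
			first = err
		}
	}
	return first
}

// lookup is a hypothetical stand-in for the slide's f: it reports an
// error for an empty name and success otherwise.
func lookup(name string, done chan<- error) {
	if name == "" {
		done <- errors.New("empty project name")
		return
	}
	done <- nil
}

func main() {
	done := make(chan error)
	go lookup("qml", done)
	go lookup("gocheck", done)
	fmt.Println(firstError(2, done)) // <nil>
}
```

Returning as soon as the first error arrives would be shorter, but with an unbuffered channel it would leave the remaining goroutines blocked on their send.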

Page 51: Painless Data Storage with MongoDB & Go

Concurrent Loading

• Loads 200 results at a time

• Loads next batch with (0.25 * 200) results left to process

    session.SetBatch(200)
    session.SetPrefetch(0.25)

    for iter.Next(&result) {
        ...
    }
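The arithmetic behind those two settings can be sketched as a small simulation. This is an illustration of the rule stated on the slide, not the driver's code; `prefetchThreshold` and `fetchPoints` are hypothetical names, and the total of 500 results is an assumed figure.

```go
package main

import "fmt"

// prefetchThreshold returns how many cached results remain when the
// next batch is requested.
func prefetchThreshold(batch int, prefetch float64) int {
	return int(prefetch * float64(batch))
}

// fetchPoints simulates iterating over total results and returns, for
// each additional batch request, how many results had been consumed
// when that request was issued.
func fetchPoints(total, batch int, prefetch float64) []int {
	threshold := prefetchThreshold(batch, prefetch)
	loaded := batch // the first batch is loaded up front
	if loaded > total {
		loaded = total
	}
	var points []int
	for consumed := 1; consumed <= total; consumed++ {
		cached := loaded - consumed
		if cached <= threshold && loaded < total {
			points = append(points, consumed)
			loaded += batch
			if loaded > total {
				loaded = total
			}
		}
	}
	return points
}

func main() {
	fmt.Println(prefetchThreshold(200, 0.25)) // 50
	fmt.Println(fetchPoints(500, 200, 0.25))  // [150 350]
}
```

With a batch of 200 and a prefetch of 0.25, the next batch is requested once 50 results are left, so the loop rarely has to wait on the network.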

Page 52: Painless Data Storage with MongoDB & Go

Handler With Session Copy

• Each Copy uses a different connection

• Closing session returns socket to the pool

• defer runs at end of function

    func (s *Server) handle(w http.ResponseWriter, r *http.Request) {
        session := s.session.Copy()
        defer session.Close()

        // ... handle request ...
    }
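The interplay of Copy, Close, and defer can be sketched with a toy pool built on a buffered channel. The `pool` and `session` types here are hypothetical stand-ins invented for this illustration; they do not reflect how mgo manages sockets internally.

```go
package main

import "fmt"

// pool is a hypothetical stand-in for a driver's socket pool.
type pool struct {
	free chan int // ids of sockets that are currently available
}

func newPool(size int) *pool {
	p := &pool{free: make(chan int, size)}
	for id := 1; id <= size; id++ {
		p.free <- id
	}
	return p
}

// session is a hypothetical stand-in for a copied session holding one socket.
type session struct {
	p      *pool
	socket int
}

// Copy takes a socket from the pool, blocking if none is free.
func (p *pool) Copy() *session {
	return &session{p: p, socket: <-p.free}
}

// Close returns the session's socket to the pool.
func (s *session) Close() {
	s.p.free <- s.socket
}

// handle mimics the request handler on the slide. It returns the number
// of sockets still free while the request is in flight.
func handle(p *pool) int {
	s := p.Copy()
	defer s.Close() // runs when handle returns, after the return value is computed
	return len(p.free)
}

func main() {
	p := newPool(2)
	fmt.Println(handle(p))   // 1
	fmt.Println(len(p.free)) // 2
}
```

Because the deferred Close runs on every return path, the socket goes back to the pool even if the handler exits early.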

Page 53: Painless Data Storage with MongoDB & Go

Handler With Single Session

• Shares a single connection

• Still quite efficient thanks to the concurrent capabilities of Go + mgo

    func (s *Server) handle(w http.ResponseWriter, r *http.Request) {
        session := s.session

        // ... handle request ...
    }

Page 55: Painless Data Storage with MongoDB & Go

GridFS

• Not quite a file system

• Really useful for local file storage

• A convention, not a feature

• Supported by all drivers

• Fully replicated, sharded file storage
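The "convention" is essentially that a file is cut into numbered chunks, each stored as an ordinary document alongside a metadata document for the file. The sketch below shows only the chunking step, using the standard library; the `chunk` type, `splitChunks`, `splitString`, and the tiny chunk size are all assumptions made for illustration, not the driver's implementation.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// chunk mirrors the idea of one chunk document: a sequence number and a payload.
type chunk struct {
	N    int
	Data []byte
}

// splitChunks reads r to the end and cuts it into pieces of at most size bytes.
func splitChunks(r io.Reader, size int) ([]chunk, error) {
	if size <= 0 {
		return nil, fmt.Errorf("chunk size must be positive, got %d", size)
	}
	var chunks []chunk
	for n := 0; ; n++ {
		buf := make([]byte, size)
		read, err := io.ReadFull(r, buf)
		if read > 0 {
			chunks = append(chunks, chunk{N: n, Data: buf[:read]})
		}
		// io.EOF means nothing was left; io.ErrUnexpectedEOF means the
		// last chunk was shorter than size. Both end the input normally.
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return chunks, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

// splitString is a convenience wrapper for in-memory input.
func splitString(s string, size int) ([]chunk, error) {
	return splitChunks(strings.NewReader(s), size)
}

func main() {
	chunks, err := splitString("hello gridfs", 5)
	if err != nil {
		panic(err)
	}
	for _, c := range chunks {
		fmt.Printf("%d %q\n", c.N, c.Data)
	}
}
```

Since each chunk is a regular document, the replication and sharding that apply to any collection apply to stored files as well.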

Page 62: Painless Data Storage with MongoDB & Go

GridFS

    gridfs := session.DB("OSCON").GridFS("fs")

    file, err := gridfs.Create("cd.iso")
    if err != nil {
        return err
    }
    defer file.Close()

    started := time.Now()

    _, err = io.Copy(file, iso)
    if err != nil {
        return err
    }

    fmt.Printf("Wrote %d bytes in %s\n", file.Size(), time.Since(started))

Output:

    Wrote 470386961 bytes in 7.0s

Page 64: Painless Data Storage with MongoDB & Go

Features

• Transactions (mgo/txn experiment)

• Aggregation pipelines

• Full-text search

• Geospatial support

• Hadoop

Page 70: Painless Data Storage with MongoDB & Go

Thank You

• @spf13

• Author of Hugo, Cobra, Viper & More

• Chief Developer Advocate for MongoDB

• Gopher