44
Consulting Engineer, 10gen Jason Zucchetto #MongoSF Schema Design

MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

  • Upload
    mongodb

  • View
    1.391

  • Download
    0

Embed Size (px)

DESCRIPTION

MongoDB’s basic unit of storage is a document. Documents can represent rich, schema-free data structures, meaning that we have several viable alternatives to the normalized, relational model. In this talk, we’ll discuss the tradeoff of various data modeling strategies in MongoDB using a library as a sample application. You will learn how to work with documents, evolve your schema, and common schema design patterns.

Citation preview

Page 1: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Consulting Engineer, 10gen

Jason Zucchetto

#MongoSF

Schema Design

Page 2: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Single Table En

Agenda

• Working with documents

• Evolving a Schema

• Queries and Indexes

• Common Patterns

Page 3: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Terminology

RDBMS MongoDB

Database ➜ Database

Table ➜ Collection

Row ➜ Document

Index ➜ Index

Join ➜ Embedded Document

Foreign Key ➜ Reference

Page 4: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Working with Documents

Page 5: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Modeling Data

Page 6: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

DocumentsProvide flexibility and performance

Page 7: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Normalized Data

Page 8: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

De-Normalized (embedded) Data

Page 9: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Relational Schema DesignFocus on data storage

Page 10: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Document Schema DesignFocus on data use

Page 11: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Data Access

• Flexible Schemas

• Ability to embed complex data structures

• Secondary Indexes

• Multi-Key Indexes

• Aggregation Framework– $project, $match, $limit, $skip, $sort, $group,

$unwind

• No Joins

Page 12: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Getting Started

Page 13: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Library Management Application• Patrons

• Books

• Authors

• Publishers

Page 14: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

An ExampleOne to One Relations

Page 15: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

patron = { _id: "joe", name: "Joe Bookreader”}

address = { patron_id = "joe", street: "123 Fake St. ", city: "Faketon", state: "MA", zip: 12345}

Modeling Patrons

patron = { _id: "joe", name: "Joe Bookreader", address: { street: "123 Fake St. ", city: "Faketon", state: "MA", zip: 12345 }}

Page 16: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

One to One Relations

• Mostly the same as the relational approach

• Generally good idea to embed “contains” relationships

• Document model provides a holistic representation of objects

Page 17: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

An ExampleOne To Many Relations

Page 18: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

patron = { _id: "joe", name: "Joe Bookreader", join_date: ISODate("2011-10-15"), addresses: [ {street: "1 Vernon St.", city: "Newton", state: "MA", …}, {street: "52 Main St.", city: "Boston", state: "MA", …} ]}

Modeling Patrons

Page 19: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Publishers and Books

• Publishers put out many books

• Books have one publisher

Page 20: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

MongoDB: The Definitive Guide,By Kristina Chodorow and Mike DirolfPublished: 9/24/2010Pages: 216Language: English

Publisher: O’Reilly Media, CA

Book

Page 21: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher: { name: "O’Reilly Media", founded: "1980", location: "CA" }}

Modeling Books – Embedded Publisher

Page 22: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

publisher = { name: "O’Reilly Media", founded: "1980", location: "CA"}

book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English"}

Modeling Books & Publisher Relationship

Page 23: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

publisher = { _id: "oreilly", name: "O’Reilly Media", founded: "1980", location: "CA"}

book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher_id: "oreilly"}

Publisher _id as a Foreign Key

Page 24: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

publisher = { name: "O’Reilly Media", founded: "1980", location: "CA" books: [ "123456789", ... ]}

book = { _id: "123456789", title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English"}

Book _id as a Foreign Key

Page 25: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Where Do You Put the Foreign Key?

• Array of books inside of publisher– Makes sense when many means a handful of

items– Useful when items have bound on potential

growth

• Reference to single publisher on books– Useful when items have unbounded growth

(unlimited # of books)

• SQL doesn’t give you a choice, no arrays

Page 26: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Another ExampleOne to Many Relations

Page 27: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Books and Patrons

• Book can be checked out by one Patron at a time

• Patrons can check out many books (but not 1000’s)

Page 28: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

patron = { _id: "joe", name: "Joe Bookreader", join_date: ISODate("2011-10-15"), address: { ... }}

book = { _id: "123456789", title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], ...}

Modeling Checkouts

Page 29: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

patron = { _id: "joe", name: "Joe Bookreader", join_date: ISODate("2011-10-15"), address: { ... }, checked_out: [ { _id: "123456789", checked_out: "2012-10-15" }, { _id: "987654321", checked_out: "2012-09-12" }, ... ]}

Modeling Checkouts

Page 30: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

De-normalize for speed

DenormalizationProvides data locality

Page 31: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

patron = { _id: "joe", name: "Joe Bookreader", join_date: ISODate("2011-10-15"), address: { ... }, checked_out: [ { _id: "123456789", title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], checked_out: ISODate("2012-10-15") }, { _id: "987654321" title: "MongoDB: The Scaling Adventure",

... }, ... ]}

Modeling Checkouts: Denormalized

Page 32: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Referencing vs. Embedding• Embedding is a bit like pre-joined data

• Document-level ops are easy for server to handle

• Embed when the 'many' objects always appear with (i.e. viewed in the context of) their parent

• Reference when you need more flexibility

Page 33: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

An ExampleSingle Table Inheritance

Page 34: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), kind: "loanable", locations: [ ... ], pages: 216, language: "English", publisher: { name: "O’Reilly Media", founded: "1980", location: "CA" }}

Single Table Inheritance

Page 35: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

An ExampleMany to Many Relations

Page 36: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { title: "MongoDB: The Definitive Guide", authors = [ { _id: "kchodorow", name: "K-Awesome" }, { _id: "mdirolf", name: "Batman Mike" }, ] published_date: ISODate("2010-09-24"), pages: 216, language: "English"}

author = { _id: "kchodorow", name: "Kristina Chodorow", hometown: "New York"}

Books and Authors

Page 37: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

An ExampleTrees

Page 38: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", category: "MongoDB"}

category = { _id: MongoDB, parent: "Databases" }category = { _id: Databases, parent: "Programming" }

Parent Links

Page 39: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { _id: 123456789, title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English"}

category = { _id: MongoDB, children: [ 123456789, … ] }category = { _id: Databases, children: ["MongoDB", "Postgres"}category = { _id: Programming, children: ["DB", "Languages"] }

Child Links

Page 40: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Modeling Trees

• Parent Links

- Each node is stored as a document

- Contains the id of the parent

• Child Links

- Each node contains the id’s of the children

- Can support graphs (multiple parents / child)

Page 41: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", categories: ["Programming", "Databases", "MongoDB” ]}

book = { title: "MySQL: The Definitive Guide", authors: [ "Michael Kofler" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", parent: "MongoDB", ancestors: [ "Programming", "Databases", "MongoDB"]}

Array of Ancestors

Page 42: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

An ExampleQueues

Page 43: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

book = { _id: 123456789, title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", available: 3}

db.books.findAndModify({ query: { _id: 123456789, available: { "$gt": 0 } }, update: { $inc: { available: -1 } }})

Book Document

Page 44: MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

Consulting Engineer, 10gen

Jason Zucchetto

#MongoSF

Thank You