DESCRIPTION
When you use MongoDB for the first time, the biggest risk is applying the same patterns and designs used in the SQL world. If you do, you miss the real change that moving from SQL to MongoDB requires: a change in the way of thinking.
Except where otherwise noted, this work is licensed under: http://creativecommons.org/licenses/by/3.0/
Application Design for MongoDB
Alessandro Palumbo - apalumbo@byte-code.com
http://it.linkedin.com/in/alessandropalumbo/ http://www.byte-code.com
Alessandro Palumbo - apalumbo@byte-code.com - http://www.byte-code.com
MongoDB
from "humongous" (huge; enormous)
NoSQL
Open source
Document-oriented: JSON-style documents
JSON-style documents
{
  "_id" : "6c85fa4c-fa64-44e2-89c9-e5eb7f306ed7",
  "code" : "CRS0001",
  "name" : "Test",
  "description" : "Test description",
  "active" : true,
  "scheduledDate" : {
    "from" : ISODate("2013-09-12T00:00:00.000Z"),
    "to" : ISODate("2013-10-31T00:00:00.000Z")
  },
  "version" : NumberLong(1)
}
DON'T BE RELATIONAL
No joins
No full transactions
No schema
We can embed
Is it really an issue?
Document-level transactions
DESIGN
Design for query
Embedded data vs references
DBRefs vs manual reference
Dynamic schema vs static languages
Pure driver vs mapping frameworks
Be careful with dates
Split data on multiple collections
FRIENDLY FIRE (aka RTFM)
Write concern
Read preference
Atomic document operations
Avoid natural keys as identifiers
PERFORMANCE
Preallocate fields?
Be aware of the trees
Preprocess high resolution data
Tuning updates and inserts
Document moving slows you
FRIENDLY FIRE
ATOMIC DOCUMENT OPERATIONS
Operations on multiple documents are not atomic
No "all or nothing"
Embedding or application-level transactions can be used to handle the issue
Relational transactions are not totally safe
WRITE CONCERN
"Describes the guarantee that MongoDB provides when reporting on the success of a write operation"
It is set by the client and can be set for each operation
Errors Ignored
Unacknowledged
Acknowledged (*)
Journaled
Replica Acknowledged: > 1, majority, custom using tags
READ PREFERENCE
"It describes how MongoDB clients route read operations to members of a replica set"
primary (*)
primaryPreferred
secondary
secondaryPreferred
nearest
It is set by the client and can be set for each operation
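One way a client can state its write concern and read preference is through connection string options. The sketch below only builds such a string with the Python standard library; the host names and the replica set name `rs0` are made-up placeholders, and the helper `mongodb_uri` is hypothetical. Options given this way act as client-wide defaults; per-operation overrides go through the driver API, which is not shown here.

```python
from urllib.parse import urlencode


def mongodb_uri(hosts, replica_set, **options):
    """Build a MongoDB connection string (hypothetical helper, not a driver API)."""
    query = urlencode({"replicaSet": replica_set, **options})
    return f"mongodb://{','.join(hosts)}/?{query}"


# Placeholder hosts; w, journal and readPreference are connection string options.
uri = mongodb_uri(
    ["db1.example.net:27017", "db2.example.net:27017"],
    "rs0",
    w="majority",                          # replica acknowledged by a majority
    journal="true",                        # wait for the journal
    readPreference="secondaryPreferred",   # read from secondaries when possible
)
```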
AVOID NATURAL KEYS AS IDENTIFIERS
All collections have an index on the _id field that exists by default. If _id is not provided, the driver or the mongod will create an _id field with an ObjectId value.
Add a unique index on the natural key instead: sometimes the application domain can evolve in an unexpected way
Remember that, if sharding is enabled, a unique index must include the shard key as its prefix
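A minimal in-memory sketch of the idea, assuming a UUID string as the surrogate `_id` (as in the sample document earlier) and the `code` field as the natural key. The class `Collection` is a hypothetical stand-in written for illustration, not a MongoDB driver class; it mimics a unique index with a plain dictionary.

```python
import uuid


class Collection:
    """In-memory stand-in for a collection with a unique index on one field."""

    def __init__(self, unique_field):
        self.unique_field = unique_field
        self.docs = {}    # _id -> document
        self.unique = {}  # natural key value -> _id

    def insert(self, doc):
        doc = dict(doc)
        # Surrogate identifier: generated, never derived from business data.
        doc.setdefault("_id", str(uuid.uuid4()))
        key = doc[self.unique_field]
        if key in self.unique:
            raise ValueError(f"duplicate {self.unique_field}: {key}")
        self.docs[doc["_id"]] = doc
        self.unique[key] = doc["_id"]
        return doc["_id"]

    def change_natural_key(self, doc_id, new_key):
        """The natural key can change; the _id (and references to it) stay valid."""
        if new_key in self.unique:
            raise ValueError(f"duplicate {self.unique_field}: {new_key}")
        doc = self.docs[doc_id]
        del self.unique[doc[self.unique_field]]
        doc[self.unique_field] = new_key
        self.unique[new_key] = doc_id
```

If the coding scheme for courses changes later, only the `code` field is rewritten; every document that stored the `_id` keeps pointing at the right course.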
DESIGN
DESIGN FOR QUERY
Document design should serve the queries that will exist in the application
Reference or embed documents: "denormalized" is not always a bad word
Your document design will affect which operations are safe and which are not
EMBEDDED DATA vs REFERENCES
Embedded data models allow applications to store related pieces of information in the same database record
The maximum BSON document size is 16 megabytes, and embedding may lead to performance issues if not used correctly
Usually there is a "contains" relation between the embedding and the embedded object
EMBEDDED DATA vs REFERENCES
Normalized data models describe relationships using references between documents
Referential integrity is not supported: a reference could point to a nonexistent object
References provide more flexibility than embedding, but remember that client applications will have to look up referenced objects with multiple queries
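The two models can be contrasted with plain dictionaries standing in for collections. Everything here is illustrative: the `lessons` and `teacher_id` fields, the sample ids, and the helper `load_course_with_teacher` are assumptions, not part of the original slides.

```python
# Embedded model: the course "contains" its lessons, one read returns everything.
course_embedded = {
    "_id": "c1",
    "code": "CRS0001",
    "lessons": [{"title": "Intro"}, {"title": "Schema design"}],
}

# Normalized model: the course only stores the id of a document kept elsewhere.
teachers = {"t1": {"_id": "t1", "name": "Ada"}}
courses = {
    "c1": {"_id": "c1", "code": "CRS0001", "teacher_id": "t1"},
    "c2": {"_id": "c2", "code": "CRS0002", "teacher_id": "t9"},  # t9 does not exist
}


def load_course_with_teacher(course_id):
    """Resolve the reference on the client side: two lookups instead of one."""
    course = courses[course_id]                    # first query
    teacher = teachers.get(course["teacher_id"])   # second query, may find nothing
    return course, teacher
```

Nothing stops `c2` from pointing at a missing teacher, so the application has to decide what a dangling reference means.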
DBREFS vs MANUAL REFERENCES
DBRefs are a convention for representing a reference to a document: a DBRef holds the collection name, the id, and optionally the database name
Manual references are just fields that hold the id of the related document, without the collection name or the database name
Manual references are suitable for most use cases
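The difference in shape can be sketched as follows. The `$ref`, `$id` and `$db` field names are the DBRef convention; the helper functions, the nested-dictionary "databases", and the sample names are hypothetical and only serve the illustration.

```python
def manual_ref(doc):
    """A manual reference is just the id; the application knows the collection."""
    return doc["_id"]


def dbref(collection, doc, db=None):
    """A DBRef also carries the collection name and, optionally, the database name."""
    ref = {"$ref": collection, "$id": doc["_id"]}
    if db is not None:
        ref["$db"] = db
    return ref


def resolve_dbref(databases, default_db, ref):
    """Look up a DBRef in nested dicts standing in for databases and collections."""
    database = databases[ref.get("$db", default_db)]
    return database[ref["$ref"]].get(ref["$id"])


teacher = {"_id": "t1", "name": "Ada"}
databases = {"school": {"teachers": {"t1": teacher}}}

course_manual = {"_id": "c1", "teacher_id": manual_ref(teacher)}
course_dbref = {"_id": "c1", "teacher": dbref("teachers", teacher)}
```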
BE CAREFUL WITH DATES
BSON Date is a 64-bit signed integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). Negative values represent dates before 1970. The official BSON specification refers to the BSON Date type as the UTC datetime.
Always use BSON Date when a value represents an instant in time, or you will not be able to use date operators on those fields
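The representation described above can be reproduced with the standard library. This is only a sketch of the arithmetic; the function names are made up, and a real driver performs this conversion for you when you pass it a datetime value.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_bson_millis(dt):
    """Milliseconds since the Unix epoch, as a (possibly negative) integer."""
    if dt.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid ambiguity")
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_bson_millis(millis):
    """Inverse conversion, always returning a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)
```

Because the stored value is a number, range comparisons and sorting behave as expected, which is not guaranteed for dates stored as arbitrary strings.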
SPLIT DATA ON MULTIPLE COLLECTIONS
Split data on multiple collections to easily partition your data (a.k.a. multitenancy)
Use collections as namespaces for your data
Remember that once data is partitioned it will be harder to aggregate if needed
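A small sketch of collections used as per-tenant namespaces. The `tenant.collection` naming scheme, the allowed character set, and both helper names are assumptions chosen for the example, not a MongoDB requirement.

```python
import re


def tenant_collection(tenant_id, base_name):
    """Return the collection name holding `base_name` data for one tenant."""
    if not re.fullmatch(r"[a-z0-9_]+", tenant_id):
        raise ValueError(f"unsupported tenant id: {tenant_id!r}")
    return f"{tenant_id}.{base_name}"


def all_tenant_collections(tenant_ids, base_name):
    """Cross-tenant reporting has to visit every tenant's collection in turn."""
    return [tenant_collection(t, base_name) for t in tenant_ids]
```

The second helper shows the cost mentioned above: a query over all tenants becomes one query per collection, merged by the application.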
DYNAMIC SCHEMA vs STATIC LANGUAGES
Why use a dynamic schema if we are not using a dynamic programming language?
Inheritance is not only a matter of hierarchy; it can also be a matter of composition
Composition is the key to introducing a dynamic schema in a static programming language
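One possible reading of "composition" is a class with a fixed, typed core that also carries a free-form bag of extra attributes. The sketch uses Python type annotations to stand in for a statically typed language; the `Course` class, its fields, and the `extra` bag are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

CORE_KEYS = {"_id", "code", "name"}


@dataclass
class Course:
    # Fixed, typed part of the model.
    id: str
    code: str
    name: str
    # Composed part: fields the class does not know about at compile time.
    extra: Dict[str, Any] = field(default_factory=dict)

    def to_document(self) -> Dict[str, Any]:
        # Core fields are written last so they always win over extras.
        return {**self.extra, "_id": self.id, "code": self.code, "name": self.name}

    @classmethod
    def from_document(cls, doc: Dict[str, Any]) -> "Course":
        extra = {k: v for k, v in doc.items() if k not in CORE_KEYS}
        return cls(id=doc["_id"], code=doc["code"], name=doc["name"], extra=extra)
```

Documents with fields added later still load and save without changing the class, while the core fields keep their static types.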
PURE DRIVER vs MAPPING FRAMEWORKS
Using the MongoDB driver directly gives you great power, but forces you to write a lot of boilerplate code
Mapping frameworks help you write less code, but you give up full control over how data is persisted
Why not take the best from both?
PERFORMANCE
BE AWARE OF THE TREES
Indexes in MongoDB are defined at the collection level and can be on any field or sub-field of the document
Indexes are created using a B-tree and can be of different types:
Single Field
Compound
Multikey
Geospatial
Text (beta)
Hashed
They can also be unique and sparse
DOCUMENT MOVING SLOWS YOU
MongoDB handles the space allocation of a record by also considering a padding factor
When an updated document does not fit in its record space, it will be moved
Dynamic schema is the first cause of document moving
PREALLOCATE FIELDS?
Field preallocation can fix document moving issues in some use cases
Default values must be used to preallocate; this must be handled in the application
NULL is not a default value :-) as it has its own type
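A sketch of preallocation handled in the application, assuming a per-day document with one counter per hour. The document layout and both function names are invented for the example; the point is that every field exists from the start with a default of the final type (0, not null), so later updates change values without adding fields.

```python
def preallocated_hourly_counters(day):
    """Create the document with all 24 counters already present and set to 0."""
    return {"_id": day, "hits": {str(hour): 0 for hour in range(24)}}


def record_hit(doc, hour):
    """Update an existing counter in place; no new field is ever added."""
    doc["hits"][str(hour)] += 1
```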
PREPROCESS HIGH RESOLUTION DATA
MongoDB lets you store your data at its maximum resolution
Map-reduce and aggregation are OK, but you could also preprocess and keep aggregated data that you can use for your queries
MongoDB rocks for business intelligence
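Preprocessing can be as simple as rolling raw events up into coarser buckets when they arrive. The sketch below assumes events are `(timestamp, value)` pairs and hourly buckets; both choices, and the function name, are illustrative.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_totals(events):
    """Sum event values per hour, keyed by the start of the hour."""
    totals = defaultdict(int)
    for timestamp, value in events:
        bucket = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[bucket] += value
    return dict(totals)
```

Queries for dashboards can then read the small set of hourly totals, while the full-resolution events remain available when detail is needed.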
TUNING UPDATES AND INSERTS
MongoDB stores BSON documents as a sequence of fields and values, not as a hash table
Writing the first field of a document (or a nested document) is considerably faster than writing the last
Intra-document hierarchy can help to handle the issue
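A simplified model of why hierarchy helps: if a document is a sequence of fields, reaching a field means scanning the ones before it. The sketch counts scanned fields for a flat document with one field per minute of the day versus a nested hour/minute layout. The layouts and function names are assumptions, and the real cost depends on the server version and storage engine.

```python
def steps_to_reach(fields, name):
    """Number of fields examined by a linear scan until `name` is found."""
    for steps, (key, _value) in enumerate(fields, start=1):
        if key == name:
            return steps
    raise KeyError(name)


# Flat layout: 1440 sibling fields, one per minute of the day.
flat = [(str(minute), 0) for minute in range(1440)]

# Nested layout: 24 hour fields, each holding 60 minute fields.
nested = [(str(hour), [(str(minute), 0) for minute in range(60)]) for hour in range(24)]


def steps_flat(minute_of_day):
    return steps_to_reach(flat, str(minute_of_day))


def steps_nested(minute_of_day):
    hour, minute = divmod(minute_of_day, 60)
    outer = steps_to_reach(nested, str(hour))
    inner = steps_to_reach(nested[hour][1], str(minute))
    return outer + inner
```

In this model the last minute of the day costs 1440 steps in the flat layout and 24 + 60 steps in the nested one.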
Any questions?