Build High-Scale Applications with Amazon DynamoDB David Pearson Business Development AWS Database Services

Build High-Scale Applications with Amazon DynamoDBfiles.meetup.com/10649082/Build High-Scale Applications with Amazon... · Amazon DynamoDB Amazon RDS Amazon Amazon ElastiCache S3



Build High-Scale Applications

with Amazon DynamoDB

David Pearson Business Development AWS Database Services


Databases in the Cloud

• data tier planning
• workload prioritization
• data store selection


Traditional Database Architecture

App/Web Tier

Client Tier

RDBMS


• key-value access

• complex queries

• transactions

• analytics

One Database for All Workloads

App/Web Tier

Client Tier

RDBMS


Cloud Data Tier Architecture

App/Web Tier

Client Tier

Data Tier

Search Cache Blob Store

RDBMS NoSQL Data Warehouse


Workload Driven Data Store Selection

Data Tier

Search Cache Blob Store

RDBMS NoSQL Data Warehouse

logging analytics

key/value simple query

rich search

transaction processing

hot reads


AWS Services for the Data Tier

Data Tier

Amazon DynamoDB

Amazon RDS

Amazon ElastiCache

Amazon S3

Amazon Redshift

Amazon CloudSearch

logging analytics

key/value simple query

rich search

transaction processing

hot reads


Simple Guide to Database Selection

Predominant Requirement                                    Recommendation
Seamless scale and super availability                      Amazon DynamoDB
Complex query workloads and need relational capabilities   Amazon RDS
Caching                                                    Amazon ElastiCache
Deep analytics                                             Amazon Redshift
Cases where these services are not the right fit           Build your own on EC2!


DynamoDB Fundamentals

• original use case
• tier one applications


Relational Era @ Amazon.com

RDBMS = Default Choice

• An Amazon.com page is composed of responses from 1000s of independent services
• Query patterns differ from service to service
    Catalog service is usually heavy key-value
    Ordering service is very write intensive (key-value)
    Catalog search has a different pattern for querying

RDBMS: Poor Availability, Limited Scalability, High Cost


Dynamo = NoSQL Technology

• Replicated DHT with consistency management

• Consistent hashing

• Optimistic replication

• “Sloppy quorum”

• Anti-entropy mechanisms

• Object versioning

Distributed Era @ Amazon.com

• lack of strong consistency
• every engineer needs to learn distributed systems
• operational complexity


DynamoDB = NoSQL Cloud Service

Cloud Era @ Amazon.com

Non-Relational

Fast & Predictable Performance

Seamless Scalability

Easy Administration

Tier One Applications


database service = automated operations + predictable performance + durable low latency + cost effective


Built to make life easier for developers

• Developers are freed from:
    Performance tuning (latency)
    Automatic 3-way multi-AZ replication
    Scalability (and scaling operations)
    Security inspections, patches, upgrades
    Software upgrades, patches
    Automatic hardware failover
    Improving the underlying hardware
    …and lots of other stuff


Predictable Performance

Provisioned Throughput

• Request-based capacity provisioning model
• Throughput is declared and updated via the API or the console
    CreateTable (foo, reads/sec = 100, writes/sec = 150)
    UpdateTable (foo, reads/sec = 10000, writes/sec = 4500)
• DynamoDB handles the rest
    Capacity is reserved and available when needed
    Scaling up triggers repartitioning and reallocation
    No impact to performance or availability
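
The CreateTable and UpdateTable calls above are slide shorthand. As a rough sketch, the requests might look like the following when written as plain Python dicts in the shape the DynamoDB API accepts. The table name and throughput figures come from the slide; the key attribute name `Id` and both helper functions are assumptions made for illustration, and nothing here is sent to AWS.

```python
def create_table_params(table_name, hash_key, reads_per_sec, writes_per_sec):
    """Build CreateTable parameters for a table keyed by one string hash key."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": hash_key, "KeyType": "HASH"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": reads_per_sec,
            "WriteCapacityUnits": writes_per_sec,
        },
    }


def update_table_params(table_name, reads_per_sec, writes_per_sec):
    """Build UpdateTable parameters that change only the provisioned throughput."""
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": reads_per_sec,
            "WriteCapacityUnits": writes_per_sec,
        },
    }


# Figures from the slide; the key attribute name "Id" is an assumption.
create_req = create_table_params("foo", "Id", 100, 150)
update_req = update_table_params("foo", 10000, 4500)
print(update_req["ProvisionedThroughput"])
```

With the boto3 client, dicts like these could be passed as `client.create_table(**create_req)` and `client.update_table(**update_req)`.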


Durable Low Latency

WRITES
• Continuously replicated to 3 AZs
• Quorum acknowledgment
• Persisted to disk (custom SSD)

READS
• Strongly or eventually consistent
• No trade-off in latency


DynamoDB Primitives

• simple API
• rich query support
• fast application development


DynamoDB Concepts

table


DynamoDB Concepts

table

items


DynamoDB Concepts

attributes

items

table

schema-less: schema is defined per attribute


DynamoDB Concepts

attributes

items

table

scalar data types
• number, string, and binary

multi-valued types
• string set, number set, and binary set


DynamoDB Concepts

hash

hash keys
• mandatory for all items in a table
• key-value access pattern

Writes: PutItem, UpdateItem, DeleteItem, BatchWriteItem
Reads: GetItem, BatchGetItem


Hash = Distribution Key

partition 1 .. N

hash keys
• mandatory for all items in a table
• key-value access pattern
• determines data distribution


Hash = Distribution Key

large number of unique hash keys
+
uniform distribution of workload across hash keys
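
Why both conditions matter can be seen in a small simulation. This is only an illustration: DynamoDB's real hash function and partition assignment are internal to the service, so the MD5-modulo mapping, the partition count of 4, and the key names below are all assumptions.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, partition_count):
    """Map a hash key to a partition (illustrative only, not DynamoDB's hash)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partition_count


def requests_per_partition(requested_keys, partition_count):
    """Count how many requests land on each partition."""
    counts = Counter(partition_for(k, partition_count) for k in requested_keys)
    return [counts.get(p, 0) for p in range(partition_count)]


# Many unique keys, each requested once: load spreads across partitions.
uniform = requests_per_partition([f"user-{i}" for i in range(10_000)], 4)

# One hot key receiving every request: a single partition takes all the load.
skewed = requests_per_partition(["user-1"] * 10_000, 4)

print("uniform:", uniform)
print("skewed: ", skewed)
```

In the skewed case the provisioned capacity of the other partitions goes unused, which is the situation the two guidelines above are meant to avoid.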


Range = Query

range

hash

range keys
• model 1:N relationships
• enable rich query capabilities
• composite primary key

query capabilities
• all items for a hash key
• ==, <, >, >=, <=
• “begins with”
• “between”
• sorted results
• counts
• top / bottom N values
• paged responses
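
These query capabilities follow from keeping the items under one hash key sorted by range key. A minimal in-memory sketch of that idea is shown below; the `Partition` class and its method names are hypothetical stand-ins, not part of any DynamoDB SDK.

```python
from bisect import bisect_left, bisect_right


class Partition:
    """Items for one hash key, kept sorted by range key (in-memory stand-in)."""

    def __init__(self, items):
        # items: iterable of (range_key, payload) pairs
        self._items = sorted(items, key=lambda pair: pair[0])
        self._keys = [key for key, _ in self._items]

    def between(self, low, high):
        """Return items with low <= range_key <= high."""
        start = bisect_left(self._keys, low)
        end = bisect_right(self._keys, high)
        return self._items[start:end]

    def begins_with(self, prefix):
        """Return items whose range key starts with the prefix."""
        start = bisect_left(self._keys, prefix)
        matches = []
        for key, payload in self._items[start:]:
            if not key.startswith(prefix):
                break
            matches.append((key, payload))
        return matches

    def top(self, n, descending=False):
        """Return the first n items in ascending or descending key order."""
        ordered = reversed(self._items) if descending else self._items
        return list(ordered)[:n]


# Range key = upload date, payload = file name (sample values).
user_1_files = Partition([("2013-04-23", "File1"), ("2013-03-10", "File2")])
print(user_1_files.between("2013-03-01", "2013-03-29"))
print(user_1_files.begins_with("2013-04"))
print(user_1_files.top(1, descending=True))
```

Because the keys are sorted, each condition is a contiguous slice rather than a filter over every item.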


Index Options

local secondary indexes (LSI)
• alternate range key + same hash key
• index and table data are co-located (same partition)


Projected Attributes

KEYS_ONLY INCLUDE ALL



Index Options

global secondary indexes (GSI)

any attribute indexed as new hash or range key

KEYS_ONLY INCLUDE ALL


Designing for Scale

• access pattern modeling
• use case walk-thru
• under-rated features


Access Pattern Modeling

• Method
    1. Describe the overall use case, to maintain context
    2. Identify the individual access patterns of the use case
    3. Model each access pattern to its own discrete data set
    4. Consolidate data sets into tables and indexes
• Benefits
    Single table fetch for each query
    Payloads are minimal for each access


Multi-tenant application for file storing and sharing

• UserId is the unique identifier of each user

• FileId is the unique identifier of each file, owned by a user

Use Case Walk Thru

Good PK selection: UserId (hash) + FileId (range)

use case access patterns data design


1. Users should be able to query all the files they own

2. Search by File Name

3. Search by File Type

4. Search by Date Range

5. Keep track of Shared Files

6. Search by descending order of File Size

Use Case Walk Thru

use case access patterns data design

additional (non-PK) attributes & index candidates


Users

• Hash key = UserId (S)

• Attributes = UserName (S), Email (S), Address (SS) ...

User_Files

• Hash key = UserId (S)

• Range key = FileId (S)

• Attributes = Name (S), Type (S), Size (N), Date (S), SharedFlag (S), S3key (S)

DynamoDB Data Model


Secondary Indexes

Table Name   Index Name        Attribute to Index   Projected Attributes
User_Files   NameIndex         Name                 KEYS
User_Files   TypeIndex         Type                 KEYS + Name
User_Files   DateIndex         Date                 KEYS + Name
User_Files   SharedFlagIndex   SharedFlag           KEYS + Name
User_Files   SizeIndex         Size                 KEYS + Name

Example only: the data each query must return determines the optimal projections.
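
One way this design could be declared is sketched below as CreateTable parameters, assuming all five indexes are local secondary indexes that share the UserId hash key. The `local_index` helper is hypothetical, and the throughput figures are placeholders rather than values from the talk.

```python
def local_index(index_name, range_attr, include=()):
    """Describe one local secondary index on UserId plus an alternate range key."""
    if include:
        projection = {"ProjectionType": "INCLUDE", "NonKeyAttributes": list(include)}
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}
    return {
        "IndexName": index_name,
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},
            {"AttributeName": range_attr, "KeyType": "RANGE"},
        ],
        "Projection": projection,
    }


USER_FILES_TABLE = {
    "TableName": "User_Files",
    # Every attribute used as a table or index key needs a declared type.
    "AttributeDefinitions": [
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "FileId", "AttributeType": "S"},
        {"AttributeName": "Name", "AttributeType": "S"},
        {"AttributeName": "Type", "AttributeType": "S"},
        {"AttributeName": "Date", "AttributeType": "S"},
        {"AttributeName": "SharedFlag", "AttributeType": "S"},
        {"AttributeName": "Size", "AttributeType": "N"},
    ],
    "KeySchema": [
        {"AttributeName": "UserId", "KeyType": "HASH"},
        {"AttributeName": "FileId", "KeyType": "RANGE"},
    ],
    "LocalSecondaryIndexes": [
        local_index("NameIndex", "Name"),
        local_index("TypeIndex", "Type", include=["Name"]),
        local_index("DateIndex", "Date", include=["Name"]),
        local_index("SharedFlagIndex", "SharedFlag", include=["Name"]),
        local_index("SizeIndex", "Size", include=["Name"]),
    ],
    # Placeholder capacity; size this from the expected request rates.
    "ProvisionedThroughput": {"ReadCapacityUnits": 10, "WriteCapacityUnits": 10},
}

print([index["IndexName"] for index in USER_FILES_TABLE["LocalSecondaryIndexes"]])
```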


Access Pattern 1

• Find all files owned by a user
    Query User_Files table (UserId = 2)

UserId (hash)  FileId (range)  Name   Date        Type  SharedFlag  Size  S3key
1              1               File1  2013-04-23  JPG               1000  bucket\1
1              2               File2  2013-03-10  PDF   Y           100   bucket\2
2              3               File3  2013-03-10  PNG   Y           2000  bucket\3
2              4               File4  2013-03-10  DOC               3000  bucket\4
3              5               File5  2013-04-10  TXT               400   bucket\5


Access Pattern 2

• Search by file Name
    Query (IndexName = NameIndex, UserId = 1, Name = File1)

NameIndex

UserId (hash)  Name (range)  FileId
1              File1         1
1              File2         2
2              File3         3
2              File4         4
3              File5         5


Access Pattern 3

• Search for file name by file Type
    Query (IndexName = TypeIndex, UserId = 2, Type = DOC)

TypeIndex (projection: Name)

UserId (hash)  Type (range)  FileId  Name
1              JPG           1       File1
1              PDF           2       File2
2              DOC           4       File4
2              PNG           3       File3
3              TXT           5       File5


Access Pattern 4

• Search for file name by Date range
    Query (IndexName = DateIndex, UserId = 1, Date between 2013-03-01 and 2013-03-29)

DateIndex (projection: Name)

UserId (hash)  Date (range)  FileId  Name
1              2013-03-10    2       File2
1              2013-04-23    1       File1
2              2013-03-10    3       File3
2              2013-03-10    4       File4
3              2013-04-10    5       File5
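
The date-range Query could be expressed roughly as follows, as a request dict in the low-level API shape. This sketch uses the expression syntax of the current DynamoDB API; the helper name `date_range_query` is hypothetical, and the `#d` placeholder is there because, as far as I know, `Date` is a reserved word in expressions. Storing dates as ISO-style strings is what makes the string comparison match chronological order.

```python
def date_range_query(user_id, start_date, end_date):
    """Build Query parameters for one user's files within a date range."""
    return {
        "TableName": "User_Files",
        "IndexName": "DateIndex",
        "KeyConditionExpression": "UserId = :uid AND #d BETWEEN :start AND :end",
        # Placeholder name for the Date attribute used in the expression.
        "ExpressionAttributeNames": {"#d": "Date"},
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }


query = date_range_query("1", "2013-03-01", "2013-03-29")
print(query["KeyConditionExpression"])

# YYYY-MM-DD strings sort in the same order as the dates they represent.
in_range = "2013-03-01" <= "2013-03-10" <= "2013-03-29"
out_of_range = "2013-03-01" <= "2013-04-23" <= "2013-03-29"
```

Against the DateIndex rows above, only File2 (2013-03-10) falls inside the requested range for UserId 1.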


Access Pattern 5

• Search for names of Shared files
    Query (IndexName = SharedFlagIndex, UserId = 1, SharedFlag = Y)

SharedFlagIndex (projection: Name)

UserId (hash)  SharedFlag (range)  FileId  Name
1              Y                   2       File2
2              Y                   3       File3
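
SharedFlagIndex holds only two of the five files because an item appears in an index only when it has the indexed attribute. The in-memory sketch below illustrates that behaviour; `build_sparse_index` is a hypothetical helper and the rows are copied from the User_Files table in Access Pattern 1.

```python
FILES = [
    {"UserId": "1", "FileId": "1", "Name": "File1"},
    {"UserId": "1", "FileId": "2", "Name": "File2", "SharedFlag": "Y"},
    {"UserId": "2", "FileId": "3", "Name": "File3", "SharedFlag": "Y"},
    {"UserId": "2", "FileId": "4", "Name": "File4"},
    {"UserId": "3", "FileId": "5", "Name": "File5"},
]


def build_sparse_index(items, indexed_attr, projected):
    """Index only the items that carry the indexed attribute."""
    present = [item for item in items if indexed_attr in item]
    present.sort(key=lambda item: (item["UserId"], item[indexed_attr]))
    return [{k: item[k] for k in projected if k in item} for item in present]


shared_index = build_sparse_index(
    FILES, "SharedFlag", ["UserId", "SharedFlag", "FileId", "Name"]
)
shared_for_user_1 = [e["Name"] for e in shared_index if e["UserId"] == "1"]
print(len(shared_index), shared_for_user_1)
```

Leaving the flag unset on unshared files, rather than writing "N", is what keeps the index small.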


Access Pattern 6

• Query for file names by descending order of file Size
    Query (IndexName = SizeIndex, UserId = 1, ScanIndexForward = false)

SizeIndex (projection: Name)

UserId (hash)  Size (range)  FileId  Name
1              100           2       File2
1              1000          1       File1
2              2000          3       File3
2              3000          4       File4
3              400           5       File5
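
What ScanIndexForward = false returns can be checked with a small in-memory sketch built from the User_Files rows in Access Pattern 1. The `query_size_index` function is a hypothetical stand-in for the Query call, not an SDK API.

```python
USER_FILES = [
    {"UserId": "1", "FileId": "1", "Name": "File1", "Size": 1000},
    {"UserId": "1", "FileId": "2", "Name": "File2", "Size": 100},
    {"UserId": "2", "FileId": "3", "Name": "File3", "Size": 2000},
    {"UserId": "2", "FileId": "4", "Name": "File4", "Size": 3000},
    {"UserId": "3", "FileId": "5", "Name": "File5", "Size": 400},
]


def query_size_index(items, user_id, scan_index_forward=True):
    """Return one user's files ordered by Size, ascending unless told otherwise."""
    matches = [item for item in items if item["UserId"] == user_id]
    matches.sort(key=lambda item: item["Size"], reverse=not scan_index_forward)
    return [(item["Size"], item["FileId"], item["Name"]) for item in matches]


largest_first = query_size_index(USER_FILES, "1", scan_index_forward=False)
print(largest_first)
```

Size is stored as a number (N), so the ordering is numeric: 1000 sorts after 100, which would not hold for string comparison of unpadded values.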


• Users
• User_Files
    NameIndex
    TypeIndex
    DateIndex
    SharedFlagIndex
    SizeIndex

Use Case Walk Thru

use case access patterns data design

Secondary Indexing Options

Local (LSI)                                    Global (GSI)
Strongly or eventually consistent reads        Eventually consistent reads only
Total storage limit per hash key (10GB)        No storage limit
Read and write units consumed from the table   Read and write units defined per GSI


Under-Rated Features

• Consistent Reads
    Inventory, shopping cart applications
• Atomic Counters
    Increment and return new value in same operation
• Conditional Writes
    Expected value before write; fails on mismatch
    “State machine” use cases
• Sparse Indexes
    Optimal for accessing boolean values
    Popular: identify updated items for background clean-up process
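
The contracts of atomic counters and conditional writes can be sketched with a tiny in-memory table. This shows only the semantics: the real service applies these operations atomically on the server side, and `TinyTable`, its methods, the exception name, and the `Status` and `Downloads` attributes are all hypothetical.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored value does not match the expected value."""


class TinyTable:
    """In-memory stand-in used to illustrate the two write contracts."""

    def __init__(self):
        self._items = {}

    def put(self, key, item):
        self._items[key] = dict(item)

    def increment(self, key, attr, by=1):
        """Atomic counter: add to a number and return the new value."""
        item = self._items.setdefault(key, {})
        item[attr] = item.get(attr, 0) + by
        return item[attr]

    def update_if(self, key, attr, expected, new_value):
        """Conditional write: apply only if the current value is as expected."""
        item = self._items[key]
        if item.get(attr) != expected:
            raise ConditionalCheckFailed(f"{attr} is {item.get(attr)!r}")
        item[attr] = new_value


table = TinyTable()
table.put("file-3", {"Status": "UPLOADING"})

# State machine step: UPLOADING -> READY succeeds once.
table.update_if("file-3", "Status", expected="UPLOADING", new_value="READY")

# Repeating the same transition fails because the state has already moved on.
try:
    table.update_if("file-3", "Status", expected="UPLOADING", new_value="READY")
    second_attempt = "applied"
except ConditionalCheckFailed:
    second_attempt = "rejected"

downloads = [table.increment("file-3", "Downloads") for _ in range(3)]
print(second_attempt, downloads)
```

The rejected second attempt is what makes conditional writes a fit for state machines: two workers racing on the same transition cannot both succeed.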


Moving to DynamoDB

• denormalized entity-relationship
• live NoSQL migrations
• leaving DynamoDB


Migration Considerations

• Data Layer
    Single-table use case
    Denormalized preferred, to reduce multi-object access
        • Can be done on target design for high-scale patterns
    Minimal dependencies (FKs, triggers, procedures)
    Design target for scale
• Application Code
    Review and validate access patterns
    API / SQL mapping and conversion
        • DynamoDB = simple APIs: reads (2: GetItem, BatchGetItem); writes (4: PutItem, UpdateItem, DeleteItem, BatchWriteItem); rich query (1: Query); scan (1: Scan)


NoSQL: Live Migration Checklist (example)

• Design for Scale
    Optimal target environment
• Identify minimal data unit to be migrated (user data)
• Plan migration activities
    Map source to target
    Identify transforms
    Update app to work with new target
    Test and verify
• Implement the migration
    Triggered by user login
    Login simulation (loop)
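
The last step of the checklist might look something like this sketch, where each user's data moves the first time they log in and a loop simulates logins for everyone else. Plain dicts stand in for the source database and the DynamoDB table; the function names, the `to_item` transform, and the sample rows are made up for illustration.

```python
def migrate_user(user_id, source, target, transform):
    """Copy one user's data to the target unless it is already there."""
    if user_id in target:
        return False  # already migrated, nothing to do
    target[user_id] = transform(source[user_id])
    return True


def on_login(user_id, source, target, transform):
    """Login hook: migrate on first login, then serve from the target."""
    migrate_user(user_id, source, target, transform)
    return target[user_id]


def simulate_logins(source, target, transform):
    """Login simulation loop: migrate every user who has not logged in yet."""
    migrated = 0
    for user_id in list(source):
        if migrate_user(user_id, source, target, transform):
            migrated += 1
    return migrated


def to_item(row):
    """Example transform from a source row to a target item."""
    return {"UserName": row["name"], "Email": row["email"]}


# Made-up sample rows standing in for the legacy database.
legacy = {
    "1": {"name": "alice", "email": "alice@example.com"},
    "2": {"name": "bob", "email": "bob@example.com"},
}
dynamo = {}

first_login = on_login("1", legacy, dynamo, to_item)
remaining = simulate_logins(legacy, dynamo, to_item)
print(first_login, remaining, len(dynamo))
```

Because `migrate_user` skips users already present in the target, the login hook and the simulation loop can run side by side without copying anyone twice.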


Data Movement Templates


Getting Started

• zero cost dev & test
• use case validation
• design workshop


Next Steps: Hands On

• DynamoDB Local
    Disconnected development with full API support
        • No network
        • No usage costs
        • No SLA
    http://aws.amazon.com/dynamodb/developer-resources/
• Use Case Assessment
    Want to test an idea, but unsure if your use case is a good fit?
• Design Call
    Ready to start coding (or scaling) and want to ensure optimal design?


Register at: http://aws.amazon.com/about-aws/events/

Databases in the Cloud Webinar Series

• Data Modeling and Best Practices for Scaling your Application with Amazon DynamoDB

• February 27, 2014 10:00am PST

High-scale applications like social gaming, chat, and voting

Model these applications using DynamoDB, including how to use building blocks such as conditional writes, consistent reads, and batch operations.

Incorporate best practices such as index projections, item sharding, and parallel scan for maximum scalability

Next Steps – More Information


• The Mill DTLV – AWS Perk

• Feb 27 – DynamoDB Webinar (Advanced Schema Design)

• March 10 – SXSW AWS Sessions

• March 20 – Las Vegas Meetup – Redshift

• March 26 – AWS Summit – San Francisco

Upcoming AWS Events


Questions David Pearson [email protected]