(PFC402) Bigger, Faster: Performance Tips for High Speed and High Volume Applications | AWS...


DESCRIPTION

This expert-level session covers best practices and tips for reducing latency to the absolute minimum when working with high-volume, high-speed datasets in Amazon DynamoDB. We take a deep dive into design patterns and access patterns geared toward low latency at very high throughput, cover approaches customers have used to achieve low latencies, and have a customer speak about their experience running DynamoDB at scale.


November 13, 2014 | Las Vegas, NV

Ben Clay, Amazon DynamoDB

Brett McCleary, Precision Exams

• Independent scaling of throughput and storage

• Supports both document and key-value data models

Example Schema: Webstore Orders

Hash Key (string)    Customer ID
Range Key (string)   Timestamp
Attribute (map)      Item ID : quantity
Attribute (number)   Order ID
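To make the schema concrete, here is a minimal sketch in Python with boto3 (not shown in the talk) of writing one order item. The table name "WebstoreOrders" and the attribute names are assumptions for illustration.

import time
import boto3

# Assumed table with hash key CustomerId (string) and range key Timestamp (string).
orders = boto3.resource("dynamodb").Table("WebstoreOrders")

orders.put_item(
    Item={
        "CustomerId": "alice",                      # hash key (string)
        "Timestamp": str(int(time.time() * 1000)),  # range key (string)
        "Items": {"item-123": 2, "item-456": 1},    # map: item ID -> quantity
        "OrderId": 98765,                           # number attribute
    }
)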

[Diagram: items #1 through #N placed across Partitions #1, #2, #3]

• One item, one partition

• Placement based on key

[Diagram: provisioned throughput split evenly across Partitions #1, #2, #3 at 1000 WPS each]
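The 1000 WPS per partition in the diagram comes from dividing the table's provisioned throughput evenly across its partitions; a trivial sketch of that arithmetic (the provisioned figure is an assumed example matching the diagram):

# Each partition receives an even share of the table's provisioned throughput,
# so a single hash key can never draw more than its partition's share.
provisioned_wps = 3000          # writes/sec provisioned on the table (assumed)
partitions = 3                  # partitions in the diagram above
per_partition_wps = provisioned_wps / partitions
print(per_partition_wps)        # 1000.0 WPS available per partition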

[Diagram: clients spreading requests across table Partitions 1 through 6]

[Diagram: request path on each client machine: Application, OS, SDK, then the network to DynamoDB]
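Every request passes through the application, the OS, and the SDK before it reaches DynamoDB, so client-side configuration is part of the latency budget. A hedged sketch of tuning the boto3/botocore client follows; the specific timeout, retry, and pool values are illustrative assumptions, not recommendations from the talk.

import boto3
from botocore.config import Config

# Shorter timeouts fail fast, and a larger connection pool lets concurrent
# requests reuse established HTTPS connections instead of re-handshaking.
config = Config(
    connect_timeout=1,
    read_timeout=1,
    retries={"max_attempts": 3},
    max_pool_connections=50,
)
dynamodb = boto3.client("dynamodb", region_name="us-west-2", config=config)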

[Chart: consumed vs. provisioned throughput over time]

[Diagram: clients concentrating traffic on a subset of the table's partitions]

Table grows over time, skew becomes noticeable

Webstore Orders

Hash Customer ID

Range Timestamp

Attrib Items ordered

Attrib Order ID

[Diagram: clients accessing the Orders table across Partitions 1 through 6]

Webstore Orders

Hash Customer ID

Range Timestamp

Attrib Items ordered

Attrib Order ID

Attrib Household ID

Household Index

Hash Household ID

Range Timestamp

Attrib Items ordered

Attrib Order ID

Attrib Customer ID
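A minimal sketch (boto3) of declaring the Household index as a global secondary index when the Webstore Orders table is created; the table name, index name, and throughput figures are assumptions for illustration.

import boto3

client = boto3.client("dynamodb")
client.create_table(
    TableName="WebstoreOrders",
    AttributeDefinitions=[
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "Timestamp", "AttributeType": "S"},
        {"AttributeName": "HouseholdId", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "CustomerId", "KeyType": "HASH"},
        {"AttributeName": "Timestamp", "KeyType": "RANGE"},
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "HouseholdIndex",
            "KeySchema": [
                {"AttributeName": "HouseholdId", "KeyType": "HASH"},
                {"AttributeName": "Timestamp", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            "ProvisionedThroughput": {"ReadCapacityUnits": 100, "WriteCapacityUnits": 100},
        }
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 1000, "WriteCapacityUnits": 1000},
)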

Indexing

[Diagram: items written to the table are propagated to the index; read clients query the index while read + write clients use the table]

Day of Order Index

Hash Day of order

Range Order ID

Attrib Items ordered

Attrib Customer ID

Attrib Timestamp

Webstore Orders

Hash Customer ID

Range Timestamp

Attrib Items ordered

Attrib Order ID

Attrib Day of order
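A sketch (boto3) of reading all of one day's orders through the Day of Order index; the index name, attribute names, and date format are assumptions.

import boto3
from boto3.dynamodb.conditions import Key

orders = boto3.resource("dynamodb").Table("WebstoreOrders")

# Query the index by its hash key (the day) to get that day's orders across all customers.
resp = orders.query(
    IndexName="DayOfOrderIndex",
    KeyConditionExpression=Key("DayOfOrder").eq("2014-11-13"),
)
for item in resp["Items"]:
    print(item["OrderId"], item["CustomerId"])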

Indexing

[Diagram: clients writing new orders to the table; each write propagates from the table partitions to the Day-of-order index partitions]

Alternate Approach: Scanning

[Diagram: Alice's and Bob's October through December items spread across Partitions #1 and #2; a Scan must read every partition P1 through P9]

• Delete old items from the client side

[Diagram: the same Alice and Bob items across Partitions #1 and #2, showing client-side deletion of the old items]

• Takeaway: Table growth can impact throughput per key

• Important when: Accumulating infrequently-read data

• Controlling table growth with deletes works but…

• Deleting items from the client = 2x write cost: each item costs one write when stored and another when deleted (see the sketch below)

• Can we achieve cheaper deletes AND scans?
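Before the cheaper alternative, this is roughly what client-side cleanup looks like: scan for old items, then delete each one, paying write capacity for every delete. A hedged sketch; the table name, key names, and cutoff format are assumptions, and pagination of the Scan is omitted.

import boto3
from boto3.dynamodb.conditions import Attr

orders = boto3.resource("dynamodb").Table("WebstoreOrders")
cutoff = "2014-11-01"  # assumed range-key/date format

# Scan for old items (reads the whole table), then delete them one by one.
# Each delete consumes write capacity, so every item is effectively paid for
# twice: once when it is written and once when it is removed.
resp = orders.scan(FilterExpression=Attr("Timestamp").lt(cutoff))
with orders.batch_writer() as batch:
    for item in resp["Items"]:
        batch.delete_item(
            Key={"CustomerId": item["CustomerId"], "Timestamp": item["Timestamp"]}
        )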

[Diagram: the same orders split into monthly tables (Oct Table, Nov Table, Dec Table); "Scan for last month" touches only the most recent table]

• Takeaway: Time series data chunks very well

• Important when: Big, growing time-series tables (see the sketch below)

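A minimal sketch of the chunking idea: route each order to the table for its month, so that old data can be dropped by deleting an entire table (no per-item write cost) and "scan last month" touches only one small table. The table-naming convention and attribute names are assumptions.

import datetime
import boto3

dynamodb = boto3.resource("dynamodb")

def orders_table(day):
    # One table per month, e.g. "Orders_2014_11" (assumed naming convention).
    return dynamodb.Table("Orders_%04d_%02d" % (day.year, day.month))

today = datetime.date.today()
orders_table(today).put_item(
    Item={"CustomerId": "alice", "Timestamp": today.isoformat(), "OrderId": 1}
)

# Scanning "last month" only reads that month's table, not the full history;
# retiring October's data is a single DeleteTable call on Orders_2014_10.
last_month = today.replace(day=1) - datetime.timedelta(days=1)
recent_items = orders_table(last_month).scan()["Items"]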

[Diagram: analytics path: 1. Scan the Orders table, 2. Export the items to an EMR fleet]

[Diagram: 1. Update records from the Orders table flow into a stream, 2. a stream processor fleet exports them]

Use this approach when you need near-real-time data.
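A hedged sketch of the stream-processor side, polling a DynamoDB stream with boto3; the stream ARN is a placeholder, export_record() is a hypothetical processing step, and a real consumer would track shard iterators and checkpoints rather than re-reading from TRIM_HORIZON.

import boto3

streams = boto3.client("dynamodbstreams")
stream_arn = "arn:aws:dynamodb:us-west-2:123456789012:table/Orders/stream/..."  # placeholder

def export_record(record):
    # Hypothetical export step (e.g. write to the warehouse).
    print(record["eventName"], record["dynamodb"].get("Keys"))

# Walk each shard and read its update records in order.
description = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]
for shard in description["Shards"]:
    iterator = streams.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard["ShardId"],
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    for record in streams.get_records(ShardIterator=iterator)["Records"]:
        export_record(record)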

“It is not the strongest of the species that survives, nor the most intelligent, but the one most adaptable to change.”

-- Charles Darwin

[Diagram: original architecture: web tier (web01, web02, ... web n), application tier (app01, app02, ... app n), and reporting server rpt01]

[Diagram: the same web and application tiers with a data warehouse process fanning out to satellite servers (sat01, sat02, ... sat n) and reporting server rpt01]

[Diagram: the same web and application tiers with a single warehouse server dw01 backed by Amazon DynamoDB]

Test Packet Answer Record

Hash Key (string) Test Packet ID

Attribute (string) Answer JSON

Test Packet Response Record

Hash Key (string) Test Packet ID

Range Key (string) Test Packet Response ID

Attribute (string) Create Timestamp

Attribute (string) Post Date

Attribute (string) Response JSON

{
  "testPacketId": 11193654,
  "answerJson": {
    "SQ22545": {"responses": {"010": "Y"}, "awardedPts": 1},
    "SQ22546": {"responses": {"040": "Y"}, "awardedPts": 1},
    "21137":   {"responses": {"030": "Y"}, "awardedPts": 0}
    ...
  }
}

{
  "testPacketId": 11193654,
  "testPacketResponseId": "SQ22545",
  "createdTimeStamp": "1412609315419",
  "postDate": "1412609315419",
  "responseJson": {"i": "26492", "f": "N", "t": 0, "r": {"010": "Y"}}
}
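A minimal sketch (boto3) of writing a Test Packet Response Record shaped like the JSON above; the table name and the exact attribute casing are assumptions.

import json
import time
import boto3

responses = boto3.resource("dynamodb").Table("TestPacketResponseRecord")
now_ms = str(int(time.time() * 1000))

responses.put_item(
    Item={
        "TestPacketId": "11193654",         # hash key (string)
        "TestPacketResponseId": "SQ22545",  # range key (string)
        "CreateTimestamp": now_ms,
        "PostDate": now_ms,
        "ResponseJson": json.dumps({"i": "26492", "f": "N", "t": 0, "r": {"010": "Y"}}),
    }
)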

[Diagram: load test setup: test hosts tst01 and tst02 against Amazon DynamoDB, and a fleet of Grinder clients driving load against Amazon DynamoDB]

[Chart: latency (ms) vs. provisioned capacity (300, 600, 1200, 2200, 5000 units): P90 client latency, average client latency, and average server latency]

http://bit.ly/awsevals
