Deep dive on DynamoDB to create scalable apps
Eduardo Horai
AWS Solutions Architect
1. What is DynamoDB?
DynamoDB is a managed NoSQL database service.
Store and retrieve any amount of data.
Serve any level of request traffic.
Without the operational burden.
Consistent, predictable performance.
Single digit millisecond latency.
Backed by solid-state drives.
Key/attribute pairs. No schema required.
Easy to create. Easy to adjust.
Flexible data model.
No table size limits. Unlimited storage.
No downtime.
Seamless scalability.
Consistent, disk-only writes.
Replication across data centers and availability zones.
Durable.
Focus on your app.
Reserve IOPS for reads and writes.
Scale up or down at any time.
Provisioned throughput.
Pay per capacity unit.
Priced per hour of provisioned throughput.
Write throughput.
Provisioned units = size of item (rounded up to a multiple of 1 KB) x writes per second
Atomic increment and decrement.
Optimistic concurrency control: conditional writes.
Consistent writes.
Item level transactions only.
Puts, updates and deletes are ACID.
Transactions.
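The conditional-write and atomic-increment behaviour described above can be pictured with a small in-memory analogy. This is only a sketch: the class `ConditionalWriteSketch` and its methods are hypothetical names, it uses a plain `ConcurrentHashMap` rather than the DynamoDB API, and it models a single numeric attribute per item.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory analogy of item-level conditional writes; not the DynamoDB API. */
final class ConditionalWriteSketch {
    // Hypothetical "table": item key -> one numeric attribute value.
    private final ConcurrentMap<String, Long> items = new ConcurrentHashMap<>();

    /** Unconditional put: overwrites whatever is stored. */
    void put(String key, long value) {
        items.put(key, value);
    }

    /** Conditional write: applied only if the stored value still equals 'expected'. */
    boolean putIfValueEquals(String key, long expected, long newValue) {
        return items.replace(key, expected, newValue);
    }

    /** Atomic increment (negative delta decrements); returns the new value. */
    long add(String key, long delta) {
        return items.merge(key, delta, Long::sum);
    }

    Long get(String key) {
        return items.get(key);
    }

    public static void main(String[] args) {
        ConditionalWriteSketch table = new ConditionalWriteSketch();
        table.put("score", 10L);
        // A writer that read 10 succeeds; a second writer holding the stale 10 is rejected.
        System.out.println(table.putIfValueEquals("score", 10L, 11L));
        System.out.println(table.putIfValueEquals("score", 10L, 12L));
        System.out.println(table.add("score", 5L));
    }
}
```

A rejected conditional write is the signal for the caller to re-read the item and retry, which is the essence of optimistic concurrency control.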
Read throughput.
Strong or eventual consistency.
Strongly consistent reads: provisioned units = size of item (rounded up to a multiple of 4 KB) x reads per second
Eventually consistent reads: provisioned units = (size of item x reads per second) / 2
Same latency expectations.
Mix and match at 'read time'.
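The write and read throughput formulas above can be turned into a small calculator. This is a sketch under the unit sizes quoted on the slides (1 KB per write unit, 4 KB per read unit, eventually consistent reads costing half); the class and method names are hypothetical.

```java
/** Capacity-unit arithmetic from the slides; names are illustrative only. */
final class CapacityUnits {
    static final long WRITE_UNIT_BYTES = 1024;     // item size is rounded up to 1 KB steps
    static final long READ_UNIT_BYTES = 4 * 1024;  // item size is rounded up to 4 KB steps

    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** Write units = ceil(item size / 1 KB) x writes per second. */
    static long writeUnits(long itemSizeBytes, long writesPerSecond) {
        return Math.max(1, ceilDiv(itemSizeBytes, WRITE_UNIT_BYTES)) * writesPerSecond;
    }

    /** Read units = ceil(item size / 4 KB) x reads per second, halved for eventual consistency. */
    static long readUnits(long itemSizeBytes, long readsPerSecond, boolean stronglyConsistent) {
        long strong = Math.max(1, ceilDiv(itemSizeBytes, READ_UNIT_BYTES)) * readsPerSecond;
        return stronglyConsistent ? strong : ceilDiv(strong, 2);
    }

    public static void main(String[] args) {
        // A 1.5 KB item written 100 times per second needs 200 write units.
        System.out.println(writeUnits(1536, 100));
        // A 3 KB item read 100 times per second: 100 units strong, 50 eventual.
        System.out.println(readUnits(3072, 100, true));
        System.out.println(readUnits(3072, 100, false));
    }
}
```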
Provisioned throughput is managed by DynamoDB.
Data is partitioned and managed by DynamoDB.
• DynamoDB automatically partitions data by the hash key
  – Hash key spreads data & workload across partitions
• Auto-partitioning driven by:
  – Data set size
  – Provisioned throughput
• Tip: a large number of unique hash keys and a uniform distribution of workload across them lend themselves well to massive scale!
Partitioning
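To see why many unique hash keys with an even workload scale well, the partitioning idea above can be simulated. This is a toy model: DynamoDB's real hash function and partition count are internal to the service, so `String.hashCode()` and the fixed partition count used here are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of hash-key partitioning; the hash function is an assumption. */
final class HashPartitionSketch {
    /** Maps a hash key to one of partitionCount partitions. */
    static int partitionFor(String hashKey, int partitionCount) {
        return Math.floorMod(hashKey.hashCode(), partitionCount);
    }

    /** Counts how many requests land on each partition. */
    static int[] load(List<String> requestedKeys, int partitionCount) {
        int[] counts = new int[partitionCount];
        for (String key : requestedKeys) {
            counts[partitionFor(key, partitionCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> uniform = new ArrayList<>();
        List<String> hot = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            uniform.add("user-" + i); // ten different hash keys
            hot.add("user-1");        // the same hash key ten times
        }
        System.out.println(Arrays.toString(load(uniform, 8)));
        System.out.println(Arrays.toString(load(hot, 8)));
    }
}
```

With distinct keys the requests spread over all partitions, while repeated requests for one key pile onto a single partition and can use only that partition's share of the throughput.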
Tiered bandwidth pricing: aws.amazon.com/dynamodb/pricing
Indexed data storage.
Up to 53% savings with a 1-year reservation.
Up to 76% savings with a 3-year reservation.
Reserved capacity.
Session based to minimize latency. Uses the Amazon Security Token Service.
Handled by AWS SDKs. Integrates with IAM.
Authentication.
CloudWatch metrics: latency, consumed read and write throughput, errors and throttling.
Monitoring.
Libraries, mappers and mocks.
ColdFusion, Django, Erlang, Java, .Net, Node.js, Perl, PHP, Python, Ruby
http://j.mp/dynamodb-libs
2. NoSQL Data Modeling
Table
id = 100, date = 2012-05-16-09-00-10, total = 25.00
id = 101, date = 2012-05-15-15-00-11, total = 35.00
id = 101, date = 2012-05-16-12-00-10, total = 100.00
Item: one entry in the table, e.g. the item with id = 100
Attribute: one name/value pair in an item, e.g. total = 25.00
Tables do not require a formal schema.
Items are an arbitrarily sized hash.
Where is the schema?
Items are indexed by primary and secondary keys.
Primary keys can be composite.
Secondary keys are local to the table.
Indexing.
Columns: ID, Date, Total
Hash key: ID
Composite primary key: ID (hash key) + Date (range key)
Secondary range key: Total
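One way to picture a composite primary key is a map of hash keys, each holding its items sorted by range key. The sketch below uses the ID/Date/Total example from the slides; `CompositeKeyTableSketch` and its methods are hypothetical names and this is an in-memory illustration, not how DynamoDB stores data internally.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory picture of a hash + range (composite) primary key. */
final class CompositeKeyTableSketch {
    // hash key (id) -> range key (date) -> remaining attributes of the item
    private final Map<String, NavigableMap<String, Map<String, String>>> partitions = new HashMap<>();

    void putItem(String id, String date, Map<String, String> attributes) {
        partitions.computeIfAbsent(id, k -> new TreeMap<>()).put(date, attributes);
    }

    /** Point lookup: both parts of the composite key are required. */
    Map<String, String> getItem(String id, String date) {
        NavigableMap<String, Map<String, String>> items = partitions.get(id);
        return items == null ? null : items.get(date);
    }

    /** All items sharing one hash key, ordered by range key. */
    NavigableMap<String, Map<String, String>> itemsFor(String id) {
        NavigableMap<String, Map<String, String>> items = partitions.get(id);
        return items != null ? items : new TreeMap<String, Map<String, String>>();
    }

    public static void main(String[] args) {
        CompositeKeyTableSketch table = new CompositeKeyTableSketch();
        table.putItem("100", "2012-05-16-09-00-10", Collections.singletonMap("total", "25.00"));
        table.putItem("101", "2012-05-15-15-00-11", Collections.singletonMap("total", "35.00"));
        table.putItem("101", "2012-05-16-12-00-10", Collections.singletonMap("total", "100.00"));
        System.out.println(table.itemsFor("101").keySet());
    }
}
```

Because the dates use a fixed-width, most-significant-first format, sorting them as strings also sorts them chronologically.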
Programming DynamoDB.
Small but perfectly formed API.
CreateTable
Scan
UpdateTable
DeleteTable
DescribeTable
ListTables
Query
PutItem
GetItem
UpdateItem
DeleteItem
BatchGetItem
BatchWriteItem
PutItem, UpdateItem and DeleteItem can take optional conditions for the operation.
UpdateItem performs atomic increments.
Conditional updates.
One API call, multiple items
BatchGet returns multiple items by key.
BatchWrite performs up to 25 put or delete operations.
Throughput is measured by IO, not API calls.
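Since a batch write carries at most 25 put or delete operations, a larger set of writes has to be split up on the client side. A minimal chunking helper might look like the sketch below; `BatchChunker` is a hypothetical name, and the element type is left generic so the helper does not depend on any SDK class.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a list of pending write requests into batches of a bounded size. */
final class BatchChunker {
    static final int MAX_BATCH_WRITE_ITEMS = 25; // limit quoted on the slide

    static <T> List<List<T>> chunk(List<T> requests, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < requests.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, requests.size());
            batches.add(new ArrayList<>(requests.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> pending = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            pending.add(i);
        }
        for (List<Integer> batch : chunk(pending, MAX_BATCH_WRITE_ITEMS)) {
            System.out.println(batch.size());
        }
    }
}
```

Batching reduces the number of API round trips only; as the slide notes, consumed throughput still depends on the IO performed for each item.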
Query returns items by key.
Scan reads the whole table sequentially.
Query vs Scan
Query patterns
Retrieve all items by hash key.
Range key conditions: ==, <, >, >=, <=, begins with, between.
Counts. Top and bottom n values. Paged responses.
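The difference between Query and Scan, and the range key conditions listed above, can be imitated with sorted maps. The sketch below is an in-memory stand-in that assumes a table with hash key `user_id`, range key `game` and a numeric `score`, matching the example used later in the deck; the class and method names are hypothetical and none of this is the DynamoDB API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for a hash + range table; illustrative only. */
final class QueryPatternsSketch {
    // hash key -> (range key -> value); range keys are kept in sorted order
    private final Map<String, NavigableMap<String, Integer>> table = new HashMap<>();

    void put(String hashKey, String rangeKey, int value) {
        table.computeIfAbsent(hashKey, k -> new TreeMap<>()).put(rangeKey, value);
    }

    private NavigableMap<String, Integer> partition(String hashKey) {
        NavigableMap<String, Integer> items = table.get(hashKey);
        return items != null ? items : new TreeMap<String, Integer>();
    }

    /** Query: range key between 'from' and 'to', inclusive. */
    List<String> between(String hashKey, String from, String to) {
        return new ArrayList<>(partition(hashKey).subMap(from, true, to, true).keySet());
    }

    /** Query: range key begins with 'prefix'. */
    List<String> beginsWith(String hashKey, String prefix) {
        List<String> out = new ArrayList<>();
        for (String key : partition(hashKey).tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // sorted order: no later key can match
            }
            out.add(key);
        }
        return out;
    }

    /** Query: the last n range keys, in descending order ("top n"). */
    List<String> lastN(String hashKey, int n) {
        List<String> out = new ArrayList<>();
        for (String key : partition(hashKey).descendingKeySet()) {
            if (out.size() == n) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    /** Scan: visits every item of every hash key and filters afterwards. */
    List<String> scanWhereValueAtLeast(int min) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, NavigableMap<String, Integer>> p : table.entrySet()) {
            for (Map.Entry<String, Integer> item : p.getValue().entrySet()) {
                if (item.getValue() >= min) {
                    out.add(p.getKey() + "/" + item.getKey());
                }
            }
        }
        return out;
    }
}
```

The query methods touch only the items under one hash key, whereas the scan walks the whole table even when few items match, which is why a scan consumes far more read throughput on a large table.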
3. Example
…
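// Classes used in this section come from the AWS SDK for Java 1.x.
// Create the client with credentials read from a properties file on the classpath.
AmazonDynamoDBClient dynamoDB = new AmazonDynamoDBClient(
    new ClasspathPropertiesFileCredentialsProvider());
// Use the South America (Sao Paulo) region.
dynamoDB.setRegion(Region.getRegion(Regions.SA_EAST_1));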
Players
user_id = mza, location = Cambridge, joined = 2011-07-04
user_id = jeffbarr, location = Seattle, joined = 2012-01-20
user_id = werner, location = Worldwide, joined = 2011-05-15
// Players table: user_id (string) is the hash key.
CreateTableRequest createPlayersTable = new CreateTableRequest()
    .withTableName("Players")
    .withKeySchema(new KeySchemaElement()
        .withAttributeName("user_id")
        .withKeyType(KeyType.HASH))
    .withAttributeDefinitions(new AttributeDefinition()
        .withAttributeName("user_id")
        .withAttributeType(ScalarAttributeType.S))
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits(10L)
        .withWriteCapacityUnits(10L));
dynamoDB.createTable(createPlayersTable);
Scores
user_id = mza, game = angry-birds, score = 11,000
user_id = mza, game = tetris, score = 1,223,000
user_id = werner, game = bejewelled, score = 55,000
// Scores table: user_id is the hash key and game is the range key (both strings).
CreateTableRequest createScoresTable = new CreateTableRequest()
    .withTableName("Scores")
    .withKeySchema(
        new KeySchemaElement().withAttributeName("user_id").withKeyType(KeyType.HASH),
        new KeySchemaElement().withAttributeName("game").withKeyType(KeyType.RANGE))
    .withAttributeDefinitions(
        new AttributeDefinition().withAttributeName("user_id").withAttributeType(ScalarAttributeType.S),
        new AttributeDefinition().withAttributeName("game").withAttributeType(ScalarAttributeType.S))
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits(100L)
        .withWriteCapacityUnits(100L));
dynamoDB.createTable(createScoresTable);
Leader boards
game = angry-birds, score = 11,000, user_id = mza
game = tetris, score = 1,223,000, user_id = mza
game = tetris, score = 9,000,000, user_id = jeffbarr
// LeaderBoards table: game (string) is the hash key and score (number) is the range key.
CreateTableRequest createLeaderBoardsTable = new CreateTableRequest()
    .withTableName("LeaderBoards")
    .withKeySchema(
        new KeySchemaElement().withAttributeName("game").withKeyType(KeyType.HASH),
        new KeySchemaElement().withAttributeName("score").withKeyType(KeyType.RANGE))
    .withAttributeDefinitions(
        new AttributeDefinition().withAttributeName("game").withAttributeType(ScalarAttributeType.S),
        new AttributeDefinition().withAttributeName("score").withAttributeType(ScalarAttributeType.N))
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits(50L)
        .withWriteCapacityUnits(50L));
dynamoDB.createTable(createLeaderBoardsTable);
Query for user
// Key condition: user_id == "mza"
Map<String, Condition> keyConditions = new HashMap<String, Condition>();
keyConditions.put("user_id", new Condition()
    .withComparisonOperator(ComparisonOperator.EQ)
    .withAttributeValueList(new AttributeValue().withS("mza")));
QueryRequest queryRequest = new QueryRequest()
    .withTableName("Players")
    .withKeyConditions(keyConditions);
QueryResult result = dynamoDB.query(queryRequest);
// printItem is a helper that is not shown on the slides.
for (Map<String, AttributeValue> item : result.getItems()) {
    printItem(item);
}
Query for scores by user
// Key condition: user_id == "mza"; return only the score and game attributes.
Map<String, Condition> keyConditions = new HashMap<String, Condition>();
keyConditions.put("user_id", new Condition()
    .withComparisonOperator(ComparisonOperator.EQ)
    .withAttributeValueList(new AttributeValue().withS("mza")));
QueryRequest queryRequest = new QueryRequest()
    .withTableName("Scores")
    .withAttributesToGet("score", "game")
    .withKeyConditions(keyConditions);
QueryResult result = dynamoDB.query(queryRequest);
for (Map<String, AttributeValue> item : result.getItems()) {
    printItem(item);
}
Query for scores by user, game
// Key conditions: user_id == "mza" (hash key) and game == "tetris" (range key)
Map<String, Condition> keyConditions = new HashMap<String, Condition>();
keyConditions.put("user_id", new Condition()
    .withComparisonOperator(ComparisonOperator.EQ)
    .withAttributeValueList(new AttributeValue().withS("mza")));
keyConditions.put("game", new Condition()
    .withComparisonOperator(ComparisonOperator.EQ)
    .withAttributeValueList(new AttributeValue().withS("tetris")));
QueryRequest queryRequest = new QueryRequest()
    .withTableName("Scores")
    .withKeyConditions(keyConditions);
QueryResult result = dynamoDB.query(queryRequest);
for (Map<String, AttributeValue> item : result.getItems()) {
    printItem(item);
}
High scores by game
// Key condition: game == "tetris"
Map<String, Condition> keyConditions = new HashMap<String, Condition>();
keyConditions.put("game", new Condition()
    .withComparisonOperator(ComparisonOperator.EQ)
    .withAttributeValueList(new AttributeValue().withS("tetris")));
// ScanIndexForward = false returns items in descending range key (score) order.
QueryRequest queryRequest = new QueryRequest()
    .withTableName("LeaderBoards")
    .withKeyConditions(keyConditions)
    .withScanIndexForward(false);
QueryResult result = dynamoDB.query(queryRequest);
for (Map<String, AttributeValue> item : result.getItems()) {
    printItem(item);
}
Insert Players
// Build the new item as a map of attribute name to value, then put it into Players.
Map<String, AttributeValue> itemPlayer = new HashMap<String, AttributeValue>();
itemPlayer.put("user_id", new AttributeValue("eduardohorai"));
itemPlayer.put("location", new AttributeValue("Sao Paulo"));
itemPlayer.put("joined", new AttributeValue("27/01/2013"));
PutItemRequest putItemRequest = new PutItemRequest("Players", itemPlayer);
PutItemResult putItemResult = dynamoDB.putItem(putItemRequest);
Increase writes/reads on Scores
// Raise the provisioned throughput of the Scores table from 100 to 200 units.
UpdateTableRequest updateTableRequest = new UpdateTableRequest()
    .withTableName("Scores")
    .withProvisionedThroughput(new ProvisionedThroughput()
        .withReadCapacityUnits(200L)
        .withWriteCapacityUnits(200L));
UpdateTableResult result = dynamoDB.updateTable(updateTableRequest);
4. Using AWS Console
• aws.amazon.com/dynamodb
• aws.typepad.com/brasil/
• aws.typepad.com
• awshub.com.br
Links
Questions?
Learn More: aws.amazon.com/dynamodb
Thank you!