32
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Susan Chan Senior Product Manager, Amazon S3 September 2016 Deep Dive on Amazon S3

Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

  • Upload
    others

  • View
    11

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Susan Chan Senior Product Manager, Amazon S3

September 2016

Deep Dive on Amazon S3

Page 2: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Recent innovations on S3

Visibility & control of your data

New storage offering

More data ingestion options

• Standard - Infrequent Access

• Amazon CloudWatch integration

• AWS CloudTrail integration • New lifecycle policies • Event notifications • Bucket limit increases • Read-after-write consistency

• IPv6 support • AWS Snowball (80 TB) • S3 Transfer Acceleration • Amazon Kinesis Firehose • Partner integration

Page 3: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Choice of storage classes on S3

Standard

Active data Archive data Infrequently accessed data

Standard - Infrequent Access Amazon Glacier

Page 4: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

File sync and share +

consumer file storage

Backup and archive + disaster recovery

Long-retained data

Use cases for Standard-Infrequent Access

Page 5: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Designed for 11 9s of durability

Standard - Infrequent Access storage

Designed for 99.9% availability

Durable Available Same as Standard storage

High performance

• Bucket policies • AWS Identity and Access

Management (IAM) policies • Many encryption options

Secure • Lifecycle management • Versioning • Event notifications • Metrics

Integrated • No impact on user

experience • Simple REST API

Easy to use

Page 6: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

- Directly PUT to Standard - IA

- Transition Standard to Standard - IA

- Transition Standard - IA to Amazon Glacier storage

- Expiration lifecycle policy

- Versioning support

Standard - Infrequent Access storage Integrated: Lifecycle management

Standard - Infrequent Access

Page 7: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Transition older objects to Standard - IA

Page 8: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Lifecycle policy

Standard Storage -> Standard - IA

<LifecycleConfiguration>

<Rule>

<ID>sample-rule</ID>

<Prefix>documents/</Prefix>

<Status>Enabled</Status>

<Transition>

<Days>30</Days>

<StorageClass>STANDARD-IA</StorageClass>

</Transition>

<Transition>

<Days>365</Days>

<StorageClass>GLACIER</StorageClass>

</Transition>

</Rule>

</LifecycleConfiguration>

Standard - Infrequent Access storage

Page 9: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Standard Storage -> Standard - IA

<LifecycleConfiguration>

<Rule>

<ID>sample-rule</ID>

<Prefix>documents/</Prefix>

<Status>Enabled</Status>

<Transition>

<Days>30</Days>

<StorageClass>STANDARD-IA</StorageClass>

</Transition>

<Transition>

<Days>365</Days>

<StorageClass>GLACIER</StorageClass>

</Transition>

</Rule>

</LifecycleConfiguration>

Standard - IA Storage -> Amazon Glacier

Standard - Infrequent Access storage

Lifecycle policy

Page 10: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

S3 support for IPv6

Dual-stack endpoints support both IPv4 and IPv6 Same high performance Integrated with most S3 features Manage access with IPv6 addresses Easy to adopt, just change your endpoint. No additional charges

Page 11: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

IPv6 - Getting started

Update your endpoint to • virtual hosted style address http://bucketname.s3.dualstack.aws-region.amazonaws.com Or • path style address http://s3.dualstack.aws-region.amazonaws.com/bucketname

Page 12: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Restricting access by IP addresses

{ "Version": "2012-10-17", "Id": "S3PolicyId1", "Statement": [ { "Sid": "IPAllow", "Effect": "Allow", "Principal": "*", "Action": "s3:*", "Resource": "arn:aws:s3:::examplebucket/*", "Condition": { "IpAddress": {"aws:SourceIp": "54.240.143.0/24"} "NotIpAddress": {"aws:SourceIp": "54.240.143.188/32"} } } ] }

Bucket policy with IPv4

Page 13: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Updating bucket policy with IPv6 { "Version": "2012-10-17", "Id": "S3PolicyId1", "Statement": [ { "Sid": "IPAllow", "Effect": "Allow", "Principal": "*", "Action": "s3:*", "Resource": "arn:aws:s3:::examplebucket/*", "Condition": { "IpAddress": "aws:SourceIp": [ "54.240.143.0/24", "2001:DB8:1234:5678::/64" ]} "NotIpAddress": {"aws:SourceIp": ["54.240.143.128/30", "2001:DB8:1234:5678:ABCD::/80”]}}]}

Page 14: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Data ingestion into S3

Page 15: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

S3 Transfer Acceleration

S3 Bucket AWS Edge Location

Uploader

Optimized Throughput!

Typically 50%-400% faster

Change your endpoint, not your code

No firewall exceptions

No client software required

59 global edge locations

Page 16: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Rio DeJaneiro

Warsaw New York Atlanta Madrid Virginia Melbourne Paris LosAngeles

Seattle Tokyo Singapore

Tim

e [h

rs.]

500 GB upload from these edge locations to a bucket in Singapore

Public Internet

How fast is S3 Transfer Acceleration?

S3 Transfer Acceleration

Page 17: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Getting started

1. Enable S3 Transfer Acceleration on your S3 bucket.

2. Update your endpoint to <bucket-name>.s3-accelerate.amazonaws.com.

3. Done!

Page 18: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

How much will it help me? s3speedtest.com

Page 19: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Tip: Parallelizing PUTs with multipart uploads

• Increase aggregate throughput by parallelizing PUTs on high-bandwidth networks • Move the bottleneck to the network,

where it belongs

• Increase resiliency to network errors; fewer large restarts on error-prone networks

Best Practice

Page 20: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Incomplete multipart upload expiration policy

• Partial upload does incur storage charges

• Set a lifecycle policy to automatically make incomplete multipart uploads expire after a predefined number of days

Incomplete multipart upload expiration

Best Practice

Page 21: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Enable policy with the AWS Management Console

Page 22: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Example lifecycle policy

<LifecycleConfiguration>

<Rule>

<ID>sample-rule</ID> <Prefix>MyKeyPrefix/</Prefix>

<Status>rule-status</Status> <AbortIncompleteMultipartUpload> <DaysAfterInitiation>7</DaysAfterInitiation> </AbortIncompleteMultipartUpload> </Rule>

</LifecycleConfiguration>

Or enable a policy with the API

Page 23: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Tip #1: Use versioning

• Protects from accidental overwrites and deletes

• New version with every upload

• Easy retrieval of deleted objects and roll back to previous versions

Best Practice

Versioning

Page 24: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Tip #2: Use lifecycle policies

• Automatic tiering and cost controls • Includes two possible actions:

• Transition: archives to Standard - IA or Amazon Glacier based on object age you specified

• Expiration: deletes objects after specified time

• Actions can be combined • Set policies at the bucket or prefix level • Set policies for current version or non-

current versions Lifecycle policies

Page 25: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Versioning + lifecycle policies

Page 26: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Expired object delete marker policy

• Deleting a versioned object makes a delete marker the current version of the object

• Removing expired object delete marker can improve list performance

• Lifecycle policy automatically removes the current version delete marker when previous versions of the object no longer exist

Expired object delete marker

Page 27: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Enable policy with the console

Insert console screen shot

Page 28: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Tip #3: Restrict deletes

• Bucket policies can restrict deletes • For additional security, enable MFA (multi-factor

authentication) delete, which requires additional authentication to: • Change the versioning state of your bucket • Permanently delete an object version

• MFA delete requires both your security credentials and a code from an approved authentication device

Best Practice

Page 29: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

<my_bucket>/2013_11_13-164533125.jpg <my_bucket>/2013_11_13-164533126.jpg <my_bucket>/2013_11_13-164533127.jpg <my_bucket>/2013_11_13-164533128.jpg <my_bucket>/2013_11_12-164533129.jpg <my_bucket>/2013_11_12-164533130.jpg <my_bucket>/2013_11_12-164533131.jpg <my_bucket>/2013_11_12-164533132.jpg <my_bucket>/2013_11_11-164533133.jpg <my_bucket>/2013_11_11-164533134.jpg <my_bucket>/2013_11_11-164533135.jpg <my_bucket>/2013_11_11-164533136.jpg

Use a key-naming scheme with randomness at the beginning for high TPS • Most important if you regularly exceed 100 TPS on a bucket • Avoid starting with a date • Consider adding a hash or reversed timestamp (ssmmhhddmmyy)

Don’t do this…

Tip #4: Distribute key names

Page 30: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Distributing key names

Add randomness to the beginning of the key name…

<my_bucket>/521335461-2013_11_13.jpg <my_bucket>/465330151-2013_11_13.jpg <my_bucket>/987331160-2013_11_13.jpg <my_bucket>/465765461-2013_11_13.jpg <my_bucket>/125631151-2013_11_13.jpg <my_bucket>/934563160-2013_11_13.jpg <my_bucket>/532132341-2013_11_13.jpg <my_bucket>/565437681-2013_11_13.jpg <my_bucket>/234567460-2013_11_13.jpg <my_bucket>/456767561-2013_11_13.jpg <my_bucket>/345565651-2013_11_13.jpg <my_bucket>/431345660-2013_11_13.jpg

Page 31: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -

Remember to complete your evaluations!

Page 32: Deep Dive on Amazon S3 · Deep Dive on Amazon S3 . Recent innovations on S3. Visibility & control of your data . New storage offering . More data ingestion options • Standard -