© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Finding a home for your data in your serverless app
Marcilio Mendonca
Sr. Solutions Developer, Amazon Web Services
SVS223-R1
Why are we here today?
1) To discuss popular AWS database options for serverless applications
2) To discuss best practices for interacting with various databases from a serverless application
Agenda
• Amazon Relational Database Service (Amazon RDS)
• Amazon Aurora
• Amazon Aurora Serverless and the Data API
• Amazon DynamoDB
• Amazon Simple Storage Service (Amazon S3)
• Other data store options
Amazon RDS
• Relational databases (SQL)
• Cost-efficient and resizable capacity
• Automates hardware provisioning, database setup, patching and backups
• Supports: MySQL, PostgreSQL, MariaDB, Oracle, MS SQL Server
Amazon Aurora
• Relational database (SQL)
• MySQL and PostgreSQL-compatible
• Automates hardware provisioning, database setup, patching and backups
• Security, availability and reliability of commercial databases at 1/10th the cost
Amazon Aurora Serverless
• Same as Amazon Aurora but with no need to manage DB servers
• Simple, cost-effective option for infrequent, intermittent, or unpredictable workloads
• Supported by the Data API, which lets you run SQL statements through an API without managing connections
Amazon S3
• Object store service
• Massive scale
• Low latency, high throughput
• 99.999999999% durability, 99.99% availability
Amazon DynamoDB
• Key/value store
• Massive scale (horizontal)
• Schemaless (hash and sort keys)
• Serverless
• API-driven
• Multi-region (global tables)
Factors to consider when choosing a database technology
Must support SQL
• Amazon RDS, Amazon Aurora, Amazon Aurora Serverless, Amazon S3 (Athena, S3 Select)
Don’t want to change my existing SQL applications
• Amazon RDS, Amazon Aurora, Amazon Aurora Serverless
Need massive scale (hundreds of thousands of req/s) and/or a flexible schema
• Amazon DynamoDB, Amazon S3
Need to store and handle arbitrary objects (e.g., large media files)
• Amazon S3
Need to do event-driven data processing
• Amazon DynamoDB, Amazon S3
Don’t want to manage servers!
• Amazon Aurora Serverless, Amazon DynamoDB
I’m expecting infrequent, intermittent or unpredictable workload
• Amazon Aurora Serverless, Amazon DynamoDB
I don’t want to have to deal with database connections
• Amazon Aurora Serverless, Amazon DynamoDB
For further info on database options, please check:
FSI309 - Relational databases: Performance, scale and availability
DAT301 - Data modeling with Amazon DynamoDB in 60 minutes
DAT334 - Advanced design patterns for Amazon DynamoDB
DAT202 - What's new in Amazon Aurora
DAT355 - How to choose between Amazon Aurora MySQL and PostgreSQL
DAT309 - Amazon Aurora storage demystified: How it all works
DAT402 - Going deep on Amazon Aurora Serverless
DAT207 - What's new in Amazon RDS
DAT316 - MySQL options on AWS: Self-managed, managed and serverless
DAT317 - PostgreSQL options on AWS: Self-managed, managed and serverless
DAT336 - Process data using cloud databases and serverless technologies
Lambda Code: Initialization vs. Entry Point

import my_package
...

# Initialization code: executed once per execution environment provisioning
my_var_1 = "hello"
my_var_2 = "world"
...

# Execution entry point: executed once per request
def lambda_handler(event, context):
    try:
        ...  # Your code here
    except Exception as e:
        logger.error(f"Oops, something went wrong: {e}")
        raise e
...
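The split above can be observed locally without deploying anything. This sketch (pure Python, no AWS involved; the counters are illustrative) mimics how module-level code runs once while the handler runs per request:

```python
# A local simulation of the initialization-vs-entry-point split: module-level
# statements run once per "execution environment", the handler on every request.
init_count = 0
request_count = 0

# "Initialization code" - executed once, when the module is first loaded
init_count += 1

# "Execution entry point" - executed on every invocation
def lambda_handler(event, context):
    global request_count
    request_count += 1
    return {"init_count": init_count, "request_count": request_count}

# Three invocations reuse the same initialized module state
for _ in range(3):
    result = lambda_handler({}, None)

print(result)  # → {'init_count': 1, 'request_count': 3}
```

This is why expensive setup (database connections, SDK clients) belongs at module level rather than inside the handler.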
Amazon RDS: Connect to the database from AWS Lambda

import pymysql
...

# RDS settings
rds_host = "payrolldbinstance.c0zzsas12345.us-east-1.rds.amazonaws.com"
username = 'admin'
password = '12345678'
db_name = 'PayrollDB'
db_port = 3306
...

# Execution entry point
def lambda_handler(event, context):
    try:
        conn = pymysql.connect(rds_host, user=username,
                               passwd=password, db=db_name,
                               connect_timeout=5, port=db_port)
    except pymysql.MySQLError as e:
        logger.error(f"Could not connect to MySQL instance: {e}")
        raise e
    ...
Amazon RDS: No explicit password in your code!

import pymysql
...

# RDS settings
rds_host = "payrolldbinstance.c0zzsas12345.us-east-1.rds.amazonaws.com"
username = 'admin'
password = '12345678'  # <== NEVER DO THIS!
db_name = 'PayrollDB'
db_port = 3306
...

def lambda_handler(event, context):
    try:
        conn = pymysql.connect(rds_host, user=username,
                               passwd=password, db=db_name,
                               connect_timeout=5, port=db_port)
    except pymysql.MySQLError as e:
        logger.error(f"Could not connect to MySQL instance: {e}")
        raise e
    ...
Amazon RDS: Reuse the DB connections

...
try:
    conn = pymysql.connect(db_endpoint, user=username,
                           passwd=password, db=db_name,
                           connect_timeout=5, port=db_port)
except pymysql.MySQLError as e:
    logger.error(f"Could not connect to MySQL instance: {e}")
    raise e

# conn is created at initialization time and reused across invocations
def lambda_handler(event, context):
    with conn.cursor() as cur:
        ...
Amazon RDS: Consider using "lazy instantiation"

# Module-level database connection object
conn = None

# Returns the existing database connection, or creates one if needed
def get_database_connection():
    global conn
    if conn is None:
        conn = pymysql.connect(...)
    return conn

def input_is_valid(event):
    ...

def lambda_handler(event, context):
    if input_is_valid(event):
        conn = get_database_connection()
        ...
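The payoff of lazy instantiation can be checked locally. In this sketch (pure Python; `fake_connect` is a stand-in for `pymysql.connect`, not a real driver call), requests that fail validation never pay the cost of opening a connection:

```python
conn = None
connect_calls = 0

def fake_connect():
    # Stand-in for pymysql.connect(...); counts how often it actually runs
    global connect_calls
    connect_calls += 1
    return {"connected": True}

def get_database_connection():
    # Lazy instantiation: create the connection only on first use
    global conn
    if conn is None:
        conn = fake_connect()
    return conn

def lambda_handler(event, context):
    if not event.get("valid"):
        return "rejected"              # invalid input: no connection needed
    get_database_connection()
    return "ok"

assert lambda_handler({"valid": False}, None) == "rejected"
assert connect_calls == 0              # rejected request never connected
lambda_handler({"valid": True}, None)
lambda_handler({"valid": True}, None)
assert connect_calls == 1              # created once, then reused
```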
Amazon RDS: Consider using "singleton" connections

import functools

def singleton(func):
    singleton_obj = None
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal singleton_obj
        if singleton_obj is None:
            singleton_obj = func()
        return singleton_obj
    return wrapper

@singleton
def get_database_connection():
    conn = pymysql.connect(...)
    return conn

def input_is_valid(event):
    ...

def lambda_handler(event, context):
    if input_is_valid(event):
        conn = get_database_connection()
        ...
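The singleton decorator is plain Python, so it can be exercised without a database. Here the decorated factory returns a stand-in object instead of a real connection (the counter is mine, for illustration):

```python
import functools

def singleton(func):
    # Caches the first result of func() and returns it on every later call
    singleton_obj = None
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal singleton_obj
        if singleton_obj is None:
            singleton_obj = func()
        return singleton_obj
    return wrapper

calls = 0

@singleton
def get_database_connection():
    # Stand-in for pymysql.connect(...); counts how often it is really built
    global calls
    calls += 1
    return {"conn": "fake"}

a = get_database_connection()
b = get_database_connection()
assert a is b      # the same object comes back on every call
assert calls == 1  # the underlying factory ran exactly once
```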
Amazon RDS: Don't hardcode database parameters

import os
...

db_endpoint = os.environ['DB_ENDPOINT']
db_port = int(os.environ['DB_PORT'])
db_name = os.environ['DB_NAME']
db_employee_table = os.environ['DB_EMPLOYEE_TABLE']
db_username = os.environ['DB_USERNAME']
db_password = os.environ['DB_PASSWORD']

try:
    conn = pymysql.connect(db_endpoint, user=db_username,
                           passwd=db_password, db=db_name,
                           connect_timeout=5, port=db_port)
except pymysql.MySQLError as e:
    logger.error(f"Could not connect to MySQL instance: {e}")
    raise e

def lambda_handler(event, context):
    with conn.cursor() as cur:
        ...
Amazon RDS: Store DB credentials on AWS Secrets Manager

import boto3
...

# Secrets Manager settings
secret_name = "ExampleDBCredentials"  # You might also fetch this from an env. variable
secrets_client = boto3.client('secretsmanager')

# Get DB credentials from AWS Secrets Manager
try:
    secret_response = secrets_client.get_secret_value(SecretId=secret_name)
except Exception as e:
    raise e
else:
    secret_string = json.loads(secret_response["SecretString"])
    db_username = secret_string["user"]
    db_password = secret_string["password"]

# Initialize the database connection (runs only once when the Lambda env is provisioned)
try:
    conn = pymysql.connect(rds_host, user=db_username,
                           passwd=db_password, db=db_name,
                           connect_timeout=5, port=db_port)
    ...
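The shape of the `GetSecretValue` response can be illustrated locally with a sample payload (no call to Secrets Manager is made here; the credential values are placeholders):

```python
import json

# SecretString arrives as a JSON-encoded string inside the response dict
secret_response = {
    "SecretString": '{"user": "payroll_app", "password": "xxxxxxxx"}'
}

# Decode it once at initialization time, exactly as in the slide above
secret_string = json.loads(secret_response["SecretString"])
db_username = secret_string["user"]
db_password = secret_string["password"]

print(db_username)  # → payroll_app
```

Storing the secret as JSON with these key names is what makes the `json.loads` + key lookup pattern work.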
Amazon RDS: Running DDL and DML statements

def lambda_handler(event, context):
    with conn.cursor() as cur:
        try:
            # Create the Employee table
            cur.execute(f"CREATE TABLE IF NOT EXISTS {db_employee_table} "
                        "(EmpID int NOT NULL, Name varchar(255) NOT NULL, "
                        "PRIMARY KEY (EmpID))")
            # Insert employee data
            emp_id = random.randint(1, 100000)
            cur.execute(f"INSERT INTO {db_employee_table} (EmpID, Name) VALUES (%s, %s)",
                        (emp_id, f"Employee-{emp_id}"))
        except:
            conn.rollback()
            raise
        else:
            conn.commit()
            # Traverse all employees
            cur.execute(f"SELECT * FROM {db_employee_table}")
            rows = cur.fetchall()
            return format(rows)
Amazon Aurora: Connect to the database cluster
• You can connect to an Aurora DB cluster using the same tools that you use to connect to a MySQL or PostgreSQL database
• There are writer and reader (1+ read replicas) endpoints
Amazon Aurora: Use the Writer endpoint

...
db_w_endpoint = os.environ['DB_W_ENDPOINT']
...
try:
    wconn = pymysql.connect(db_w_endpoint, user=db_username,
                            passwd=db_password, db=db_name,
                            connect_timeout=5, port=db_port)
except pymysql.MySQLError as e:
    logger.error(f"Could not connect to MySQL instance: {e}")
    raise e

def lambda_handler(event, context):
    with wconn.cursor() as wcur:
        try:
            # Create the Employee table
            wcur.execute(f"CREATE TABLE IF NOT EXISTS {db_employee_table} "
                         "(EmpID int NOT NULL, Name varchar(255) NOT NULL, "
                         "PRIMARY KEY (EmpID))")
            # Insert employee data
            emp_id = random.randint(1, 100000)
            wcur.execute(f"INSERT INTO {db_employee_table} (EmpID, Name) VALUES (%s, %s)",
                         (emp_id, f"Employee-{emp_id}"))
        except:
            wconn.rollback()
            raise
        else:
            wconn.commit()
Amazon Aurora: Use the Reader endpoint

...
db_r_endpoint = os.environ['DB_R_ENDPOINT']
...
try:
    rconn = pymysql.connect(db_r_endpoint, user=db_username,
                            passwd=db_password, db=db_name,
                            connect_timeout=5, port=db_port)
except pymysql.MySQLError as e:
    logger.error(f"Could not connect to MySQL instance: {e}")
    raise e

def lambda_handler(event, context):
    with rconn.cursor() as rcur:
        try:
            # Traverse all employees
            rcur.execute(f"SELECT * FROM {db_employee_table}")
            rows = rcur.fetchall()
            return format(rows)
        except:
            raise
The Data API for Aurora Serverless
Interact with an Aurora Serverless cluster using an API!
https://aws.amazon.com/blogs/database/using-the-data-api-to-interact-with-an-amazon-aurora-serverless-mysql-database/
Amazon Aurora Serverless and the Data API
• Use the AWS CLI
• Or the various AWS SDKs: Node.js, Java, Python, PHP, Go, JavaScript, .NET, Ruby, C++
Amazon Aurora Serverless and the Data API: Python SDK (boto3)

import boto3

rds_client = boto3.client('rds-data')

cluster_arn = 'arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster'
secret_arn = 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret'

response = rds_client.execute_statement(
    resourceArn=cluster_arn,
    secretArn=secret_arn,
    database='mydb',
    sql='select * from employees limit 3')

print(response['records'])
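The Data API returns each row as a list of typed field dicts (e.g., `{'stringValue': 'Python'}`) rather than plain values. A small helper can flatten them; the helper name and scope are mine, not part of boto3, and this sketch assumes one typed key per field (NULL columns with `{'isNull': True}` and array types would need extra handling):

```python
def unwrap_records(records):
    # Flatten Data API records into plain Python values by taking the
    # single typed value out of each field dict.
    return [[next(iter(field.values())) for field in row] for row in records]

# Sample payload mimicking the shape of response['records']
sample = [
    [{"longValue": 1}, {"stringValue": "Python"}],
    [{"longValue": 2}, {"stringValue": "Go"}],
]
print(unwrap_records(sample))  # → [[1, 'Python'], [2, 'Go']]
```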
Amazon Aurora Serverless and the Data API: Wrap execute_statement()

def execute_statement(sql, sql_parameters=[], transaction_id=None):
    parameters = {
        'secretArn': db_credentials_secrets_store_arn,
        'database': database_name,
        'resourceArn': db_cluster_arn,
        'sql': sql,
        'parameters': sql_parameters
    }
    if transaction_id is not None:
        parameters['transactionId'] = transaction_id
    response = rds_client.execute_statement(**parameters)
    return response

response = execute_statement('select * from package')
Amazon Aurora Serverless and the Data API: Parameterized statements (prevent SQL injection attacks)

# Query data
sql = 'select * from package where package_name=:package_name'
package_name = 'Python'
sql_parameters = [{'name': 'package_name', 'value': {'stringValue': f'{package_name}'}}]
response = execute_statement(sql, sql_parameters)
print(response['records'])

# Insert data
sql = 'insert into package (package_name, package_version) values (:package_name, :package_version)'
sql_parameters = [
    {'name': 'package_name', 'value': {'stringValue': 'Python'}},
    {'name': 'package_version', 'value': {'stringValue': '3.7.0'}}
]
response = execute_statement(sql, sql_parameters)
print(f'Number of records updated: {response["numberOfRecordsUpdated"]}')
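Hand-writing those parameter entries gets verbose. A helper can build them from a plain dict; this is my own sketch, not part of boto3, and it only covers strings, ints, and bools (bool is tested before int because `bool` is a subclass of `int` in Python):

```python
def to_sql_parameters(params):
    # Build Data API parameter entries from a {name: value} dict
    entries = []
    for name, value in params.items():
        if isinstance(value, bool):
            field = {"booleanValue": value}
        elif isinstance(value, int):
            field = {"longValue": value}
        else:
            field = {"stringValue": str(value)}
        entries.append({"name": name, "value": field})
    return entries

print(to_sql_parameters({"package_name": "Python"}))
# → [{'name': 'package_name', 'value': {'stringValue': 'Python'}}]
```

The result can be passed straight to the `execute_statement` wrapper as `sql_parameters`.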
Amazon Aurora Serverless and the Data API: Wrap batch_execute_statement()

def batch_execute_statement(sql, sql_parameter_sets, transaction_id=None):
    parameters = {
        'secretArn': db_credentials_secrets_store_arn,
        'database': database_name,
        'resourceArn': db_cluster_arn,
        'sql': sql,
        'parameterSets': sql_parameter_sets
    }
    # Optional transaction support, mirroring the execute_statement() wrapper
    if transaction_id is not None:
        parameters['transactionId'] = transaction_id
    response = rds_client.batch_execute_statement(**parameters)
    return response
Amazon Aurora Serverless and the Data API: Batching inserts

sql = 'insert into package (package_name, package_version) values (:package_name, :package_version)'
sql_parameter_sets = []
for i in range(1, 101):
    entry = [
        {'name': 'package_name', 'value': {'stringValue': f'package{i}'}},
        {'name': 'package_version', 'value': {'stringValue': 'v1.0'}}
    ]
    sql_parameter_sets.append(entry)
response = batch_execute_statement(sql, sql_parameter_sets)
print(f'Number of records updated: {len(response["updateResults"])}')
Amazon Aurora Serverless and the Data API: Handling transactions

transaction = rds_client.begin_transaction(
    secretArn=db_credentials_secrets_store_arn,
    resourceArn=db_cluster_arn,
    database=database_name)
try:
    sql = 'insert into package (package_name, package_version) values (:package_name, :package_version)'
    sql_parameter_sets = []
    for i in range(package_start_idx, package_end_idx):
        entry = [
            {'name': 'package_name', 'value': {'stringValue': f'package-{i}'}},
            {'name': 'package_version', 'value': {'stringValue': 'version-1'}}
        ]
        sql_parameter_sets.append(entry)
    response = batch_execute_statement(sql, sql_parameter_sets, transaction['transactionId'])
except Exception as e:
    transaction_response = rds_client.rollback_transaction(
        secretArn=db_credentials_secrets_store_arn,
        resourceArn=db_cluster_arn,
        transactionId=transaction['transactionId'])
    raise
else:
    transaction_response = rds_client.commit_transaction(
        secretArn=db_credentials_secrets_store_arn,
        resourceArn=db_cluster_arn,
        transactionId=transaction['transactionId'])
Amazon DynamoDB: Create table

import boto3
...

# DynamoDB high-level client
ddb_client = boto3.resource('dynamodb')
# Read table name from environment
ddb_payroll_tablename = os.environ['DDB_TABLE_NAME']

def create_payroll_table():
    logger.info(f'Creating table {ddb_payroll_tablename}...')
    table = ddb_client.create_table(
        TableName=ddb_payroll_tablename,
        KeySchema=[{'AttributeName': 'emp_id', 'KeyType': 'HASH'}],
        AttributeDefinitions=[{'AttributeName': 'emp_id', 'AttributeType': 'S'}],
        ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}
    )
    # Wait until the table exists
    table.meta.client.get_waiter('table_exists').wait(TableName=ddb_payroll_tablename)
    return table.creation_date_time

def lambda_handler(event, context):
    create_payroll_table()
    ...
Amazon DynamoDB: Put and get item

...
# Payroll table object
ddb_payroll_table = ddb_client.Table(ddb_payroll_tablename)

def add_employee(emp_id, first_name, last_name):
    logger.info(f'Adding employee {emp_id} to table {ddb_payroll_tablename}...')
    ddb_payroll_table.put_item(
        Item={'emp_id': emp_id, 'first_name': first_name, 'last_name': last_name})

def get_employee(emp_id):
    logger.info(f'Searching employee {emp_id} on table {ddb_payroll_tablename}...')
    response = ddb_payroll_table.get_item(Key={'emp_id': emp_id})
    return response['Item']

def lambda_handler(event, context):
    add_employee('12345678', 'John', 'Smith')
    print(get_employee('12345678'))
Amazon DynamoDB and other DBs: Tracing with AWS X-Ray

import boto3
...
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all

# Only instrument libraries if not running locally
if "AWS_SAM_LOCAL" not in os.environ:
    patch_all()

...
ddb_client = boto3.resource('dynamodb', endpoint_url=ddb_endpoint_url)
...

def lambda_handler(event, context):
    ...  # Your code here

Example: Tracing DynamoDB PutItem calls (sequential vs. batch)
Amazon DynamoDB: Running locally
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBLocal.html

# Start DynamoDB locally
$ docker run -p 8000:8000 amazon/dynamodb-local

# Env variable we set and use in Lambda code when developing locally
$ export DDB_ENDPOINT_URL="http://localhost:8000"                 # if running as a Python script
$ export DDB_ENDPOINT_URL="http://docker.for.mac.localhost:8000"  # sam local invoke on a Mac

# Read table name from environment
ddb_payroll_tablename = os.environ['DDB_TABLE_NAME']
# Set the endpoint for local testing (e.g., http://localhost:8000), unset for production
ddb_endpoint_url = os.getenv('DDB_ENDPOINT_URL', None)
# DynamoDB high-level client (for the low-level client, use boto3.client('dynamodb') instead)
ddb_client = boto3.resource('dynamodb', endpoint_url=ddb_endpoint_url)
# Payroll table object
ddb_payroll_table = ddb_client.Table(ddb_payroll_tablename)
...
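The endpoint-selection logic itself can be tested without boto3 or Docker. This sketch (function name is mine) isolates the rule: when `DDB_ENDPOINT_URL` is set, the client targets DynamoDB local; when it is unset, `endpoint_url` is `None` and boto3 falls back to the real service:

```python
def resolve_ddb_endpoint(environ):
    # Mirrors os.getenv('DDB_ENDPOINT_URL', None) from the slide above,
    # but takes the environment as a dict so it can be tested in isolation.
    return environ.get("DDB_ENDPOINT_URL", None)

# Local development: environment variable set
assert resolve_ddb_endpoint({"DDB_ENDPOINT_URL": "http://localhost:8000"}) == "http://localhost:8000"
# Production: variable unset, boto3 uses the regional DynamoDB endpoint
assert resolve_ddb_endpoint({}) is None
```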
Amazon S3: Real-time image resizing example
Amazon S3: Configure Amazon S3 events using AWS SAM

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  CreateThumbnail:
    Type: AWS::Serverless::Function
    Properties:
      Handler: example-s3-resize-image.lambda_handler
      Runtime: python3.6
      Timeout: 60
      Policies: AWSLambdaExecute
      Events:
        ResizeImageEvent:
          Type: S3
          Properties:
            Bucket: !Ref SrcBucket
            Events: s3:ObjectCreated:*
Amazon S3: Resize images stored in Amazon S3

import uuid
import boto3
from urllib.parse import unquote_plus
from PIL import Image

s3_client = boto3.client('s3')

def resize_image(image_path, resized_path):
    with Image.open(image_path) as image:
        image.thumbnail(tuple(x / 2 for x in image.size))
        image.save(resized_path)

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = unquote_plus(record['s3']['object']['key'])
        download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
        upload_path = '/tmp/resized-{}'.format(key)
        s3_client.download_file(bucket, key, download_path)
        resize_image(download_path, upload_path)
        s3_client.upload_file(upload_path, '{}resized'.format(bucket), key)
https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example-deployment-pkg.html
AWS Bookstore Demo App
A simple bookstore serverless application making use of various purpose-built databases on AWS
• https://github.com/aws-samples/aws-bookstore-demo-app
re:Invent 2018: Databases on AWS: The Right Tool for the Right Job
• https://www.youtube.com/watch?v=-pb-DkD6cWg
• https://aws.amazon.com/blogs/database/building-a-modern-application-with-purpose-built-aws-databases/
I’d like to invite you to join me in one of these hands-on sessions:
SVS333-R - Build serverless APIs supported by Amazon Aurora Serverless and the Data API
• Wednesday, Dec 4, 4:00 PM - 5:00 PM – Mirage, Events Center C1 - Table 6
• Thursday, Dec 5, 3:15 PM - 4:15 PM – Aria, Level 1 West, Bristlecone 4 - Table 10
Learn serverless with AWS Training and Certification
Resources created by the experts at AWS to help you learn modern application development

Free, on-demand courses on serverless, including:
• Introduction to Serverless Development
• Getting into the Serverless Mindset
• AWS Lambda Foundations
• Amazon API Gateway for Serverless Applications
• Amazon DynamoDB for Serverless Architectures

Additional digital and classroom trainings cover modern application development and computing.

Visit the Learning Library at https://aws.training
Thank you!
Marcilio Mendonca
Amazon RDS: Create secret on AWS Secrets Manager
• Create a DB user/pass for your application in the database (e.g., MySQL)
• Add the DB user credentials to Secrets Manager
aws secretsmanager create-secret --name Apps/Payroll \
    --secret-string '{"user": "payroll_app", "password": "xxxxxxxx"}'
Amazon RDS: IAM database authentication
https://aws.amazon.com/premiumsupport/knowledge-center/users-connect-rds-iam/

Check the DB engines that support IAM database authentication
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["rds-db:connect"],
            "Resource": [
                "arn:aws:rds-db:us-east-1:1234567890:dbuser:db-ABCDEFGHIJKL01234/PayrollApp"
            ]
        }
    ]
}
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html#UsingWithRDS.IAMDBAuth.Availability
# RDS settings fetched from environment variables
db_endpoint = os.environ['DB_ENDPOINT']
db_port = int(os.environ['DB_PORT'])
db_name = os.environ['DB_NAME']
db_username = os.environ['DB_USERNAME']
db_employee_table = os.environ['DB_EMPLOYEE_TABLE']

# Use IAM credentials to connect to the MySQL database
rds_client = boto3.client('rds')
db_token = rds_client.generate_db_auth_token(db_endpoint, db_port, db_username)
try:
    conn = pymysql.connect(db_endpoint, user=db_username, passwd=db_token,
                           db=db_name, connect_timeout=5, port=db_port)
except pymysql.MySQLError as e:
    logger.error(f"Could not connect to MySQL instance: {e}")
    raise e
Amazon RDS: IAM database authentication
• AWS recommends the following when using the MySQL engine:
• Use IAM database authentication as a mechanism for temporary, personal access to databases
• Use IAM database authentication only for workloads that can be easily retried
• Don't use IAM database authentication if your application requires more than 256 new connections per second
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html#UsingWithRDS.IAMDBAuth.Availability
Amazon Aurora: Set up user/pass upon cluster provisioning
E.g., PostgreSQL quick start (AWS CloudFormation):
https://github.com/aws-samples/aws-aurora-cloudformation-samples/blob/master/cftemplates/Aurora-Postgres-DB-Cluster.yml
Amazon Aurora: Single endpoint for writers and readers?
Amazon DynamoDB: Use AWS SAM to create simple tables
https://github.com/awslabs/serverless-application-model
https://aws.amazon.com/serverless/sam/
DynamoDBEmployeeTable:
  Type: AWS::Serverless::SimpleTable
  Properties:
    TableName: Employee
    PrimaryKey:
      Name: emp_id
      Type: String
    ProvisionedThroughput:
      ReadCapacityUnits: 5
      WriteCapacityUnits: 5
    Tags:
      Department: Engineering
      AppType: Serverless
    SSESpecification:
      SSEEnabled: true
Other database options for your serverless application
Amazon ElastiCache
• Managed, Redis or Memcached-compatible in-memory data store
• Serverless apps can cache data for faster lookups (e.g., online auctions)
Amazon Redshift
• Fully managed, petabyte-scale data warehouse service in the cloud
• Serverless apps can be used to populate data into Amazon Redshift clusters and to query and aggregate data into reports
Amazon Elasticsearch Service (Amazon ES)
• Fully managed service to deploy, secure, and operate Amazon ES at scale
• Serverless apps can be used to ingest data into Amazon ES clusters (e.g., from Amazon S3)
Amazon Neptune
• Fast, reliable, fully managed graph database service
• Build serverless applications that make use of highly connected datasets