
[AWS Black Belt Online Seminar] Amazon Athena


Page 1:

© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

AWS Webinar https://amzn.to/JPWebinar | https://amzn.to/JPArchive


Makoto Shimura, Solutions Architect

2018/12/05

Amazon Athena

[AWS Black Belt Online Seminar]

Page 2:

⎼ Amazon Athena

⎼ AWS Glue

⎼ Amazon SageMaker

Page 3:

AWS Black Belt Online Seminar

(1) Click the speech bubble (2) Enter your question (3) Click Send

Twitter

#awsblackbelt

Page 4:

• 2018 12 05

AWS (http://aws.amazon.com)

• AWS

AWS

• AWS does not offer binding price quotes. AWS pricing is publicly available and is subject to change in accordance with the AWS Customer Agreement available at http://aws.amazon.com/agreement/. Any pricing information included in this document is provided only as an estimate of usage charges for AWS services based on certain information that you have provided. Monthly charges will be based on your actual use of AWS services, and may vary from the estimates provided.

Page 5:

Agenda

• Amazon Athena

• Amazon Athena

• Update & Tips

Page 6:

Amazon Athena

Amazon S3 SQL

Page 7:

Amazon Athena

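Serverless, with no infrastructure to manage

Fast queries, even over large datasets

Queries run directly against Amazon S3, with no prior data loading

Pay-as-you-go pricing based on the amount of data scanned

Integrates with BI tools and other systems via JDBC / ODBC / API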

Page 8:

Amazon Athena

DW

BI

Web

ETL

Page 9:

Presto:

• Amazon Athena

https://prestodb.io/overview.html

Reference: Presto architecture

Page 10:

AWS Glue Data Catalog

• Amazon Athena AWS Glue Data Catalog

• DB / Table / View / Partition

• Data Catalog Apache Hive Metastore

• AWS Glue Amazon Athena

Page 11:

AWS Glue Data Catalog

• AWS Glue Amazon Athena

AWS Glue Data Catalog

• Amazon Athena

https://docs.aws.amazon.com/ja_jp/athena/latest/ug/glue-faq.html

Page 12:

Amazon Athena

• AWS Glue S3 Crawler

• schema-on-read

-- Partition columns (year, month) are declared only in PARTITIONED BY,
-- not repeated in the column list.
CREATE EXTERNAL TABLE IF NOT EXISTS action_log (
  user_id string,
  action_category string,
  action_detail string
)
PARTITIONED BY (year int, month int)
STORED AS PARQUET
LOCATION 's3://athena-examples/action-log/'
TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY');

Page 13:

Amazon Athena

Primitive types: tinyint, smallint, int, bigint, boolean, float, double, string, binary, timestamp, decimal, date, varchar, char

array: array<data_type>

map: map<primitive_type, data_type>

struct *: struct<col_name: data_type>

union: UNIONTYPE<data_type, data_type…>

* https://aws.amazon.com/jp/blogs/big-data/create-tables-in-amazon-athena-from-nested-json-and-mappings-using-jsonserde/

Page 14:

Amazon Athena

SerDe

CSV: LazySimpleSerDe, OpenCSVSerDe

TSV: LazySimpleSerDe ('\t')

LazySimpleSerDe

JSON: HiveJSONSerDe, OpenXJsonSerDe

Apache Avro: AvroSerDe

ORC: ORCSerDe

Apache Parquet: ParquetSerDe

Logstash: Grok SerDe

Apache: RegexSerDe

CloudTrail: CloudTrailSerDe, OpenXJSONSerDe

https://docs.aws.amazon.com/ja_jp/athena/latest/ug/supported-format.html

Page 15:

Amazon Athena

SNAPPY Parquet

ZLIB ORC

GZIP 1GB

LZO

Page 16:

Amazon Athena

• Presto ANSI SQL

• WITH Window JOIN

• Presto 0.172

[ WITH with_query [, ...] ]
SELECT [ ALL | DISTINCT ] select_expression [, ...]
[ FROM from_item [, ...] ]
[ WHERE condition ]
[ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
[ HAVING condition ]
[ UNION [ ALL | DISTINCT ] union_query ]
[ ORDER BY expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...] ]
[ LIMIT [ count | ALL ] ]

Page 17:

Amazon Athena DDL /

• EXTERNAL TABLE S3 Athena

• 20 • DDL / 20

• VIEW CTAS

••

• UDF / UDAF

• DDL

http://docs.aws.amazon.com/athena/latest/ug/language-reference.html#unsupported-ddl

Page 18:

Amazon Athena API

• API 2

• API JDBC

• StartQueryExecution QueryExecutionId

• GetQueryExecution

• State SUCCEEDED GetQueryResults

Named Query API

BatchGetNamedQuery

CreateNamedQuery

DeleteNamedQuery

GetNamedQuery

ListNamedQueries

Query Execution API

BatchGetQueryExecution

GetQueryExecution

GetQueryResults

ListQueryExecutions

StartQueryExecution

StopQueryExecution
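
The asynchronous flow listed on this slide (StartQueryExecution returns a QueryExecutionId, GetQueryExecution is polled, and GetQueryResults is called once State is SUCCEEDED) can be sketched as follows. This is a minimal sketch, not the deck's own code: `run_query` is a hypothetical helper, it expects a client object with the method names of the boto3 Athena client, and `FakeAthenaClient` is a stand-in invented here so the example runs without AWS access. Result pagination is left out.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query(client, sql, output_location, poll_seconds=1.0, sleep=time.sleep):
    """Start a query, poll until it reaches a terminal state, return the rows.

    `client` is assumed to expose start_query_execution, get_query_execution
    and get_query_results, as the boto3 Athena client does.
    """
    started = client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"query {query_id} ended in state {state}")

    # Only the first page of results is fetched in this sketch.
    results = client.get_query_results(QueryExecutionId=query_id)
    return results["ResultSet"]["Rows"]


class FakeAthenaClient:
    """Hypothetical stand-in that replays a fixed sequence of states."""

    def __init__(self, states):
        self._states = list(states)

    def start_query_execution(self, **kwargs):
        return {"QueryExecutionId": "fake-id"}

    def get_query_execution(self, QueryExecutionId):
        # Consume states one by one; keep returning the last one.
        state = self._states.pop(0) if len(self._states) > 1 else self._states[0]
        return {"QueryExecution": {"Status": {"State": state}}}

    def get_query_results(self, QueryExecutionId):
        return {"ResultSet": {"Rows": [{"Data": [{"VarCharValue": "1"}]}]}}


rows = run_query(
    FakeAthenaClient(["QUEUED", "RUNNING", "SUCCEEDED"]),
    "SELECT 1",
    "s3://example-bucket/results/",  # hypothetical output location
    sleep=lambda seconds: None,      # skip real waiting in the demo
)
```

With a real client, the same function would be called with `boto3.client("athena")` in place of the fake; the output location and polling interval are placeholders to adjust.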

Page 19:

Page 20:

Athena

• OLTP (Online Transactional Processing) OLAP (Online Analytical Processing)

• ETL

• &

Page 21:

OLAP

• WHERE GROUP BY

• WHERE SELECT

SELECT
    col1
  , col2
  , COUNT(col3)
  , SUM(col3)
FROM
  table1
  INNER JOIN table2
    ON table1.id = table2.id
WHERE
  table1.id = table2.id
  AND col4 = 1
  AND col5 = 'good'  -- string literals take single quotes
GROUP BY
    col1
  , col2

Page 22:

• S3 CREATE TABLE

• WHERE

• 1 1,000,000 AWS Glue Data Catalog Amazon Athena 20,000

-- Partition columns (year, month, day) are declared only in PARTITIONED BY.
CREATE EXTERNAL TABLE IF NOT EXISTS action_log (
  user_id string,
  action_category string
)
PARTITIONED BY (year int, month int, day int)
STORED AS PARQUET
LOCATION 's3://athena-examples/action-log/'
TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY');

s3://athena-examples/action-log/year=2017/month=03/day=01/data01.gz

Page 23:

• WHERE

• “year/month/day”

SELECT
    month
  , action_category
  , action_detail
  , COUNT(user_id)
FROM
  action_log
WHERE
  year = 2016
  AND month >= 4
  AND month < 7
GROUP BY
    month
  , action_category
  , action_detail

Only the following Amazon S3 paths are read:

s3://athena-examples/action-log/year=2016/month=04/day=01/
s3://athena-examples/action-log/year=2016/month=04/day=02/
s3://athena-examples/action-log/year=2016/month=04/day=03/
...
s3://athena-examples/action-log/year=2016/month=06/day=30/
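
As a rough illustration of partition pruning, the sketch below lists the prefixes that a filter of `year = 2016 AND month >= 4 AND month < 7` would select, assuming one Hive-style `year=/month=/day=` prefix per calendar day under the bucket path shown on the slide. The function name `prefixes_for` is hypothetical, and this only models which prefixes match, not how Athena itself resolves partitions.

```python
import calendar

BASE = "s3://athena-examples/action-log"


def prefixes_for(year, first_month, last_month_exclusive):
    """List the daily partition prefixes for first_month <= month < last_month_exclusive."""
    paths = []
    for month in range(first_month, last_month_exclusive):
        days_in_month = calendar.monthrange(year, month)[1]
        for day in range(1, days_in_month + 1):
            paths.append(f"{BASE}/year={year}/month={month:02d}/day={day:02d}/")
    return paths


# month >= 4 AND month < 7 covers April, May and June only.
paths = prefixes_for(2016, 4, 7)
```

Every prefix outside this list (other years, or months 1-3 and 7-12) is skipped entirely, which is where the reduction in scanned data comes from.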

Page 24:

(Hive )col1=val1/col2=val2/

• CREATE TABLE MSCK REPAIR TABLE OK

• MSCK REPAIR TABLE 1 OK

val1/val2/

• MSCK REPAIR TABLE ALTER TABLE ADD PARTITION

ALTER TABLE ADD PARTITION

Page 25:

• OLAP

• ORC, Parquet

• 1

• OLTP

• TEXTFILE(CSV, TSV)

ORC data structure: https://orc.apache.org/docs/spec-intro.html

Page 26:

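[Figure: column-oriented vs. row-oriented data layout]

Improved I/O efficiency

• Combined with compression, I/O efficiency improves even further

• Data is laid out column by column

• Values in the same column tend to be similar, so they compress well

OLAP-style analytical queries run efficiently

• Most analytical queries use only a subset of the columns in a single query

• Simple statistics can be answered from metadata alone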
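
To make the I/O argument concrete, here is a toy sketch that sums one column under both layouts and counts how many values each approach has to touch. The sample records and function names are invented for illustration; real formats such as Parquet and ORC add encoding, compression and metadata on top of this idea.

```python
# Hypothetical sample records, used only for this illustration.
rows = [
    {"user_id": "u1", "action_category": "view", "amount": 10},
    {"user_id": "u2", "action_category": "click", "amount": 20},
    {"user_id": "u3", "action_category": "view", "amount": 30},
]


def to_columns(records):
    """Pivot a list of records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_oriented(records, name):
    """Row layout: every field of every record is read to reach one column."""
    total = 0
    values_read = 0
    for record in records:
        values_read += len(record)
        total += record[name]
    return total, values_read


def sum_column_oriented(columns, name):
    """Column layout: only the values of the requested column are read."""
    values = columns[name]
    return sum(values), len(values)


row_result = sum_row_oriented(rows, "amount")
column_result = sum_column_oriented(to_columns(rows), "amount")
```

Both calls return the same total, but the row-oriented version touches every field of every record while the column-oriented version touches only the `amount` values.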

Page 27:

&

• CREATE

• &

SELECT
    month
  , action_category
  , COUNT(action_category)
FROM
  action_log
WHERE
  year = 2016
  AND month >= 4
  AND month < 7
GROUP BY
    month
  , action_category

-- Partition columns (year, month, day) are declared only in PARTITIONED BY.
CREATE EXTERNAL TABLE IF NOT EXISTS action_log (
  user_id string,
  action_category string,
  action_detail string
)
PARTITIONED BY (year int, month int, day int)
STORED AS PARQUET
LOCATION 's3://athena-examples/action-log/'
TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY');

Page 28:

Parquet & Snappy

CREATE EXTERNAL TABLE IF NOT EXISTS elb_logs_pq (
  request_timestamp string,
  elb_name string,
  request_ip string,
  request_port int,
  user_agent string,
  ssl_cipher string,
  ssl_protocol string
)
PARTITIONED BY (year int, month int, day int)
STORED AS PARQUET
LOCATION 's3://athena-examples/elb/parquet/'
TBLPROPERTIES("parquet.compress"="SNAPPY");

Regex & Raw file

CREATE EXTERNAL TABLE IF NOT EXISTS elb_logs_raw (
  request_timestamp string,
  elb_name string,
  request_ip string,
  request_port int,
  backend_ip string,
  user_agent string,
  ssl_cipher string,
  ssl_protocol string
)
PARTITIONED BY(year string, month string, day string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = '1',
  'input.regex' = '([^ ]*) ([^ ]*) ([^ ]*):… ([A-Za-z0-9.-]*)$'
)
LOCATION 's3://athena-examples/elb/raw/';

Page 29:

https://aws.amazon.com/jp/blogs/news/analyzing-data-in-s3-using-amazon-athena/

SELECT elb_name, uptime, downtime,
       cast(downtime as double)/cast(uptime as double) uptime_downtime_ratio
FROM (SELECT elb_name,
        sum(case elb_response_code WHEN '200' THEN 1 ELSE 0 end) AS uptime,
        sum(case elb_response_code WHEN '404' THEN 1 ELSE 0 end) AS downtime
      FROM elb_logs_pq GROUP BY elb_name
)

Dataset             Size on S3   Query run time   Data scanned   Cost
Regex & Raw file    1TB          236s             1.15TB         $5.75
Parquet & Snappy    130GB        6.78s            2.51GB         $0.013
Savings / Speedup   87%          34x faster       99%            99.7%
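
The savings row can be re-derived from the two measurement rows. The sketch below does that arithmetic, assuming the $5 per TB scanned price quoted later in the deck and decimal units (1 TB = 1000 GB) for the size comparison; the variable names are invented, and small differences from the slide's figures come down to rounding.

```python
PRICE_PER_TB_USD = 5.0  # price per TB scanned, as quoted in the deck

# Figures taken from the comparison on this slide.
raw = {"size_gb": 1000.0, "seconds": 236.0, "scanned_tb": 1.15}
parquet = {"size_gb": 130.0, "seconds": 6.78, "scanned_tb": 2.51 / 1000}


def query_cost(scanned_tb):
    """Cost in USD for a given amount of scanned data, ignoring rounding rules."""
    return scanned_tb * PRICE_PER_TB_USD


size_saving = 1 - parquet["size_gb"] / raw["size_gb"]
speedup = raw["seconds"] / parquet["seconds"]
scan_saving = 1 - parquet["scanned_tb"] / raw["scanned_tb"]
cost_saving = 1 - query_cost(parquet["scanned_tb"]) / query_cost(raw["scanned_tb"])

print(f"size saving: {size_saving:.1%}")
print(f"speedup:     {speedup:.1f}x")
print(f"scan saving: {scan_saving:.1%}")
print(f"cost saving: {cost_saving:.1%}")
```

Because cost is proportional to the data scanned, the cost saving tracks the scan reduction almost exactly; the shorter run time is a separate benefit.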

Page 30:

Amazon Athena Top 10

(1)

1.

WHERE

2.

IO

3.

128 MB

4.

Parquet / ORC

5. ORDER BY

LIMIT ORDER BY

https://aws.amazon.com/jp/blogs/news/top-10-performance-tuning-tips-for-amazon-athena/

Page 31:

Amazon Athena Top 10

(2)

6. JOIN

7. GROUP BY

8. LIKE

LIKE RegEx

9.

COUNT DISTINCT APPROX_DISTINCT()

10.

* SELECT

https://aws.amazon.com/jp/blogs/news/top-10-performance-tuning-tips-for-amazon-athena/

Page 32:

Amazon Athena

Page 33:

CASE 1:

• Amazon S3 CSV

• CTAS Parquet

• VIEW

• Amazon QuickSight BI

S3 BI

Page 34:

CTAS (create table as select)

• /

CREATE TABLE new_table
WITH (
  format = 'PARQUET',
  external_location = 's3://my_athena_results/new_table/',
  partitioned_by = ARRAY['year', 'month', 'day'],
  bucketed_by = ARRAY['user_id', 'category'],
  bucket_count = 3
)
AS SELECT *
FROM old_table;

Page 35:

CTAS

• ARRAY['key1', 'key2', 'key3'] s3://MYBUCKET/key1=xxx/key2=yyy/key3=zzz

• CTAS 100

• WHERE

• 128-512MB

Page 36:

VIEW

• VIEW

• VIEW VIEW

• AWS Glue Data Catalog

CREATE VIEW my_view AS
SELECT
  col1,
  col2 / 100 AS col2_percent
FROM table1
INNER JOIN table2
  ON table1.id = table2.id;

Page 37:

ESRI Java Geometry Library

• point line multiline

• distance equals overlaps

• 2

ST_CONTAINS, ST_POINT

SELECT counties.name, COUNT(*) cnt
FROM counties
CROSS JOIN earthquakes
WHERE ST_CONTAINS (
  counties.boundaryshape,
  ST_POINT(earthquakes.longitude, earthquakes.latitude))
GROUP BY counties.name
ORDER BY cnt DESC

https://docs.aws.amazon.com/ja_jp/athena/latest/ug/geospatial-query-what-is.html

Page 38:

Amazon QuickSight Amazon Athena

• Amazon QuickSight JDBC/ODBC GUI

Amazon Athena

• Amazon Athena

• Amazon Athena SPICE

Amazon Athena

https://docs.aws.amazon.com/ja_jp/quicksight/latest/user/create-a-data-set-athena.html

Page 39:

JDBC / ODBC BI

• BI SQL

• IAM API

•IAM

athena:GetQueryResultsStream

• JDBC 2.0.5

• ODBC: 1.0.3

https://docs.aws.amazon.com/ja_jp/athena/latest/ug/athena-bi-tools-jdbc-odbc.html

URL: jdbc:awsathena://athena.${REGION}.amazonaws.com:443
Username: $AWS_ACCESS_KEY_ID
Password: $AWS_SECRET_ACCESS_KEY

Page 40:

JDBC / ODBC MS Active Directory

• JDBC / ODBC Active Directory

IAM Athena

https://docs.aws.amazon.com/athena/latest/ug/access-federation-saml.html

URL: jdbc:awsathena://athena.${REGION}.amazonaws.com:443
AwsCredentialProviderClass: com.simba.athena.iamsupport.plugin.AdfsCredentialsProvider
Idp_host: example.adfs.server
Idp_port: 223
UID: $YOUR_AD__UID
PWD: $YOUR_AD_PASSWORD
Preferred_role: arn:aws:iam::$YOUR_AWS_ACCOUNT:role/$IAMUSER_NAME

Page 41:

CASE 2: ETL Amazon Athena

• JSON

• CTAS Parquet

• AWS Glue Data Catalog

S3

C

A

B

A

B

C

Page 42:

CTAS

Amazon Athena INSERT INTO

INSERT INTO

1. CREATE EXTERNAL TABLE: define s3://MY_BUCKET/service_a/ as table_service_a

2. CTAS: create a temporary table tmp_table_service_a with

• external_location = s3://MY_BUCKET/service_a/year=2018/month=12/day=01

• bucketed_by = ARRAY['high_cardinality_col']

• bucket_count = XX

3. Add the new location to table_service_a with ALTER TABLE ADD PARTITION

4. Remove tmp_table_service_a with DROP TABLE

5. Repeat steps 2-4
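
Steps 2-4 above lend themselves to being generated per day. The sketch below builds the three statements for one date. It is an illustration only: `insert_into_statements` is a hypothetical helper, the source query and bucket count are placeholders, and it assumes the partition columns are integers named year, month and day and that dropping the temporary table leaves the files under external_location in place.

```python
from datetime import date


def insert_into_statements(day, select_sql, bucket_count=10):
    """Build the CTAS, ADD PARTITION and DROP TABLE statements for one day."""
    location = (
        f"s3://MY_BUCKET/service_a/"
        f"year={day:%Y}/month={day:%m}/day={day:%d}/"
    )
    # Step 2: write the day's data under the target table's prefix.
    ctas = (
        "CREATE TABLE tmp_table_service_a WITH (\n"
        "  format = 'PARQUET',\n"
        f"  external_location = '{location}',\n"
        "  bucketed_by = ARRAY['high_cardinality_col'],\n"
        f"  bucket_count = {bucket_count}\n"
        f") AS {select_sql};"
    )
    # Step 3: register that prefix as a partition of the target table.
    add_partition = (
        "ALTER TABLE table_service_a ADD PARTITION "
        f"(year={day.year}, month={day.month}, day={day.day}) "
        f"LOCATION '{location}';"
    )
    # Step 4: drop only the temporary table definition.
    drop = "DROP TABLE tmp_table_service_a;"
    return [ctas, add_partition, drop]


# raw_service_a is a hypothetical source table name.
statements = insert_into_statements(date(2018, 12, 1), "SELECT * FROM raw_service_a")
```

Running the three statements in order for each new day gives the effect of appending that day's data to table_service_a.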

Page 43:

AWS Glue Data Catalog resource-based policy

• AWS Glue Data Catalog IAM

• Amazon Athena

• AWS Glue Data Catalog Amazon S3 Amazon S3

• S3 Amazon Athena Amazon S3 API

Page 44:

Case 3:

• Amazon Kinesis Data Firehose

• Amazon S3 Parquet

Page 45:

Amazon Kinesis Data Firehose Parquet

• S3 Amazon Kinesis Data Firehose

• 2018/5/10 Parquet / ORC

• AWS Lambda

•Amazon Athena

Page 46:

ALTER TABLE ADD PARTITION

• Amazon Athena

• 10

ALTER TABLE kdf_log ADD PARTITION (year=2018, month=12, day=5) LOCATION 's3://MY_KDF_BUCKET/2018/12/05/';
ALTER TABLE kdf_log ADD PARTITION (year=2018, month=12, day=6) LOCATION 's3://MY_KDF_BUCKET/2018/12/06/';
ALTER TABLE kdf_log ADD PARTITION (year=2018, month=12, day=7) LOCATION 's3://MY_KDF_BUCKET/2018/12/07/';
...
ALTER TABLE kdf_log ADD PARTITION (year=2025, month=12, day=31) LOCATION 's3://MY_KDF_BUCKET/2025/12/31/';
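
Writing these statements by hand for years of dates is error-prone, so they can be generated instead. This is a minimal sketch: `add_partition_statements` is a hypothetical helper, it assumes the YYYY/MM/DD prefix layout shown on the slide, and it adds IF NOT EXISTS (not on the slide) so that re-running a batch does not fail on partitions that are already registered.

```python
from datetime import date, timedelta


def add_partition_statements(table, bucket, start, end):
    """Return one ALTER TABLE ... ADD PARTITION statement per day in [start, end]."""
    statements = []
    day = start
    while day <= end:
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION "
            f"(year={day.year}, month={day.month}, day={day.day}) "
            f"LOCATION 's3://{bucket}/{day:%Y/%m/%d}/';"
        )
        day += timedelta(days=1)
    return statements


stmts = add_partition_statements(
    "kdf_log", "MY_KDF_BUCKET", date(2018, 12, 5), date(2018, 12, 7)
)
for stmt in stmts:
    print(stmt)
```

The partition values and the location are derived from the same date object, which avoids mismatches between the two.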

Page 47:

Update & Tips

Page 48:

Athena Workgroups

https://www.slideshare.net/AmazonWebServices/amazon-athena-whats-new-and-how-sendgrid-innovates-ant324-aws-reinvent-2018

Page 49:

Athena

• Athena

5

https://www.slideshare.net/AmazonWebServices/amazon-athena-whats-new-and-how-sendgrid-innovates-ant324-aws-reinvent-2018

Page 50:

• Amazon S3

• Amazon S3

• 3

• SSE-S3

• SSE-KMS

• CSE-KMS

https://docs.aws.amazon.com/ja_jp/athena/latest/ug/encryption.html

Page 51:

AWS

Amazon Athena S3 AWS

• AWS CloudTrail

• Amazon CloudFront

• Elastic Load Balancing (ALB/CLB)

• Amazon VPC

• AWS Cost and Usage Reports

• AWS Systems Manager

https://docs.aws.amazon.com/ja_jp/athena/latest/ug/querying-AWS-service-logs.html

https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/athena.html

https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-inventory-query.html

Page 52:

Amazon Athena 1

• : 20 (DDL 20 )

• : 30

• 1 API

API name: calls per second (burst value)

• BatchGetNamedQuery, BatchGetQueryExecution, ListNamedQueries, ListQueryExecutions: 5 (up to 10)

• CreateNamedQuery, DeleteNamedQuery, GetNamedQuery, StartQueryExecution, StopQueryExecution: 5 (up to 20)

• GetQueryExecution, GetQueryResults: 25 (up to 50)

https://docs.aws.amazon.com/athena/latest/ug/service-limits.html

Page 53:

Amazon Glue 1

• : 10,000

• : 100,000

• : 1,000,000

• : 1,000,000

• : 10,000,000

Page 54:

• S3 1TB 5$

• 10MB 10MB

• MB

• S3

• DDL

• Amazon S3 S3

• Amazon S3 API S3

• AWS Glue Data Catalog
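
The pricing bullets on this slide mention 5$ per 1TB scanned from S3 and a 10MB figure. The sketch below estimates a per-query charge under the assumption that this refers to the rule Athena documented at the time: bytes scanned are rounded up to the nearest megabyte, with a 10 MB minimum per query. Binary units (1 TB = 1024^4 bytes) and the function names are also assumptions made for this illustration; S3 request and transfer charges are not modelled.

```python
import math

MB = 1024 ** 2  # assumption: binary megabyte
TB = 1024 ** 4  # assumption: binary terabyte
PRICE_PER_TB_USD = 5.0
MINIMUM_BILLED_BYTES = 10 * MB


def billed_bytes(scanned_bytes):
    """Round scanned bytes up to a whole MB and apply the 10 MB minimum."""
    rounded = math.ceil(scanned_bytes / MB) * MB
    return max(rounded, MINIMUM_BILLED_BYTES)


def query_cost_usd(scanned_bytes):
    """Estimated charge for one successful query that scanned the given bytes."""
    return billed_bytes(scanned_bytes) / TB * PRICE_PER_TB_USD


tiny_query = billed_bytes(1)             # billed as the 10 MB minimum
just_over = billed_bytes(10 * MB + 1)    # rounded up to 11 MB
one_tb_cost = query_cost_usd(TB)
```

Under these assumptions, many tiny queries each pay for 10 MB, so batching small lookups and reducing scanned bytes through partitioning and columnar formats both lower the bill.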

Page 55:

Amazon Athena

Page 56:

Page 57:

• Amazon Athena Amazon S3 SQL

• AWS Glue Amazon Kinesis Data Firehose

Page 58:

Q&A

AWS Japan Blog https://aws.amazon.com/jp/blogs/news/

Page 59:

AWS Black Belt Online Seminar

Scheduled for December

12/11 (Tue) 12:00-13:00 AWS Well-Architected ( )

12/18 (Tue) 12:00-13:00 Amazon Sumerian

12/19 (Wed) 18:00-19:00 AWS Certificate Manager

12/25 (Tue) 12:00-13:00 Amazon DynamoDB Advanced Design Pattern

Register: https://amzn.to/JPWebinar

Page 60: