Monitoring of SmartNews 2016/03/24 GREE Tech Talk #10



Self Introduction

• Nobutoshi Ogata

• Manager, Site Reliability Engineering

• @nobu666

• ❤ Whiskey, Cat, Heavy Metal

• Contract development (10y) ➡ GREE infrastructure division (3y) ➡ a startup (1y) ➡ SmartNews (2015/05-)

SmartNews

16,000,000+ downloads worldwide

Before Datadog

• We used:

• munin

• growthforecast

• cloudwatch

• We wanted centralized management!

After Datadog - Phase1

• OK, we can manage centrally

• But...?

• We respect our engineers' freedom to develop as they see fit!

• Problem: some resources end up without monitoring settings

Phase2

• Introduced Interferon

• Datadog DSL

• Well, we can monitor all resources automatically

• But...?

• Not actively maintained!

• Can't freely mute monitors from the Web UI

• Lack of flexibility

Phase3

• Integrated itamae

• Our engineers were used to writing Chef recipes

• Easy to override default settings

• It's asynchronous, so monitors can be muted freely from the Web UI

• Integrated dogaws @takus

• Yet another Datadog CloudWatch integration

• We use it in combination with itamae
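The "easy to override default settings" point can be illustrated with a minimal sketch. This is not the itamae setup from the talk: the default values, the key names and the `merge_settings` helper are all hypothetical, and the sketch only shows the general idea of shared defaults that each service overrides where needed.

```python
from copy import deepcopy

# Hypothetical shared defaults; keys and thresholds are illustrative only.
DEFAULT_MONITOR = {
    "type": "metric alert",
    "options": {"notify_no_data": False, "thresholds": {"critical": 90}},
    "tags": ["managed-by:provisioning"],
}


def merge_settings(defaults, overrides):
    """Return a new dict in which overrides win; nested dicts merge recursively."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


if __name__ == "__main__":
    # A service that only wants a stricter critical threshold.
    overrides = {"options": {"thresholds": {"critical": 75}}}
    print(merge_settings(DEFAULT_MONITOR, overrides))
```

Only the overridden key changes; every other default is inherited, which keeps per-service definitions short.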

Datadog tips

• Collect events and overlay them easily on graphs

• Provisioning

• Deploy

• etc
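As a rough sketch of how a deploy or provisioning step might emit such an event, the snippet below posts to Datadog's v1 events endpoint using only the standard library. The event title, text and tags, and the `build_event` and `post_event` helpers, are assumptions for illustration; the talk does not show how SmartNews sends its events.

```python
import json
import os
import urllib.request

EVENTS_URL = "https://api.datadoghq.com/api/v1/events"


def build_event(title, text, tags):
    """Build the JSON body for an event (hypothetical helper)."""
    return {"title": title, "text": text, "tags": list(tags)}


def post_event(event, api_key):
    """POST the event to Datadog and return the HTTP status code."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json", "DD-API-KEY": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Title, text and tags are made up for this example.
    event = build_event(
        "Deploy finished",
        "Deployed a new revision of the API servers",
        ["event_type:deploy", "env:production"],
    )
    api_key = os.environ.get("DD_API_KEY")
    if api_key:
        print(post_event(event, api_key))
    else:
        # Without an API key, just show the payload that would be sent.
        print(json.dumps(event, indent=2))
```

Tagging events consistently (for example by environment) makes it easier to filter them when overlaying them on a dashboard graph.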

Datadog tips

• Easy anomaly detection

• Until quite recently, comparisons were limited to 24 hours

• We requested the ability to compare over longer periods. Thanks to Datadog for implementing it!

• This is a private feature. If you want to use it, ask Datadog support

For example

• Compare Kinesis records count EWMA

pct_change(median(last_1h),1w_ago):ewma_20(avg:aws.kinesis.incoming_records{env:production,cost:smartnews} by {name}) > 50

• Compare application warn log

change(median(last_1h),1w_ago): sum:app.log.warn{env:production} by {autoscaling_group} > 25
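To make the arithmetic behind the first query concrete, here is a small offline sketch: smooth both windows with an EWMA, take the median of each, and alert when the percentage change exceeds the threshold. The smoothing factor 2 / (span + 1) and the order of operations are assumptions, not a description of how Datadog evaluates `ewma_20` or `pct_change` internally, and all function names are hypothetical.

```python
from statistics import median


def ewma(values, span):
    """Exponentially weighted moving average, assuming alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    smoothed = []
    for value in values:
        if not smoothed:
            smoothed.append(float(value))
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def pct_change(current, previous):
    """Percentage change from previous to current."""
    if previous == 0:
        raise ValueError("previous value is zero; percentage change is undefined")
    return (current - previous) / previous * 100.0


def should_alert(last_hour, same_hour_last_week, threshold_pct, span=20):
    """True when the smoothed median rose by more than threshold_pct percent."""
    now = median(ewma(last_hour, span))
    then = median(ewma(same_hour_last_week, span))
    return pct_change(now, then) > threshold_pct


if __name__ == "__main__":
    # Synthetic per-minute record counts: traffic doubled compared with last week.
    print(should_alert([200] * 60, [100] * 60, threshold_pct=50))
```

Comparing against the same hour one week earlier, rather than against the previous hour, keeps normal daily and weekly traffic cycles from triggering the alert.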

Talk more?

• Join our free lunch in Tokyo office !

• Ask me later

We're hiring!

Only two people on the Site Reliability Engineering team!

• SmartNews is recruiting Site Reliability Engineers!

• http://about.smartnews.com/en/careers/