Monitoring in an IAC Age, PuppetConf 2013, Kris Buytaert

Monitoring in an Infrastructure as Code Age

My PuppetConf 2013 Talk August 23, 2013 San Francisco

Page 1: Monitoring in an Infrastructure as Code Age

Monitoring in an IAC Age

PuppetConf 2013

Kris Buytaert

Page 2: Monitoring in an Infrastructure as Code Age

Kris Buytaert
● I used to be a Dev,
● Then Became an Op
● Chief Trolling Officer and Open Source Consultant @inuits.eu
● Everything is an effing DNS Problem
● Building Clouds since before the bookstore
● Some books, some papers, some blogs
● Evangelizing devops

Page 3: Monitoring in an Infrastructure as Code Age

devops = clams
● Culture
● (Lean)
● Automate all the things ...
    • Build Automation
    • Test Automation
    • IAC
● Monitoring, Metrics ...
● Sharing

Page 4: Monitoring in an Infrastructure as Code Age

Monitoring is usually an afterthought

ENOBUDGET, ENOTIME

Page 5: Monitoring in an Infrastructure as Code Age

#monitoringsucks
● John Vincent (@lusis)
● A sub movement
● https://github.com/monitoringsucks/

Page 6: Monitoring in an Infrastructure as Code Age

#monitoringlove
• #monitoringlove hacksessions
• #monitorama

Page 7: Monitoring in an Infrastructure as Code Age

Infrastructure as Code
● Model our infrastructure
● A fast reproducible platform
● Disaster recovery for “free”

Page 8: Monitoring in an Infrastructure as Code Age

For years we've tolerated humans making structural manual changes to the infrastructure our critical applications are running on.

Whilst at the same time demanding that those critical applications go through rigid test scenarios.

Who let this happen?

Page 9: Monitoring in an Infrastructure as Code Age

Infrastructure as Code
● Code = Code
● Version Control
● Quality Checks
● Testing
● Continuous Integration
● Continuous Delivery

Page 10: Monitoring in an Infrastructure as Code Age

Infrastructure as Code
● Core Infrastructure
● Middleware deployment and integration
● Automated continuous application deployment
● Integrated Security enforcement
● Host, Service and Application Monitoring configured

Page 11: Monitoring in an Infrastructure as Code Age

Why #monitoringsucks
● Manual config (gui)
● Not in sync with reality
● Hosts only
● Services sometimes
● Application never
● Chaos

Page 12: Monitoring in an Infrastructure as Code Age

Let's forget about
● Tools with no (stable) API
● Tools with strong focus on GUI
● Unless you are an SME with < 100 nodes
● Zabbix, Zenoss, Hyperic, GroundWork, ....

Page 13: Monitoring in an Infrastructure as Code Age

Where to monitor?
● Dev
● Acceptance
● Prod

Page 14: Monitoring in an Infrastructure as Code Age

What we want
● Small, well suited components
    • Collect
    • Transport / Mangle
    • Analyse / Act
    • Visualize

Page 15: Monitoring in an Infrastructure as Code Age

Monitoring Baseline
● Deploy a host,
● Add it to the monitoring
● Add collection tools
● Add check definitions
● Update the monitoring tool config
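
The baseline steps above could look roughly like this in a manifest. This is a minimal sketch, not the code from the talk: the class name `profile::monitored`, the `nrpe` package and service names (they differ per distribution), the `generic-host` template and the `monitoring` tag are all assumptions. It uses the `nagios_host` type, which was part of core Puppet in 2013 and ships as a separate module on recent releases.

```puppet
# Hypothetical baseline class, applied to every node that gets deployed.
class profile::monitored {
  # Add collection tools: the NRPE agent (package/service names are assumptions).
  package { 'nrpe':
    ensure => installed,
  }

  service { 'nrpe':
    ensure  => running,
    enable  => true,
    require => Package['nrpe'],
  }

  # Add the host to the monitoring: the node exports its own definition
  # instead of somebody typing it into the monitoring tool.
  @@nagios_host { $::fqdn:
    ensure  => present,
    address => $::ipaddress,
    use     => 'generic-host',   # assumed host template on the server
    tag     => 'monitoring',     # assumed tag, used later for collection
  }
}
```

Because the host definition lives in the same catalog as the rest of the node, a host cannot be deployed without also being declared to the monitoring.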

Page 16: Monitoring in an Infrastructure as Code Age
Page 17: Monitoring in an Infrastructure as Code Age

Apache Example:

Page 18: Monitoring in an Infrastructure as Code Age

Icinga?
• Isn't nagios dead?
• Vibrant Community
• Throw great parties in Nürnberg
• Nobody can pronounce it anyhow
• https://github.com/Inuits/puppet-icinga/

Page 19: Monitoring in an Infrastructure as Code Age

Stored Configs

Page 20: Monitoring in an Infrastructure as Code Age

Collection and Export

Export:

    @@resource { ... }

Collect:

    Resource <<| query |>>

Clean out nodes that disappear:

    puppet node clean
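
To make the export and collect pattern concrete, here is a hedged sketch with a Nagios service check. The class names, the `monitoring` tag, the `generic-service` template and the `icinga` service name are assumptions for illustration. It needs stored configs (for example PuppetDB) enabled on the master.

```puppet
# On every monitored node: export a check for this host.
class profile::exports_ping {
  @@nagios_service { "check_ping_${::fqdn}":
    ensure              => present,
    host_name           => $::fqdn,
    service_description => 'PING',
    check_command       => 'check_ping!100.0,20%!500.0,60%',
    use                 => 'generic-service',   # assumed service template
    tag                 => 'monitoring',        # assumed tag
  }
}

# On the monitoring server: collect what the nodes exported.
class profile::monitoring_server {
  service { 'icinga':
    ensure => running,
    enable => true,
  }

  # Realize every exported check carrying the tag and reload on change.
  Nagios_service <<| tag == 'monitoring' |>> {
    notify => Service['icinga'],
  }

  # Drop definitions that are no longer exported by any node.
  resources { 'nagios_service':
    purge => true,
  }
}
```

Two caveats: the collected definitions land in the type's default target file, so the Icinga configuration has to include that file, and purging only works when that default target is used.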

Page 21: Monitoring in an Infrastructure as Code Age

Exporting and Collecting

Page 22: Monitoring in an Infrastructure as Code Age

Monitoring a Vhost
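
The slide content for this example did not survive extraction. As an illustration only, a vhost check can be exported from the same place the vhost is declared. The defined type `monitored_vhost`, its `port` parameter and the `check_http_vhost` command are hypothetical; the command would have to be defined on the monitoring server, for instance as `check_http -H $ARG1$ -p $ARG2$`.

```puppet
# Hypothetical defined type: declare it next to the vhost it belongs to.
define monitored_vhost ($port = 80) {
  # Export one HTTP check per vhost, named after vhost and serving host.
  @@nagios_service { "check_http_${name}_on_${::fqdn}":
    ensure              => present,
    host_name           => $::fqdn,
    service_description => "HTTP ${name}",
    check_command       => "check_http_vhost!${name}!${port}",  # assumed command
    use                 => 'generic-service',                   # assumed template
    tag                 => 'monitoring',                        # assumed tag
  }
}

# Usage: the check exists as soon as the vhost is declared.
monitored_vhost { 'www.example.com': }
```

This way the application-level check, not just the host, stays in sync with what is actually deployed.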

Page 23: Monitoring in an Infrastructure as Code Age

I love CheckMK
● Autodetection
● Multiplexing
● Trend Forecasting

Page 24: Monitoring in an Infrastructure as Code Age

I hate CheckMK
• Autodetection?
• Service,
• Functionalities
• eg. vhosts etc
• Single Source of Truth

Page 25: Monitoring in an Infrastructure as Code Age

Monitoring a service vs Monitoring a service

Page 26: Monitoring in an Infrastructure as Code Age

Definition of Done:

monitored and in production

Page 27: Monitoring in an Infrastructure as Code Age

A software project is not done until your last end user is dead

Page 28: Monitoring in an Infrastructure as Code Age

Exit DOD

Measure Application Usage

Page 29: Monitoring in an Infrastructure as Code Age

But, err, how do I?

Page 30: Monitoring in an Infrastructure as Code Age

Culture,
Automation,
Measurement: measure all the things
Sharing

Page 31: Monitoring in an Infrastructure as Code Age

Deploy Statistics
● Time To Deploy
● Deploy Frequency
● Lifecycle frequency
● Map to

Page 32: Monitoring in an Infrastructure as Code Age

Application Metrics
● Number of current users
● Number of sign ups
● Response times
● Throughput
● XYZ Usage
● # restarts
● Insert your specific valuable stuff here.

Page 33: Monitoring in an Infrastructure as Code Age

Graphite API

Page 34: Monitoring in an Infrastructure as Code Age

Triggers on Graphs
● Export Java Metrics
● JMXTrans
● Export JMXConfigs
● Configure NRPE Check
● Export NagiosCheck
● Collect JMX Exports on JMXTransNode
● Graph Em
● Collect Nagios Configs on Nagios Server
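
A rough sketch of that flow, under several assumptions: the class names, the `jmxtrans` tag, the `/var/lib/jmxtrans` directory, the `profile/jmxtrans.json.erb` template, the metric path and the `check_graphite_metric` command are all hypothetical, and the NRPE step is left out. The point is only that the graph and the alert are exported by the same node.

```puppet
# On each JVM host: export both the metric collection and the check.
class profile::java_metrics ($jmx_port = 9999) {
  # Export a JMXTrans query file for this JVM. The (hypothetical) template
  # would use @jmx_port and point its output writer at Graphite.
  @@file { "/var/lib/jmxtrans/${::fqdn}.json":
    ensure  => file,
    content => template('profile/jmxtrans.json.erb'),
    tag     => 'jmxtrans',
  }

  # Export a check that alerts on the graphed value.
  @@nagios_service { "check_heap_${::fqdn}":
    ensure              => present,
    host_name           => $::fqdn,
    service_description => 'JVM heap (from Graphite)',
    check_command       => "check_graphite_metric!jvm.${::hostname}.heap.used",
    use                 => 'generic-service',
    tag                 => 'monitoring',
  }
}

# On the JMXTrans node: collect every exported query file.
class profile::jmxtrans_node {
  File <<| tag == 'jmxtrans' |>>
}

# On the Nagios server, Nagios_service <<| tag == 'monitoring' |>> picks up
# the check in the same way.
```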

Page 35: Monitoring in an Infrastructure as Code Age

Triggers on Graphs

Page 36: Monitoring in an Infrastructure as Code Age

Triggers on Graphs

Page 37: Monitoring in an Infrastructure as Code Age

Self Service

Gdash based pipelines

Puppetized Templates (wip)

Page 38: Monitoring in an Infrastructure as Code Age

Up Next:
• Creating Information out of this data
• Big data
• Machine Learning

Page 39: Monitoring in an Infrastructure as Code Age

Homework
Skyline
Oculus
Dusk
Riemann
Esper
Puppetdb external Naginator

Page 40: Monitoring in an Infrastructure as Code Age

[email protected]

Further Reading
@krisbuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.eu/

Inuits
Duboistraat 50
2060 Antwerpen
Belgium
891.514.231
+32 475 961221