50
/ Monitoring is easy; why do we suck at it? monitoring it all Tuesday, November 8, 2011

Monitoring is easy, why are we so bad at it presentation

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Monitoring is easy, why are we so bad at it  presentation

/

Monitoring is easy;why do we suck at it?

monitoring it all

Tuesday, November 8, 2011

Page 2: Monitoring is easy, why are we so bad at it  presentation

Author of “Scalable Internet Architectures”Pearson, ISBN: 067232699X

Contributor to “Web Operations”O’Reilly, ISBN: 978-1-4493-7744-1

Founder of OmniTI, Message Systems, Fontdeck, & CirconusI like to tackle problems that are “always on” and “always growing.”

I am an EngineerA practitioner of academic computing.IEEE member and Senior ACM member.On the Editorial Board of ACM’s Queue magazine.

Who is this guy? @postwait

Tuesday, November 8, 2011

Page 3: Monitoring is easy, why are we so bad at it  presentation

Monitoring: let’s start with a definition.

• analytics

• trending

• fault-detection / alerting

• capacity planning

• it is the collection and use of telemetry data

Tuesday, November 8, 2011

Page 4: Monitoring is easy, why are we so bad at it  presentation

What monitoring is not

• controls

• via a monitoring you observe,you do not influence

Tuesday, November 8, 2011

Page 5: Monitoring is easy, why are we so bad at it  presentation

So why do we suck at it?

tl;drbecause we think about

• networks,

• systems, and

• applications

instead of what matters: business.

Tuesday, November 8, 2011

Page 6: Monitoring is easy, why are we so bad at it  presentation

Your purpose

• Your purpose is to makeyour company’s web businessoperate.

(hence: “web operations”)

Tuesday, November 8, 2011

Page 7: Monitoring is easy, why are we so bad at it  presentation

Your purpose

• Your purpose is to makeyour company’s web businessoperate.

(hence: “web operations”)

Tuesday, November 8, 2011

Page 8: Monitoring is easy, why are we so bad at it  presentation

Your purpose

• ensure business success

Tuesday, November 8, 2011

Page 9: Monitoring is easy, why are we so bad at it  presentation

Understanding your purpose

• who defines business success?

• shareholders, ultimately

• the board of directors, in their stead

• the CEO on an operational, day-to-day basis

Tuesday, November 8, 2011

Page 10: Monitoring is easy, why are we so bad at it  presentation

Understanding your purpose

• Assuming your CEO is doing a good job

• the executive team understands these metrics

• Assuming the executive team is competent

• their reports understand these metrics(at least the pertinent ones)

Tuesday, November 8, 2011

Page 11: Monitoring is easy, why are we so bad at it  presentation

Pertinent == Problematic

• You enable all aspects of the business

• All these metrics are pertinent

Tuesday, November 8, 2011

Page 12: Monitoring is easy, why are we so bad at it  presentation

But why?

• You could simply track stuff that is in your purview.

• Why not?

Tuesday, November 8, 2011

Page 13: Monitoring is easy, why are we so bad at it  presentation

Technology

• As a technology operations group,you have the technology.

We can rebuild him.We have the technology.We can make him better than he was.Better...stronger...faster.

- Oscar Goldman

Tuesday, November 8, 2011

Page 14: Monitoring is easy, why are we so bad at it  presentation

Why is our technology better?

• Simply put: MTTD

Tuesday, November 8, 2011

Page 15: Monitoring is easy, why are we so bad at it  presentation

Now, what about your purview?

• Obviously monitoring the business is useful.

• However, you cannot directly affect business.

• You indirectly affect it by operating the web portion.

Tuesday, November 8, 2011

Page 16: Monitoring is easy, why are we so bad at it  presentation

What can you change?

• You can control:

• releases,

• performance,

• stability,

• computing resources,

• networking,

• and availability.

Tuesday, November 8, 2011

Page 17: Monitoring is easy, why are we so bad at it  presentation

Visualize!

• All this information must be presented visually.

Tuesday, November 8, 2011

Page 18: Monitoring is easy, why are we so bad at it  presentation

Text.

• Text is incredibly useful.

• Consider: deployment.

Tuesday, November 8, 2011

Page 19: Monitoring is easy, why are we so bad at it  presentation

Code Deployment

r82394 (by corey) 1h 7m 9s ago previous deploy 1h 42m 18s ago

11 deploys today

Tuesday, November 8, 2011

Page 20: Monitoring is easy, why are we so bad at it  presentation

Code Deployment

r82394 15:03:14 2011/06/15 previous deploy 1h 42m 18s ago

11 deploys today

Tuesday, November 8, 2011

Page 21: Monitoring is easy, why are we so bad at it  presentation

Code Deployment

r82394 (by corey) 1h 7m 9s ago previous deploy 1h 42m 18s ago

11 deploys today

Tuesday, November 8, 2011

Page 22: Monitoring is easy, why are we so bad at it  presentation

Code Deployment

r82394 (by corey) 1h 7m 9s ago previous deploy 1h 42m 18s ago

11 deploys today

Tuesday, November 8, 2011

Page 23: Monitoring is easy, why are we so bad at it  presentation

Code Deployment

r82394 (by corey) 1h 7m 9s ago previous deploy 1h 42m 18s ago

11 deploys today

Tuesday, November 8, 2011

Page 24: Monitoring is easy, why are we so bad at it  presentation

Code Deployment

r82394 (by corey) 1h 7m 9s ago previous deploy 1h 42m 18s ago

11 deploys today

Tuesday, November 8, 2011

Page 25: Monitoring is easy, why are we so bad at it  presentation

Text.

• Numbers are trickier.

• So many representations from which to choose.

Tuesday, November 8, 2011

Page 26: Monitoring is easy, why are we so bad at it  presentation

Beware

Tuesday, November 8, 2011

Page 27: Monitoring is easy, why are we so bad at it  presentation

Beware

Tuesday, November 8, 2011

Page 28: Monitoring is easy, why are we so bad at it  presentation

Beware

Tuesday, November 8, 2011

Page 29: Monitoring is easy, why are we so bad at it  presentation

Beware

Tuesday, November 8, 2011

Page 30: Monitoring is easy, why are we so bad at it  presentation

Gauges require understanding

• Gauges imply a deep understanding of

• bounds, and

• tolerances

Tuesday, November 8, 2011

Page 31: Monitoring is easy, why are we so bad at it  presentation

Gauges require understanding

• General advice

• If the range will ever change, don’t use gauges

Tuesday, November 8, 2011

Page 32: Monitoring is easy, why are we so bad at it  presentation

Gauges require understanding

• Great for:

• percentages,

• temperature,

• power per rack,

• bandwidth per uplink

Tuesday, November 8, 2011

Page 33: Monitoring is easy, why are we so bad at it  presentation

Gauges require understanding

• Bad for:

• IOPS,

• current visitor counts,

• requests per second,

• bandwidth overall

Tuesday, November 8, 2011

Page 34: Monitoring is easy, why are we so bad at it  presentation

Graphs are often better

Tuesday, November 8, 2011

Page 35: Monitoring is easy, why are we so bad at it  presentation

Even little ones

Tuesday, November 8, 2011

Page 36: Monitoring is easy, why are we so bad at it  presentation

Think relatively

Tuesday, November 8, 2011

Page 37: Monitoring is easy, why are we so bad at it  presentation

Think relatively

xxxxxxxxxxxxxxx

xxxxxxxxxxxxxxx

Tuesday, November 8, 2011

Page 38: Monitoring is easy, why are we so bad at it  presentation

Users live all around the world

• Users live just about everywhere

• “Where?” is a useful question

Tuesday, November 8, 2011

Page 39: Monitoring is easy, why are we so bad at it  presentation

Geolocation

Tuesday, November 8, 2011

Page 40: Monitoring is easy, why are we so bad at it  presentation

Geolocation is interesting

• to marketing

• to legal

• (okay to everyone)

• but, not so useful to operations

Tuesday, November 8, 2011

Page 41: Monitoring is easy, why are we so bad at it  presentation

Geolocation is interesting

• perhaps more interesting

Tuesday, November 8, 2011

Page 42: Monitoring is easy, why are we so bad at it  presentation

Geolocation is interesting

Tuesday, November 8, 2011

Page 43: Monitoring is easy, why are we so bad at it  presentation

Geolocation

• Internet location != geo-political location

Tuesday, November 8, 2011

Page 44: Monitoring is easy, why are we so bad at it  presentation

ASN location

• The closest thing to geo-political boundaries is peering

-bash-4.0$ /usr/sbin/bgpctl show rib 66.78.236.243flags: * = Valid, > = Selected, I = via IBGP, A = Announcedorigin: i = IGP, e = EGP, ? = Incomplete

flags destination gateway lpref med aspath origin 66.78.236.0/22 64.202.119.7 100 0 23352 4436 2914 3356 32778 i

### ASN 327778 is “Smart City Networks, L.P.”

Tuesday, November 8, 2011

Page 45: Monitoring is easy, why are we so bad at it  presentation

ASN location

Tuesday, November 8, 2011

Page 46: Monitoring is easy, why are we so bad at it  presentation

What about the business?

Tuesday, November 8, 2011

Page 47: Monitoring is easy, why are we so bad at it  presentation

What about the business?

Authorizations : Hard Failed : Soft Failed : Releases

Tuesday, November 8, 2011

Page 48: Monitoring is easy, why are we so bad at it  presentation

Is that all?

• Hells no.

Tuesday, November 8, 2011

Page 49: Monitoring is easy, why are we so bad at it  presentation

It’s all about real-time

• Everything so far is old hat (maybe)

• Every business unit has visualizations like this

• You need to combine the data

• You need to make it real-time

Tuesday, November 8, 2011

Page 50: Monitoring is easy, why are we so bad at it  presentation

Thanks

• web demo ensues....

Tuesday, November 8, 2011