Geek Sync: A Lean Approach To Application Performance Monitoring


A Lean Approach to Monitoring
September 15, 2015

About Ernest

• Product Manager at IDERA in Austin, TX
• 20 years of IT experience, from startups to enterprise shops
• Runs CloudAustin user group, DevOpsDays Austin conference
• Twitter: @ernestmueller
• Blog: theagileadmin.com

Agenda

The Monitoring Landscape

What Is Lean?

MVP Monitoring Areas

Next Steps

Monitoring Your Systems

Monitoring Your Applications

Monitoring Tools

• Network (SNMP, Netflow)
• Server (SNMP, WMI, system)
• Virtualization/Cloud/Container
• Real User Monitoring (network, browser)
• Service Endpoint (simple/transactional, local/remote)
• Application (management interface, instrumentation)
• Software metrics (database, web/app server)
• Custom metrics (application)
• Logging, Security, Analytics, Reporting, More…

What To Do?

• Monitor it all?
  – Expensive
  – Complex
• How deep?
  – Monitor parts of it?
  – Gaps in visibility
  – Which parts?

Monitoring Pitfalls

• “I have 100,000 metrics, but still can’t tell if the site is down?”
• “Did you know we’re generating 30% of our system load from monitoring?”
• “It’s going to cost how much? Maybe, but the procurement cycle will be 9 months…”
• “We’re spending 2 headcount just on maintaining our monitoring systems!”
• “We get so many alerts we need a secondary triage system so we know which ones to pay attention to.”

What Is Lean?

• Eliminate Waste
• Amplify Learning
• Decide as late as possible
• Deliver as fast as possible
• Empower the team
• Build quality in
• See the whole

Lean Principles

Your Monitoring Is A Product

• Build – Minimum Viable Monitoring
• Measure – All the Monitoring Points
• Learn – About the App and the Monitoring
• Repeat – Go Deeper Where It’s Needed

Iterate Through A Development Cycle

Monitoring MVP Areas

1. Service Performance and Uptime
2. Software Component Metrics
3. System Metrics
4. Application Metrics

What are the most important areas to cover?

Service Performance and Uptime

• Remember lean principle “see the whole”
• “What do my users see?”
• MVP: external synthetic probe of the end service
• Next: RUM, waterfalls, transactions
• Later: transaction warehousing, cross-tier transaction tracing

The end user view is always the most critical
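
As an illustration of the MVP step on this slide, an external synthetic probe can be as small as a timed HTTP request run from outside the environment. The sketch below uses only the Python standard library. The names `ProbeResult`, `http_status`, and `probe`, and the rule that any 2xx/3xx status counts as "up", are assumptions made for this example, not something from the talk.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    url: str
    up: bool
    status: Optional[int]
    seconds: float


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch the URL and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth recording.
        return err.code


def probe(url: str, fetch: Callable[[str], int] = http_status) -> ProbeResult:
    """Time one request and report whether the service looks up."""
    start = time.monotonic()
    try:
        status = fetch(url)
    except OSError:
        # DNS failure, refused connection, or timeout: no status at all.
        status = None
    elapsed = time.monotonic() - start
    # Assumption for this sketch: 2xx and 3xx mean "up".
    up = status is not None and 200 <= status < 400
    return ProbeResult(url, up, status, elapsed)
```

Calling `probe("https://your-service.example/")` on a schedule and recording `up` and `seconds` covers the uptime and response-time view; the `fetch` parameter exists only so the logic can be exercised without a network.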

Remember the Process

• Build – Minimum Viable Monitoring
• Measure – All the Monitoring Points
• Learn – About the App and the Monitoring
• Repeat – Go Deeper Where It’s Needed

Lean Development Cycle

Software Component Metrics

• “Is my service up?”
• Check ports/processes for actionable outages
• MVP: local probes
• Next: More metrics beyond uptime and response time (most have a set they expose)
• Later: Advanced deep dive database and other app component APM

What you can page people on
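
A local probe for the MVP step above can be sketched as a TCP connect check against each component's port. The component names and ports in the usage note are hypothetical, and `port_is_open` and `check_components` are names invented for this example.

```python
import socket
from typing import Dict, Tuple


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as down.
        return False


def check_components(components: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Map each component name to whether its port accepts connections."""
    return {
        name: port_is_open(host, port)
        for name, (host, port) in components.items()
    }
```

For instance, `check_components({"web": ("127.0.0.1", 80), "db": ("127.0.0.1", 5432)})` returns one boolean per component, which is the kind of yes/no signal that can page someone. A connect check only shows that something is listening; it does not show that the component is answering correctly.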

System and Network Metrics

• “What is the root cause?”
• Load on your systems and network devices
• MVP: basic system metrics (CPU/mem/disk/network)
• Next: More depth, cloud/virt/container layer stats
• Later: Netflow, deeper dive into specific hardware platform metrics (SANs, etc.)

Diagnosing Issues
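
To make the "basic system metrics" step concrete, here is a minimal sketch that collects CPU load and disk usage with the Python standard library. It deliberately leaves out memory and network counters, because the standard library has no portable call for them; an agent or a library such as psutil is the usual route there. The function name `system_snapshot` and the metric keys are assumptions of this example.

```python
import os
import shutil


def system_snapshot(path: str = "/") -> dict:
    """Collect a minimal set of host metrics using only the standard library."""
    usage = shutil.disk_usage(path)
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_used_pct": round(100.0 * usage.used / usage.total, 1),
    }
    # os.getloadavg() exists only on Unix-like systems, so guard it.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass  # load average unobtainable on this host
    return snapshot
```

Sampling this on an interval and shipping the result to wherever the other monitoring data lives is enough to start correlating outages with host load.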

Application Metrics

• “What is really going on?”
• The app knows, get the app to tell you
• MVP: Logging and log aggregation
• Next: Better logging
• Later: Specific app metric emission, application instrumentation (Management API or bytecode)

Business value and troubleshooting specifics
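
One way to read "get the app to tell you" is to have the application log one JSON object per line, so a log aggregator can index the fields without custom parsing. The sketch below uses Python's standard `logging` module. The `JsonFormatter` class, the `build_logger` helper, and the convention of passing values under `extra={"fields": {...}}` are assumptions made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Convention of this sketch: extra={"fields": {...}} adds app metrics.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to the stream (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate output
    logger.propagate = False
    return logger
```

A call such as `build_logger("shop").info("checkout complete", extra={"fields": {"order_ms": 182}})` emits a line whose `order_ms` field can later be graphed, which is a small step from plain logging toward application metric emission.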

Think About The Principles

• Eliminate Waste
• Amplify Learning
• Decide as late as possible
• Deliver as fast as possible
• Empower the team
• Build quality in
• See the whole

Lean Principles

Quick Demo

• CopperEgg – Ultra quick-start SaaS-based monitoring with basics on systems, endpoints, RUM, custom

• Uptime – Download and install infrastructure and application monitoring

• Precise – APM suite with deep support for everything from SAP to Java to SQL

Monitor At the Right Depth

Questions?

Monitor the Lean way…