
Gil Givati

gilgi@matrix.co.il

What gives me the right?

• Programmer/DBA/IT Technologies knowledge since the last century

• APM Specialist since the last decade

Agenda

• Intro

• What is Dynatrace AppMon and how does it work

• Metrics driven pipeline (or What don’t I know when testing my apps)

• Using AppMon for Memory and CPU analysis

• Dynatrace AppMon and application Optimization

What does Dynatrace AppMon do?

[Architecture diagram. Labels shown: 3rd parties, Akamai, Cloudfront, Synthetic, Mobile; Apache, IIS, Node.js, nginx; Java, .NET, PHP; IBM WMQ, ESBs; MongoDB, HBase, Cassandra; CICS, IMS; Oracle, MSSQL, MySQL, DB2; Hosts; Collector, Plugins, Dynatrace Server, Session Storage, Performance Warehouse; Splunk, Elasticsearch, Solr; Rich Client, Web Interface, Web.]

Biz/Ops

Locate a User

Biz/Ops

Inspect Users

Biz/Ops

Dev

Biz/App

App/Ops

Dev/Arch

Method Level Hotspots

+ Exceptions, Logs, Memory Allocation, Threads, Actual Code ...

Perf Eng.

CPU metric without a local agent installed

Find out which operations are taking the most time

Perf Eng.

Retrieve Execution Plan from any SQL Statement

Find out if the performance of a SQL Statement could be improved by using a different index

Perf Eng.

Dev/CI/CD

#1: Analyzing every Unit, Integration & REST API test

#2: Key Architectural Metrics for each test

#3: Detecting regression based on measure per Checkin

Build-by-Build Quality View

Quality Overview by Build In Dynatrace

Or pulled into your Build Server, e.g. Jenkins, Bamboo …

Dev/CI/CD

Biz/Ops Biz/App

Dev Dev/CI/CD

Export & Share


Demo Time

Metrics Driven pipeline

Early 2015: Monolith Under Pressure

Can't scale vertically endlessly!

May: 2.68s 94.09% CPU Bound

April: 0.52s

From Monolith to Services in a Hybrid-Cloud

Front End to Cloud

Scale Backend in Containers!

26.7s Load Time

5kB Payload

33! Service Calls

99kB - 3kB for each call!

171! Total SQL Count

Architecture Violation: Direct access to DB from frontend service

Single search query end-to-end

The fixed end-to-end use case

“Re-architect” vs. “Migrate” to Service-Orientation

2.5s (vs 26.7s) Load Time

5kB Payload

1! (vs 33!) Service Call

5kB (vs 99kB) Payload!

3! (vs 171) Total SQL Count

From 0 to DevOps in 80 days

Lessons learnt from shifting an on-prem to a cloud culture

Bernd Greifeneder, CTO

http://dynatrace.com/trial

Webinar: http://ow.ly/cEYo305kFEy

Podcast: http://bit.ly/pureperf

2011:

• 2 major releases/year

• customers deploy & operate on-prem

• 6 months major/minor release + intermediate fix-packs

• + weeks to months rollout delay

2016:

• 26 major releases/year

• 170 prod deployments/day

• self-service online sales

• SaaS & Managed

• sprint releases (continuous-delivery)

• 1h: code to production

believe in the mission impossible

„Always seek to Increase Flow“

„Understand and Respond to Outcome“

„Culture of Continual Experimentation“

„Always seek to Increase Flow“

Testing: Ensure Success in The First Way

Removing Bottlenecks

Eliminating Technical Debt

Enable Successful Cloud & Microservices Migration

Shift-Left Quality

Reduce Code Complexity

It's not about blind automation of pushing more bad code on new stacks through a pipeline

It's not about blindly adding new features on top of existing ones without measuring their success

You measure it! from Dev (to) Ops


What you currently measure

What you should measure

Quality Metrics in your pipeline

# Test Failures

Overall Duration

Execution Time per test

# calls to API

# executed SQL statements

# Web Service Calls

# JMS Messages

# Objects Allocated

# Exceptions

# Log Messages

# HTTP 4xx/5xx

Request/Response Size

Page Load/Rendering Time

Use Case Tests and Monitors | Service & App Metrics | Ops

Build  Use Case       Stat  # API Calls  # SQL  Payload  CPU    | # ServInst  Usage  RT
17     testNewsAlert  OK    1            5      2kb      70ms   | 1           0.5%   7.2s
17     testSearch     OK    1            3      5kb      120ms  | 1           63%    5.2s
25     testNewsAlert  OK    1            4      1kb      60ms   |
25     testSearch     OK    34           171    104kb    550ms  |
26     testNewsAlert  OK    1            4      1kb      60ms   | 1           0.6%   4.2s
26     testSearch     OK    2            3      10kb     150ms  | 5           75%    2.5s
35     testNewsAlert  -     -            -      -        -      | -           -      -
35     testSearch     OK    2            3      10kb     150ms  | 8           80%    2.0s
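
The build-by-build numbers above lend themselves to an automated check. What follows is a minimal sketch, not AppMon's own mechanism: it compares each build's metrics for one test against the preceding build and reports every metric that grew by more than a chosen factor. The `find_regressions` helper, the dictionary layout and the 2x factor are assumptions made for illustration; the values are the testSearch rows for Builds 17, 25 and 26 as read from the slide.

```python
from typing import Dict, List, Tuple

# testSearch metrics per build, as read from the slide (the mapping of the
# flattened rows to builds is an assumption).
TEST_SEARCH: Dict[int, Dict[str, float]] = {
    17: {"api_calls": 1, "sql": 3, "payload_kb": 5, "cpu_ms": 120},
    25: {"api_calls": 34, "sql": 171, "payload_kb": 104, "cpu_ms": 550},
    26: {"api_calls": 2, "sql": 3, "payload_kb": 10, "cpu_ms": 150},
}


def find_regressions(
    history: Dict[int, Dict[str, float]], max_factor: float = 2.0
) -> List[Tuple[int, str, float, float]]:
    """Return (build, metric, previous, current) for every metric that grew
    by more than max_factor compared with the preceding build."""
    findings = []
    builds = sorted(history)
    for prev, cur in zip(builds, builds[1:]):
        for metric, old in history[prev].items():
            new = history[cur][metric]
            if old > 0 and new / old > max_factor:
                findings.append((cur, metric, old, new))
    return findings


for build, metric, old, new in find_regressions(TEST_SEARCH):
    print(f"Build {build}: {metric} went from {old} to {new}")
```

With this data all four metrics are flagged for Build 25, although testSearch still reports "OK" functionally. Build 26 is not flagged because each of its metrics dropped relative to Build 25.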

Metrics from and for Dev(to)Ops

Re-architecture -> Performance Fixes

Scenario: Monolithic App with 2 Key Features

Reduce Lead Time: Stop 80% of Performance Issues in your Integration Phase

CI/CD: Test Automation (Selenium, Appium, Cucumber, Silk, ...) to detect functional and architectural (performance, scalability) regressions

Perf: Performance Test (JMeter, LoadRunner, Neotys, Silk, ...) to detect tough performance issues

your tool of choice

#SQL, #Threads, Bytes Sent, # Connections WPO Metrics, Objects Allocated, ...

Confidential, Dynatrace, LLC

https://dynatrace.github.io/ufo/

“In Your Face” Data!

Demo Time


Use application metrics as additional Quality Gates

Dev & Test: Personal License to Stop Bad Code when it gets created!

Tip: Don't leave your IDE!

Continuous Integration: Auto-Stop bad Builds based on AppMetrics from Unit, Integration, and Perf Tests

Tip: integrate with Jenkins, Bamboo, ...

Prod: Monitor Usage and Runtime Behavior per Service, User Action, Feature ...

Tip: Stream to ELK, Splunk and Co ...

Automated Tests: Identify Non-Functional Problems by looking at App Metrics

Tip: Feed data back into your test tool!
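
The "Auto-Stop bad Builds" idea can be illustrated with a small gate script. This is a sketch under stated assumptions, not the AppMon or Jenkins integration itself: the metric names, the limits in `LIMITS` and the `gate_violations` and `exit_code` helpers are hypothetical. It relies only on the common convention that a build step returning a non-zero exit code is treated as failed.

```python
from typing import Dict, List

# Hypothetical per-test limits; the numbers are not taken from the slides.
LIMITS: Dict[str, float] = {
    "sql_statements": 10,
    "web_service_calls": 5,
    "exceptions": 0,
}


def gate_violations(measured: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """List every metric whose measured value exceeds its limit.
    Metrics missing from `measured` are treated as 0."""
    problems = []
    for name, limit in limits.items():
        value = measured.get(name, 0)
        if value > limit:
            problems.append(f"{name}: {value} > {limit}")
    return problems


def exit_code(measured: Dict[str, float]) -> int:
    """Return 1 when any limit is exceeded, otherwise 0."""
    problems = gate_violations(measured, LIMITS)
    for problem in problems:
        print("quality gate violated:", problem)
    return 1 if problems else 0
```

A build script could pass the result of `exit_code(...)` to `sys.exit`, so that a functionally green test run still stops the pipeline when its application metrics are out of bounds.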

Memory Diagnostics with AppMon

Main Pains for Problem Hunters

Where do millions of short-lived objects come from?

Component/Layer


How can single users create so many objects?

How can certain business transactions create so many objects?

Which collections are growing?

The JVM/CLR contains a lot of objects. How to analyze, especially in a load or production environment?

What keeps my objects alive?

#1: Do we really have a memory related problem?

#1: Eden Space stays constant. Objects are being propagated to Survivor Space

#2: GC Activity in Young Generation ultimately moves objects into Old Generation

#3: Growing “Old Gen” is a good indicator for a Mem Leak

#4: Heavy GC kicks in when Old Generation is full!

#5: Throughput of Application goes to 0 due to high GC activity and the resulting Out of Memory
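
The observation that a growing Old Generation points to a memory leak can be turned into a crude automated check. The sketch below assumes that you already export Old Generation usage measured after each major collection, for example from GC logs. The function name, the sample values and the rule that every sample must exceed the previous one are illustrative assumptions, not how AppMon evaluates memory.

```python
from typing import Sequence


def old_gen_keeps_growing(post_gc_used_mb: Sequence[float], min_samples: int = 4) -> bool:
    """Return True when Old Gen usage after GC rose in every consecutive sample.
    Too few samples are treated as inconclusive (False)."""
    if len(post_gc_used_mb) < min_samples:
        return False
    return all(later > earlier
               for earlier, later in zip(post_gc_used_mb, post_gc_used_mb[1:]))


# Hypothetical measurements in MB.
print(old_gen_keeps_growing([200, 260, 330, 410, 500]))  # steadily rising
print(old_gen_keeps_growing([200, 260, 210, 250, 205]))  # falls back after GC
```

A strictly rising series is only a hint. Caches that are still warming up show the same shape, so the next steps (which classes are growing, and who keeps them alive) remain necessary.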

http://apmblog.dynatrace.com/2014/11/25/finding-fixing-memory-leaks-tibco-business-works/

#2: Which classes are growing on the heap?

#3: Who is keeping these objects alive?

Keep Alive: shows only references that are kept alive due to the referrer, i.e. the referrer that is solely responsible for the object not being Garbage Collected

Direct Reference: visualizes the real object hierarchy, which includes the referring property


For THIS object

This is the REFERRER

And this is the REFEREE

Response Time w & w/o Suspension

476ms

Garbage Collector

455ms

21ms

Dynatrace recognizes the suspension by the garbage collector…

…and knows precisely how long the PurePath was interrupted

Thus, we can calculate the duration of the PurePath without suspension

Monitor how GC runs impact PurePath duration

View suspension time in detail

Contribution of garbage collection to response time
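
The calculation behind these slides is a simple subtraction. The sketch below reads the three figures shown earlier as a 476 ms total PurePath duration, of which 455 ms was garbage collector suspension and 21 ms remains without suspension. That reading is an assumption based on the fact that 476 - 455 = 21, and the function names are illustrative.

```python
def duration_without_suspension(total_ms: float, suspension_ms: float) -> float:
    """Subtract the time the transaction was suspended from its total duration."""
    if suspension_ms < 0 or suspension_ms > total_ms:
        raise ValueError("suspension must be between 0 and the total duration")
    return total_ms - suspension_ms


def gc_share(total_ms: float, suspension_ms: float) -> float:
    """Fraction of the response time that was spent suspended."""
    return suspension_ms / total_ms


print(duration_without_suspension(476, 455))  # 21
print(f"{gc_share(476, 455):.1%}")            # about 96 percent of the response time
```

Under this reading, the code itself is fast and nearly the whole response time is caused by garbage collection, which is why the two durations are worth showing side by side.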

Demo Time

Database Patterns: N+1, Long Running SQL, Chatty ...

Web Service Patterns: N+1, Chatty, Payload

Threading Patterns: # of Threads, Wait, Sync, ...
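
As an illustration of the N+1 database pattern, the sketch below counts how often the same statement, ignoring its literal values, is executed within one transaction. It is a simplified heuristic and not Dynatrace's detection logic. The `normalize` and `repeated_statements` helpers, the threshold of 10 and the table names are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable


def normalize(sql: str) -> str:
    """Replace quoted strings and numeric literals with '?' so statements that
    differ only in their parameters are counted together."""
    sql = re.sub(r"'[^']*'", "?", sql)
    sql = re.sub(r"\b\d+\b", "?", sql)
    return " ".join(sql.lower().split())


def repeated_statements(statements: Iterable[str], threshold: int = 10) -> Dict[str, int]:
    """Return normalized statements executed at least `threshold` times."""
    counts = Counter(normalize(s) for s in statements)
    return {sql: n for sql, n in counts.items() if n >= threshold}


# One query for the list, then one query per list item: the classic N+1 shape.
transaction = ["SELECT id FROM articles WHERE topic = 'news'"]
transaction += [f"SELECT * FROM comments WHERE article_id = {i}" for i in range(33)]

print(repeated_statements(transaction))
```

Here the per-article query is reported with a count of 33, while the single list query stays below the threshold.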

• Response Time: From Very Fast (< 100ms) to Very Slow (> 4s)

• State: No Errors, Errors, Transaction Failed

• Database: Chatty, Single Long SQL, N+1, Overall DB Time High

• Web Service: High Payload, Chatty, Slow, N+1, High Service Time

• Threading: Normal (1-3), Acceptable (3-6), Heavy (> 6)

• Complexity:

• Normal (< 3 Agents or < 100 Nodes) to

• Complex (> 10 Agents or 1000 Nodes)

• Content Type: Static, Dynamic

• HTTP Response Code: 4xx, 5xx

• Async: No Async, Normal, Heavy (> 10% Nodes executed async)

• Blog: http://apmblog.dynatrace.com/2016/06/23/automatic-problem-detection-with-dynatrace/
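
A few of the thresholds in the list above can be expressed as simple classification functions. This is a sketch only. The slide gives just the two ends of the response time scale, so everything in between is returned under a placeholder label, and because the threading ranges overlap at 3 and 6, treating exactly 3 threads as Normal and exactly 6 as Acceptable is an assumption.

```python
def response_time_category(ms: float) -> str:
    """Only the two extremes are defined in the list above."""
    if ms < 100:
        return "Very Fast"
    if ms > 4000:
        return "Very Slow"
    return "Between Very Fast and Very Slow"


def threading_category(threads: int) -> str:
    """Normal (1-3), Acceptable (3-6), Heavy (> 6); boundaries resolved downward."""
    if threads <= 3:
        return "Normal"
    if threads <= 6:
        return "Acceptable"
    return "Heavy"


def async_category(async_nodes: int, total_nodes: int) -> str:
    """Heavy when more than 10% of the nodes were executed asynchronously."""
    if async_nodes == 0:
        return "No Async"
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    return "Heavy" if async_nodes / total_nodes > 0.10 else "Normal"


print(response_time_category(26700))  # the 26.7s search request from earlier
print(threading_category(8))
print(async_category(5, 100))
```

Tagging every transaction this way makes it possible to filter and count transactions by pattern instead of inspecting them one at a time.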

All Patterns

Demo Time

Reminders

• YouTube Channel: http://bit.ly/dttutorials

• Dynatrace Personal: http://bit.ly/dtpersonal

• Contact: gilgi@matrix.co.il

• Share Your PurePath: http://bit.ly/sharepurepath
