interop-mumbai-2009
DESCRIPTION
Data center and server consolidation, virtualization, cluster and grid computing, service-oriented networking and green data centers represent the latest trends in the data center arena. However, the need for security, high availability and performance, and hosting of business-critical applications remains the same. The business drivers are unchanged too: reduce costs, maximize ROI, improve performance and manageability, and provide more control over and security of applications. Amidst all this, the focus on the end-user experience of data-center-hosted applications is sometimes misplaced. The value of data centers, and thereby of IT, still lies in the successful delivery, and continuous improvement in delivery, of these business-critical, data-center-hosted applications. This session provides deep insight into how data center managers can baseline the end-user experience of data-center-hosted applications, alert on threshold breaches and arrive at the root cause (RCA) of a problem, evaluate opportunities for continuous improvement of end-user experience, and analyze the impact of infrastructure changes.
Fluke Networks Confidential – Do Not Distribute
Optimizing Application Delivery of Data-Center hosted applications
Why is IT Infrastructure Management becoming increasingly important?
• IT increasingly enables enterprise business processes to gain competitive advantage, through
– Improved speed of transactions, leading to satisfied users / customers
– Lower cost of transactions for the enterprise
– Reach to customers across geographies and time zones
– Improved and value-added services
• IT is becoming increasingly complex and business critical
– Applications are becoming more and more IP and web enabled
– Convergence of voice, data and video on a common IP infrastructure, for improved business collaboration
• Business managers increasingly demand ROI on investment in IT
– Alignment of business and IT objectives is more imperative than ever before
Poor Application Performance affects business
Source: Aberdeen Research quoted in Network Magazine
Source: “Poor Application Performance Translates to Lost Revenue,” Network World, August 2008
Application Performance Problems Due to . . .
60 %   Inability to identify issues before they impact end users
>50 %  Increased application complexity
37 %   Inability to measure SLA and application performance
34 %   Not testing application performance in pre-production stages
~33 %  Increased network traffic complexity, causing problems with app performance mgmt
Effects of Poor Application Performance
58 %   Reduced employee satisfaction
50 %   Lost revenue opportunities
47 %   Decreased responsiveness to external customers’ needs
32 %   Declined brand reputation
31 %   Less effective IT staff
Corporate Revenue Affected By Up to 9%
2009: Follow-on research
• Business is 93% more likely than IT to report a direct impact on revenues from poor application performance
• IT does not seem to recognize how expensive application performance degradation is
• IT departments are facing a new challenge
– How to identify and resolve potential application problems before they frustrate users by impeding their ability to get their jobs done
Where is the divide?
• Managers of IT infrastructure are typically tasked with managing the uptime and availability of the IT infrastructure.
– Focus is on lowering operational costs, building in redundancy, simplifying operations, etc.
– Performance yardstick is uptime or availability of infrastructure
– Often, management of end-user experience is not taken into consideration
• Users of IT infrastructure use a different yardstick to evaluate its performance: how well do applications respond over the network?
• Focusing on monitoring and managing the end-user experience of applications is key to delivering the value of the data center
A Typical n-Tier, data-center hosted Business Application
[Diagram: End User, Internet or Intranet, Router, Firewall, Load Balancer, Switches, Web Servers, Application Servers, Database Servers, DNS and Directory Server; protocols shown: HTTPS, HTTP, Oracle SQL]
Challenges in Application Management:
• Many points of failure
• Many different protocols
• Geographically diverse
• Many groups are responsible for the infrastructure: network, application, server, client, security
• May involve either custom or 3rd party applications
How does a user transaction flow over the network?
[Diagram: a user transaction crossing the WAN and passing through Tier 1, Tier 2, Tier 3, Tier 4, Tier 5 and Tier 6]
[Diagram: the same path annotated with delays of 3sec, 100ms, 75ms, 25ms, 2sec and 150ms]
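As a rough illustration of why a per-tier breakdown matters, the delays shown on this slide can be added up to see which tier dominates. This is only a sketch: it assumes the six delay labels line up with Tier 1 through Tier 6 in the order they appear, which the extracted slide text does not confirm, and the names `TIER_DELAYS_MS`, `total_delay_ms` and `slowest_tier` are made up for the example.

```python
# Delays from the slide, converted to milliseconds.
# ASSUMPTION: labels are matched to tiers in the order they appear.
TIER_DELAYS_MS = {
    "Tier 1": 3000,
    "Tier 2": 100,
    "Tier 3": 75,
    "Tier 4": 25,
    "Tier 5": 2000,
    "Tier 6": 150,
}


def total_delay_ms(delays):
    """Sum the delay contributed by every tier."""
    return sum(delays.values())


def slowest_tier(delays):
    """Return the name of the tier with the largest delay."""
    return max(delays, key=delays.get)


if __name__ == "__main__":
    total = total_delay_ms(TIER_DELAYS_MS)
    worst = slowest_tier(TIER_DELAYS_MS)
    share = 100 * TIER_DELAYS_MS[worst] / total
    print(f"total {total} ms, slowest {worst} ({share:.0f}% of total)")
```

Under that assumed mapping, two of the six tiers account for most of the delay, which is the kind of finding an uptime-only view would not surface.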
End-User Response Time (EURT) Insight
What is a transaction? (e.g. user buying a stock)
Business Transaction: Purchase 100 shares of Danaher stock
– The business cares about the application's ability to do this
– A business transaction may involve multiple user actions
User Actions: Go To Trade Page; Look up Danaher Symbol; Enter Symbol and Qty; Submit Order
– Users complain about one or more of these things being slow
– Each user action may consist of several application transactions
Application Transactions:
POST /submit_order.asp
GET /stylesheet.css
GET /javascript.js
GET /logo.gif
GET /uparrow.gif
GET /dnarrow.gif
GET /border.gif
– EURT monitoring needs these to be measured individually
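The three levels on this slide can be pictured as a small nested data structure. The sketch below is illustrative only: the class names are hypothetical, and it assumes that all seven requests listed on the slide belong to the "Submit Order" user action, which the slide suggests but does not state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UserAction:
    name: str
    # Application transactions (individual requests) behind this action.
    app_transactions: List[str] = field(default_factory=list)


@dataclass
class BusinessTransaction:
    name: str
    user_actions: List[UserAction] = field(default_factory=list)

    def app_transaction_count(self) -> int:
        """Number of requests that would each need their own measurement."""
        return sum(len(a.app_transactions) for a in self.user_actions)


# ASSUMPTION: all seven requests belong to the "Submit Order" action.
purchase = BusinessTransaction(
    name="Purchase 100 shares of Danaher stock",
    user_actions=[
        UserAction("Go To Trade Page"),
        UserAction("Look up Danaher Symbol"),
        UserAction("Enter Symbol and Qty"),
        UserAction("Submit Order", [
            "POST /submit_order.asp",
            "GET /stylesheet.css",
            "GET /javascript.js",
            "GET /logo.gif",
            "GET /uparrow.gif",
            "GET /dnarrow.gif",
            "GET /border.gif",
        ]),
    ],
)
```

Keeping the requests as separate entries mirrors the slide's point that each application transaction is measured on its own rather than as one lump per user action.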
TCP Response Time Measurement
[Sequence diagram between User and Server, covering one Flow: TCP SYN (80), SYN ACK and ACK, annotated with Initial Network Round Trip Time and Server Connect Time; Transaction #1 with HTTP GET #1, Response #1 to Response #5 and their ACKs; Transaction #2 with HTTP GET #2, Response #1 to Response #4 and the ACK for Response #4; then FIN, FIN ACK, ACK. Each transaction is annotated with Application Response Time, Data Transfer Time and Network Round-Trip Time.]
EURT = ART + DTT + NRT
EURT is the sum of ART, DTT and NRT
When EURT is high, look to the individual components to
see which may be the leading contributor.
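A minimal sketch of the formula and of the "look to the individual components" step follows. The function names and the sample values are hypothetical; only the relationship EURT = ART + DTT + NRT comes from the slide.

```python
def eurt_ms(art_ms, dtt_ms, nrt_ms):
    """End-User Response Time as the sum of its three components."""
    return art_ms + dtt_ms + nrt_ms


def leading_contributor(art_ms, dtt_ms, nrt_ms):
    """Name the component that contributes most to EURT."""
    components = {"ART": art_ms, "DTT": dtt_ms, "NRT": nrt_ms}
    return max(components, key=components.get)


if __name__ == "__main__":
    # Hypothetical measurements for one transaction, in milliseconds.
    art, dtt, nrt = 1200, 300, 80
    print(eurt_ms(art, dtt, nrt), leading_contributor(art, dtt, nrt))
```

With these made-up values the server side (ART) dominates, so the investigation would start at the application tier rather than the network.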
Encompassed Transactions
[Diagram: Client, Web and DB tiers exchanging TCP packets, Web packets and DB packets within a 4-second span]
UDP Response Time Measurement
[Sequence diagram between User and Server, covering one Flow: Transaction #1 and Transaction #2 each consist of a UDP Data Request (Data Transfer Time - Client), followed by UDP Response #1 to UDP Response #3 (Application Response Time, then Data Transfer Time - Server). The flow terminates on a timeout.]
Note that no NRT is applicable because UDP does not have acknowledgements.
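Because a UDP flow has no FIN exchange, the slide notes that it ends on a timeout. One way to picture this is to split a stream of packet timestamps into flows wherever the idle gap exceeds a limit. The sketch below is an assumption about how such grouping could work, not a description of any particular product; `split_flows` and the 30-second timeout are hypothetical.

```python
def split_flows(timestamps, idle_timeout):
    """Group sorted packet timestamps (seconds) into flows.

    A new flow starts whenever the gap since the previous packet
    is greater than idle_timeout.
    """
    flows = []
    for ts in timestamps:
        if flows and ts - flows[-1][-1] <= idle_timeout:
            flows[-1].append(ts)   # still within the current flow
        else:
            flows.append([ts])     # first packet, or idle gap exceeded
    return flows


if __name__ == "__main__":
    # Hypothetical capture: three packets, a long pause, then two more.
    print(split_flows([0, 1, 2, 40, 41], idle_timeout=30))
```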
Summary
• End-User Response Time is the sum of Application Response Time, Data Transfer Time and Network Round-Trip Time across the multiple tiers of an application
• Focusing on EURT helps enhance the end-user experience of n-tier applications
• Baselining, setting thresholds, and investigating threshold breaches help identify the root cause of degraded end-user experience and optimize delivery of data-center hosted applications
• When infrastructure changes happen, IT managers can immediately pinpoint what changed (ART, DTT or NRT) and thereby understand the impact of the change
• EURT helps IT managers proactively manage IT infrastructure, collaborate across silos in the organization, and make more informed decisions about investing in infrastructure upgrades
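The baselining and change-impact points can be sketched in a few lines. The deck does not say how a threshold is derived, so the rule used here (baseline mean plus three sample standard deviations) is purely an assumption, as are the function names and all sample values.

```python
from statistics import mean, stdev


def threshold_ms(baseline_samples, k=3.0):
    """Baseline mean plus k sample standard deviations (an assumed rule)."""
    return mean(baseline_samples) + k * stdev(baseline_samples)


def breaches(samples, limit):
    """Return the EURT samples that exceed the threshold."""
    return [s for s in samples if s > limit]


def changed_component(before, after):
    """Return the component (ART, DTT or NRT) with the largest increase."""
    return max(after, key=lambda name: after[name] - before[name])


if __name__ == "__main__":
    # Hypothetical EURT baseline in milliseconds.
    limit = threshold_ms([200, 210, 190, 205, 195])
    print(breaches([215, 450, 198], limit))
    # Hypothetical component averages before and after a change.
    print(changed_component(
        {"ART": 100, "DTT": 50, "NRT": 40},
        {"ART": 105, "DTT": 52, "NRT": 120},
    ))
```

Comparing components before and after a change is what lets the breach be attributed to the server, the transfer or the network rather than to "the application" as a whole.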
Backup Slides
[Sequence diagram: the TCP flow between User and Server (SYN, SYN ACK, ACK, HTTP GET #1 and HTTP GET #2 with their responses and ACKs, then FIN, FIN ACK, ACK), with Initial Network Round Trip Time and Server Connect Time marked on the handshake]
Server Connect Time is the time it takes the TCP stack in the server to respond to a SYN request. It does not involve the application software and should be a very low value.
The Initial NRT is measured once per flow. This is different from the NRT metric, which is measured once per transaction.
If NRT cannot be measured for whatever reason, the Initial NRT metric is also used as the NRT value for those transactions (plan B).
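The "plan B" rule amounts to a simple fallback. A minimal sketch, with a hypothetical function name and `None` standing in for "could not be measured":

```python
from typing import Optional


def effective_nrt_ms(measured_nrt_ms: Optional[float],
                     initial_nrt_ms: float) -> float:
    """Use the per-transaction NRT when available, else the Initial NRT."""
    if measured_nrt_ms is None:
        return initial_nrt_ms  # plan B: fall back to the per-flow value
    return measured_nrt_ms


if __name__ == "__main__":
    print(effective_nrt_ms(None, 40.0))   # falls back to the Initial NRT
    print(effective_nrt_ms(55.0, 40.0))   # uses the measured value
```

The check is written as `is None` rather than a truthiness test so that a legitimately measured value of 0 is not mistaken for a missing measurement.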
Data Transfer Time
[Sequence diagram: the TCP flow between User and Server (SYN, SYN ACK, ACK, an HTTP POST and HTTP GET #2 with their responses and ACKs, then FIN, FIN ACK, ACK), with Data Transfer Time - Client marked on the HTTP POST and Data Transfer Time - Server marked on the responses]
The Data Transfer Time (DTT) is the time it takes for the client to send the transaction request and for the server to send the transaction response.
When DTT is high, this can be due to several things:
- Chatty applications (many small packets)
- TCP Zero Window
- Retransmissions
- Out-of-order packets
- Route changes
- Switch/route saturation or priority handling
- Etc.
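Two of the causes listed above, retransmissions and out-of-order packets, can be spotted from the order in which segments arrive. The sketch below is a deliberate simplification: it treats each segment as a single integer sequence number, whereas a real analyzer would work with byte ranges, lengths and timing. The function name and the sample input are hypothetical.

```python
def classify_segments(seq_numbers):
    """Count retransmitted and out-of-order segments in arrival order.

    A segment whose sequence number was already seen counts as a
    retransmission; a new segment that arrives after a higher-numbered
    one counts as out of order.
    """
    seen = set()
    highest = None
    retransmissions = 0
    out_of_order = 0
    for seq in seq_numbers:
        if seq in seen:
            retransmissions += 1
        elif highest is not None and seq < highest:
            out_of_order += 1
        seen.add(seq)
        if highest is None or seq > highest:
            highest = seq
    return retransmissions, out_of_order


if __name__ == "__main__":
    # Segment 3 arrives twice; segment 4 arrives after segment 5.
    print(classify_segments([1, 2, 3, 3, 5, 4]))
```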
Application Response Time
[Sequence diagram: the TCP flow between User and Server (SYN, SYN ACK, ACK, HTTP GET #1 and HTTP GET #2 with their responses and ACKs, then FIN, FIN ACK, ACK), with Application Response Time marked]
Application Response Time (ART) is the time required for the application server to begin responding to a client’s request.
When ART is high, it can be due to server overload or excessive think time for the application running on the server. The server with high ART could be waiting for another tier server to return data.
Network Round-Trip Time
[Sequence diagram: the TCP flow between User and Server (SYN, SYN ACK, ACK, HTTP GET #1 and HTTP GET #2 with their responses and ACKs, then FIN, FIN ACK, ACK), with Network Round-Trip Time marked]
Network Round-Trip Time (NRT) is the time between the server response and the receipt of the acknowledgement from the client.
When NRT is high, this can be due to network congestion, restricted bandwidth, or packet loss.
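Taken together, the backup-slide definitions suggest how the three components could be derived from packet timestamps for a single TCP transaction. The following is a sketch under stated assumptions: all timestamps are taken at one capture point, the request fits in a single packet (so client-side DTT is treated as zero), and the class, field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TransactionTimestamps:
    request_sent_ms: int     # request packet seen
    first_response_ms: int   # first response packet seen
    last_response_ms: int    # last response packet seen
    last_ack_ms: int         # client ACK for the last response seen


def breakdown(t):
    """Split one transaction into ART, DTT and NRT (milliseconds)."""
    art = t.first_response_ms - t.request_sent_ms    # server starts replying
    dtt = t.last_response_ms - t.first_response_ms   # server sends the data
    nrt = t.last_ack_ms - t.last_response_ms         # response until its ACK
    return {"ART": art, "DTT": dtt, "NRT": nrt, "EURT": art + dtt + nrt}


if __name__ == "__main__":
    # Hypothetical timestamps relative to the request.
    print(breakdown(TransactionTimestamps(0, 120, 420, 470)))
```

Under these assumptions the three intervals are contiguous, so their sum equals the time from the request to the final acknowledgement.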