If you’re a network administrator currently using or planning to use HP Network Node Manager i (NNMi) with the iSPIs for Multicast, IP Telephony (Avaya and Cisco) and Quality Assurance (Cisco IP SLA based), then attend this session. We’ll talk about best practices and use cases associated with each of these iSPI products and give you a detailed look into the challenges we’ve encountered deploying iSPIs at customer sites. We’ll also present various approaches we’ve used to overcome these challenges. The session will also highlight ways in which the HP NNMi and iSPI solution can enable you to quickly pinpoint root causes of network faults and to identify affected regions and services accurately within the SLA time limits. We’ll illustrate our points by referring to specific solutions we’ve developed for our customers.
1
©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
Best Practices for Managing your Network with Network Node Manager iSPIs
Larry Besaw
Damian Horner
Dominic Ruffatto
HP Software Network Management Center
2
Agenda
Introduction to version 9.0 NNM iSPIs
Deployment Architecture
Best Practices
Use Cases
Additional Resources
3
Introduction
4
NNM Performance iSPIs
Performance monitoring in the context of your network
Metrics
• Diverse array of out-of-the-box reports on device component and interface KPIs
• Customized collection and reporting with NNMi CustomPoll
• Network path health
Traffic
• Collection and presentation of NetFlow and sFlow traffic information
• Application to traffic flow mapping
• High-scale distributed collection architecture
Quality Assurance
• Discovery and reporting on IP SLA and DISMAN tests
• Threshold alerts generated and node status updated
• Site modeling to guarantee availability
5
NNM iSPIs for advanced services
• Consistent presentation of IP Telephony, MPLS and IP Multicast information in the context of your network topology
• Easy association of network faults to interruption in advanced services
• Threshold-based performance monitoring helps ensure service quality and availability
• End-to-end performance monitoring of infrastructure used for service delivery
NNMi
iSPI IP Telephony iSPI MPLS iSPI Multicast
6
HP NNM iSPI for IP Telephony
7
Deployment Architecture
• NNM iSPI for IPT must co-exist on the same server as NNMi
• Requires NNM iSPI Performance for Metrics license for reports
• NNM iSPI Performance for Metrics may coexist on the NNMi server or be deployed on a dedicated server
Dedicated server NPS deployment
NPS = Network Performance Server
8
Deployment Architecture
• NNM iSPI for QA must co-exist with NNMi and NNM iSPI for IPT on the same server
• Requires NNM iSPI Performance for Metrics license for reporting capabilities
• NNM iSPI Performance for Metrics may coexist or be deployed on a dedicated server
Integration with iSPI Performance for Quality Assurance (QA)
NPS = Network Performance Server
9
Pre-install and install check-list
Installation
• Refer to the NNM iSPI for IPT support matrix for hardware sizing guidelines
• Install NNMi first
• Create the Web Server Client user in NNMi for NNM iSPI for IPT
• NNM iSPI for IPT utilizes the same database as NNMi
• Do *not* modify JNDI port during installation
• Select ‘isSecure’ for secure communication
10
Install time and post-install checklist
Installation
• To utilize the IPT voice path metric data in NNM iSPI Performance for QA reports, keep the NPS server name, port and domain handy
• When prompted, say ‘Yes’
• Firewalls: Refer to the nms-ipt.ports.properties file to determine the firewall ports to open
• Start the IPT jboss process (ovstart -c iptjboss)
11
Post-install checklist – Cisco CDR/CMR reports
Before Discovery and Monitoring
• Navigate to the NNM iSPI for IPT configuration UI
• Configure ‘AXL Access’ in ‘Data Access Configuration’
• Start the ‘CDROnDemand’ Webservice on the CM
• Configure ‘CDR Repository server’ and ‘SOAP username/passwd’ in CDR Access
• Enable Cisco reports from ‘Reporting configuration’
• Set the Voice quality metrics (Jitter, Latency, MOS etc) thresholds
12
Post-install checklist – Avaya CDR and SNMP settings
Before Discovery and Monitoring
• Navigate to the NNM iSPI for IPT configuration UI
• Configure ‘Avaya CDR Access’ in ‘Data Access Configuration’
• These are ‘Survivable CDRs’
• ‘sftp’ user name and password required
• For Customized format of CDR file, configure the format in: CustomizedCDRFormat.properties
• Do *not* touch StandardCDRFormat.properties
• Timezone is important too
• Enable the Avaya reports from ‘Reporting configuration’
13
Post-install checklist – Avaya SNMP
Before Discovery and Monitoring
• Configure SNMP ‘Regions’ for Avaya s8700 CM pairs
• SNMP timeout: 59 sec; retry: 1
• Minimum security level: SNMPv1 or SNMPv3
• NNM iSPI for IPT specific node discovery is based on an on-demand or scheduled config poll of the NNMi node
14
Voice quality monitoring (MOS/QoS)
Users Complain: Voice Is Not Clear
Challenge Encountered:
• Dial tone is present and the call is established, but the voice is unclear or broken between the two IP phones
Analysis:
• The CDRs (Call Detail Records) of the CUCM show that the Mean Opinion Score (MOS) has fallen below the threshold value of 5 (configured by the user in the iSPI UI)
Solution:
• LowQoS call incidents are generated, one from each side of the call
• Launching the voice path between the two phones/extensions exposes the call path between them and helps in troubleshooting the issue
• The user can also launch the NNM iSPI Performance for QA reports of RTT/Jitter/MOS etc. within the voice path between the phones
• The path health report can also be launched to observe relevant metrics across the network path
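The threshold check behind the LowQoS incidents can be illustrated with a short Python sketch. It is illustrative only: the `CallLeg` record, its field names, the sample MOS values and the threshold of 3.6 are all assumptions made for the example, not the iSPI's CDR schema or incident format.

```python
from dataclasses import dataclass


@dataclass
class CallLeg:
    """One side of a call as it might appear in a CDR (hypothetical fields)."""
    extension: str
    peer: str
    mos: float


def low_qos_incidents(legs, mos_threshold):
    """Return one incident message per call leg whose MOS is below the threshold."""
    return [
        f"LowQoS: {leg.extension} -> {leg.peer} MOS {leg.mos:.1f} < {mos_threshold:.1f}"
        for leg in legs
        if leg.mos < mos_threshold
    ]


# Both legs of the 4001/4002 call are degraded; the 4003/4004 leg is fine.
legs = [
    CallLeg("4001", "4002", 3.1),
    CallLeg("4002", "4001", 3.4),
    CallLeg("4003", "4004", 4.3),
]
incidents = low_qos_incidents(legs, mos_threshold=3.6)
for message in incidents:
    print(message)
```

A degraded call yields two messages, one per leg, which mirrors the "one from each side of the call" behaviour described on the slide.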
15
Tracking call processing activities
Voice Calls Get ‘Network Busy’ or ‘No Ring Tone’
Challenge Encountered:
• Voice calls to a site get ‘network busy’ or ‘no ringtone’ indications
Analysis:
• Processor ‘System management’ occupancy breached the configured threshold
• Similarly, ‘Idle Management’ and ‘Total call attempts’ threshold violations
Solution:
• A ‘System management’ occupancy threshold violation incident is generated
• Drill down from the incident to the source object (CM) and look at the occupancy tab for configured threshold values and measured values
16
Tracking Port network resource utilization
Voice Quality Degraded for Certain Calls
Challenge Encountered:
• User wishes to troubleshoot why the voice quality for certain calls is poor
• Voice gets broken or takes time to reach the called party
Analysis:
• Avaya Port Network load metrics such as Outgoing Trunk Load, Incoming Trunk Load and Tandem Load are increasing
Solution:
• A ‘Port Network Trunk Load breached threshold’ incident is generated
• Drill down from the incident to see the actual threshold violation measurement values
• Open the Port Network detail form from this incident, analyze which metrics caused this behaviour, and decide on capacity planning or optimization of PN resources
17
HP NNM iSPI Performance for Quality Assurance
18
Deployment Architecture
• NNM iSPI Performance for QA must co-exist with NNMi on the same server
• NPS can be on the same or a dedicated server
• Does not require an NNM iSPI Performance for Metrics license
Dedicated server NPS deployment
NPS = Network Performance Server
19
Pre-install and install check-list
Installation
• Refer to the NNM iSPI Performance for QA support matrix for hardware sizing guidelines
• NNM iSPI Performance for QA must co-exist on the same server as NNMi
• Install NNMi first
• Create the Web Server Client user in NNMi for the NNM iSPI Performance for QA
• NNM iSPI Performance for QA utilizes the same database as NNMi
• Do *not* modify JNDI port during installation
• Select ‘isSecure’ for secured communication
• Firewalls: Refer to the nms-qa.ports.properties file to determine the firewall ports to be opened
• Start the NNM iSPI Performance for QA jboss process (ovstart -c qaboss)
20
Post-install checks
Before Discovery and Monitoring
• Configure Sites within the NNM iSPI Performance for QA first
• Re-compute the Probe-Site associations
• Configure Site-wide thresholds
• Supported services are: UDP-Jitter, UDP Echo, ICMP Echo, TCP Connect, VoIP
• Supported metrics are: RTT (ms and µs), +ve/-ve One-way/Two-way jitter and PacketLoss, MOS
• RTT (ms) and RTT (µs): Classification is made based on ‘precision’ of the SLA test
21
Post-install checks
Before Discovery and Monitoring
• Export the Site and Threshold configuration
• User can edit the exported xml file
• Import it for later use
• Command line tools available for export/import
• $NnmBinDir/nmsqasiteconfigutil.ovpl
• $NnmBinDir/nmsqathresholdconfigutil.ovpl
• Reference pages for more details
• Sub-minute test frequency is supported
• No support for bucketing
• Each on-demand or scheduled configuration poll of an NNMi node initiates the test discovery
22
Jitter threshold monitoring
Audio/Video Signal Quality Issues
Challenge Encountered:
• Deteriorating video and voice quality at certain sites
Analysis:
• The network is experiencing high jitter
Solution:
• An incident for the ‘High Two way jitter’ threshold violation is generated and the probe status is marked ‘Major’
• Drill down from this incident to the ‘QA Probe’, which shows the exact measurement value violating the configured threshold
• Look at the Threshold state tab for threshold configuration details
• Look at the jitter configuration for VoIP and UDP-Jitter tests
• In total, the available jitter thresholds are: +ve/-ve one-way (S->D and D->S) and two-way jitter
• Drill down to NNM iSPI Performance for Metrics and NNM iSPI Performance for Traffic to understand potential congestion in the audio/video network path
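The distinction between positive and negative jitter can be illustrated with a small Python sketch. It assumes jitter is taken as the change in one-way delay between consecutive packets, counted as positive when the delay grows and negative when it shrinks. The sample delays and the 2.5 ms threshold are invented for the example; the sketch only illustrates what the metrics mean, not how the iSPI or the IP SLA tests obtain them.

```python
def split_jitter(delays_ms):
    """Split consecutive delay changes into positive and negative jitter samples."""
    positive, negative = [], []
    for previous, current in zip(delays_ms, delays_ms[1:]):
        delta = current - previous
        if delta > 0:
            positive.append(delta)    # delay grew: positive jitter
        elif delta < 0:
            negative.append(-delta)   # delay shrank: negative jitter (magnitude)
    return positive, negative


def mean(values):
    """Average of a list, or 0.0 when there are no samples."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical one-way delays (ms) for six packets in the S->D direction.
source_to_dest_delays = [20, 22, 21, 25, 25, 23]
positive, negative = split_jitter(source_to_dest_delays)

threshold_ms = 2.5  # hypothetical site-wide threshold
positive_violation = mean(positive) > threshold_ms
negative_violation = mean(negative) > threshold_ms
print(mean(positive), mean(negative), positive_violation, negative_violation)
```

Running the same function over the D->S delays gives the other one-way pair, which is why one-way thresholds come in four variants.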
23
HP NNM iSPI for IP Multicast
24
Deployment Architecture
• NNM iSPI for IP Multicast must co-exist with NNMi on the same server
• Requires the NNM iSPI Performance for Metrics to generate and view reports
Dedicated server NPS deployment
25
Deployment Architecture
• Must co-exist with NNMi on the same server
• Requires the NNM iSPI Performance for Metrics to generate and view reports
Same server NPS deployment
26
Pre-install and install check-list
Installation
• Refer to the NNM iSPI for IP Multicast support matrix for hardware sizing guidelines
• NNM iSPI for IP Multicast must co-exist on the same server as NNMi
• Install NNMi first
• Create the Web Server Client user in NNMi for NNM iSPI for IP Multicast
• NNM iSPI for IP Multicast utilizes the same database as NNMi
• Do *not* modify JNDI port during installation
• Click ‘isSecure’ for secured communication
• Firewalls: Refer to the nms-multicast.ports.properties file to determine the firewall ports to be opened
• Start the MC jboss process (ovstart -c mcastboss)
27
Group discovery and performance polling
Before Discovery and Monitoring
• Configure Multicast group addresses in DNS
• Streams are easier to understand when they are shown by name
• Reserved groups (for PIM internals) are discovered too; keep them filtered if not needed
• Set the performance polling frequency to 15, 30 or 60 mins to see relevant performance graphs
28
Multicast capable nodes in the network
Before Discovery and Monitoring
• Each on-demand or scheduled configuration poll of an NNMi node initiates the MC node discovery
• By default, only Multicast-enabled routers get into the inventory of the MC iSPI
• To know all the Multicast-capable routers (both MC-enabled and MC-disabled nodes) in your network, create an NNMi node group with: ‘capability = com.hp.nnm.capability.multicast.node.support’
• Take this list and subtract the list of nodes in the MC Nodes inventory
• The remainder is the routers that are capable of running Multicast services but are not MC-enabled
• This helps in identifying routers that should be MC-enabled but are not
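The subtraction described above amounts to a set difference, sketched here in Python. This is illustrative only: the hostnames are made up, the function name is hypothetical, and the sketch assumes the two node lists have already been exported from the NNMi node group and the MC Nodes inventory by whatever means is available.

```python
def capable_but_not_enabled(capable_nodes, mc_inventory_nodes):
    """Return nodes in the capability-based node group that are missing
    from the MC Nodes inventory, sorted for stable output."""
    return sorted(set(capable_nodes) - set(mc_inventory_nodes))


# Hypothetical hostnames: members of the capability-based node group...
capable = ["core-rtr-1", "core-rtr-2", "edge-rtr-7", "edge-rtr-9"]
# ...and the nodes currently listed in the MC Nodes inventory.
mc_inventory = ["core-rtr-1", "core-rtr-2", "edge-rtr-7"]

candidates = capable_but_not_enabled(capable, mc_inventory)
print(candidates)
```

Each name in the result is a router to review: either Multicast was left disabled on purpose, or it should have been enabled.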
29
PIM neighbor adjacency monitoring
Detect PIM Adjacency Loss
Challenge Encountered:
• User wishes to detect PIM adjacency loss in the IP Multicast network
Analysis:
• PIM neighbor seems invalid
Solution:
• An incident for ‘PIM interface critical’ is generated
• Drill down and find out why the PIM neighbor is invalid
• Is it because the ‘Hosted Node’ is not in the Multicast topology, or because the PIM neighbor is out of service?
• Also, launch the Multicast neighbor view for a particular node and look at the MC interface In/Out utilization
• For the particular interface that connects to the LAN segment, one can also look at the Designated Router and the ifAlias to find out where this link is connected
30
Live Webcast video/stream issues
Video not Reaching Beyond Certain Router
Challenge Encountered:
• Webcast video is not reaching users beyond a certain site/router
Analysis:
• A particular router might have all its PIM interfaces in the Critical/Not in service state, leaving the router unable to handle the Multicast service
Solution:
• An incident for ‘All PIM ifaces in node are not normal’ is generated
• Drill down and find out from the MC router the state of the PIM interfaces
• Look at the ‘Groups’ tab in the MC router detail form and find out the flow it is participating in
• Launch the Forwarding tree for this flow and see the impacted regions behind the router
31
Outcomes
• Deploy the iSPI for Metrics/NPS for IPT and IP Multicast SPIs on a dedicated server
• Use these SPIs to effectively manage your entire network from a single pane of glass
32
Additional Resources
• Visit your one stop for detailed industry information on HP’s Automated Network Management Solution: http://www.hp.com/go/anm
• Visit the new Network Management Center blog for discussion topics and the latest news from the Network Management Center: www.hp.com/go/NNMblog
• Visit the new Network Management Center product pages at www.hp.com/go/nmc
33
Q&A
34
To learn more on this topic, and to connect with your peers after the conference, visit the HP Software Solutions Community: www.hp.com/go/swcommunity
35
36
Additional Use Cases
37
Quickly debug ‘No dial-tone’ issues
Phone Is Dead
Challenge Encountered:
• IP phone dial tone is missing and the phone is dead
Analysis:
• The IP phone seems to have become unregistered from the assigned call controller/manager
Solution:
• An ‘IP Phone unregistered’ incident is generated
• Drill down from the incident to the phone and launch the control path, which shows that the IP phone is unregistered because the connected switch interface is shut down
• Hence, no dial tone
38
H.323 IP Trunk monitoring
Unable to Place a Call Across Sites
Challenge Encountered:
• A user in California (headquarters) cannot call a user in Texas (branch office)
Analysis:
• The Inter-Cluster Trunk (ICT) that facilitates least-cost inter-site calls (via the gatekeeper) has a registration state of unregistered/rejected
Solution:
• An incident is generated for the GK-controlled ICT registration state change
• Drilling down from the incident to the ICT shows that the H.323 endpoints (call managers) have a count of 0 for the controlling gatekeeper (GK) device
• Hence, no remote site connectivity
39
Voice interface monitoring
Cannot Call Anyone Outside of Office?
Challenge Encountered:
• Calls from the office to outside the office are not possible
Analysis:
• The voice gateway that carries the calls to the PSTN/outside has its voice interface operationally down
Solution:
• A gateway circuit-switched interface ‘operationally down’ incident is generated
• Drill down through the interface to the oper state of each underlying channel to see why the interface is down
40
Capacity planning and optimization for Voice gateway usage
Huge Bills from Provider
Challenge Encountered:
• All my calls are routed through the PSTN gateway
• I keep buying new circuits from the provider
Analysis:
• The T1/E1 interfaces are either under-utilized or over-utilized
Solution:
• An incident is generated when the usage state of the circuit-switched interface changes
• Drilling down from the incident takes you to the circuit-switched channels for the circuit-switched interface
• Drill down to the voice gateway device, which shows the oper, reg and usage state of the voice interfaces
• This usage state shows whether the voice interfaces in the network are being used optimally
41
SRST Active/Standby Monitoring
Cannot Call up the Head Office?
Challenge Encountered:
• User wants to know why branch office staff are not able to call the central office
Analysis:
• The SRST router seems to be active due to an IP WAN link failure between the central office and the branch office
Solution:
• An ‘SRST active’ incident is generated
• Drill down from this Critical incident and look at the extensions that are registered to the SRST
42
Top-N call report
Analyze Termination Reasons for the Calls
Challenge Encountered:
• User wants to analyze the top-10 termination reasons for phone calls in the last day
Solution:
• Launch the Cisco CDR Top-N report and generate the reports grouped by Calling party number, Called party number, Cluster ID, Termination reason
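The grouping behind a Top-N report of this kind can be illustrated in Python with `collections.Counter`. The record layout, field names and sample values below are hypothetical and are not the iSPI's CDR schema; the sketch only shows the idea of counting calls per group and keeping the largest groups.

```python
from collections import Counter


def top_n(records, group_by, n=10):
    """Count records per value of the `group_by` field and return the n largest groups."""
    counts = Counter(record[group_by] for record in records)
    return counts.most_common(n)


# Hypothetical CDR rows; a real report would cover a day of records.
cdrs = [
    {"calling": "1001", "called": "2001", "termination_reason": "Normal call clearing"},
    {"calling": "1002", "called": "2002", "termination_reason": "User busy"},
    {"calling": "1001", "called": "2003", "termination_reason": "Normal call clearing"},
    {"calling": "1003", "called": "2001", "termination_reason": "No answer"},
    {"calling": "1001", "called": "2002", "termination_reason": "User busy"},
    {"calling": "1004", "called": "2004", "termination_reason": "Normal call clearing"},
]

top_reasons = top_n(cdrs, "termination_reason", n=2)
top_callers = top_n(cdrs, "calling", n=1)
print(top_reasons)
print(top_callers)
```

Changing `group_by` to the calling or called party field gives the other groupings named on the slide from the same records.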
43
Troubleshoot call carrying issues over a period of time
Analyze Calls Through a Certain IC Trunk
Challenge Encountered:
• User wants to analyze the issues in carrying calls via a certain IC Trunk over the last day
Solution:
• Launch the Cisco CDR Top-N report and generate the reports grouped by Calling party number, Called party number, Cluster ID, Termination reason, Inbound IC Trunk
• For a designated IC Trunk, a topology filter can also be set
44
MOS factor trend
Analyze Voice Quality Trends
Challenge Encountered:
• User wants to analyze the voice quality issues occurring over a period of time
Solution:
• Launch the Cisco CDR Top-N report for MOS (Mean Opinion Score)
45
Heat Chart
Analyze the Overall Heat in the VoIP Network
Challenge Encountered:
• User wants to analyze the number of calls and their duration for a period of one week
Solution:
• Launch the Cisco CDR Heat chart report to see the behavior of the overall VoIP network
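One plausible way to aggregate call counts and durations for a heat-chart style view is to bucket calls by day of week and hour of day, sketched below in Python. The input format (ISO start time plus duration in seconds), the bucket layout and the sample calls are assumptions made for the illustration; the actual report's bucketing may differ.

```python
from collections import defaultdict
from datetime import datetime


def heat_buckets(calls):
    """Aggregate calls into (weekday, hour) buckets.

    `calls` is an iterable of (start_time_iso, duration_seconds) pairs.
    Returns {(weekday, hour): [call_count, total_duration_seconds]},
    where weekday 0 is Monday.
    """
    buckets = defaultdict(lambda: [0, 0])
    for start_iso, duration_s in calls:
        start = datetime.fromisoformat(start_iso)
        bucket = buckets[(start.weekday(), start.hour)]
        bucket[0] += 1
        bucket[1] += duration_s
    return dict(buckets)


# Hypothetical calls: two on a Monday morning, one on a Tuesday afternoon.
calls = [
    ("2010-06-21T09:15:00", 120),
    ("2010-06-21T09:40:00", 60),
    ("2010-06-22T14:05:00", 300),
]
buckets = heat_buckets(calls)
for (weekday, hour), (count, total_s) in sorted(buckets.items()):
    print(weekday, hour, count, total_s)
```

Cells with high counts or long total durations are the "hot" periods a week-long view would highlight.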
46
For duplex deployment of Avaya s8700 pair
Primary Server Redundancy
Challenge Encountered:
• Avaya paired servers require state monitoring
Solution:
• An Avaya paired device state change incident is generated
• Drill down from the incident to the source object detail form and look at the ‘Duplicated server’, which should be ‘Active’
47
Quickly debug dial-tone issues
Phone Is Dead
Challenge Encountered:
• IP phone dial tone is missing and the phone is dead
Analysis:
• The IP phone seems to have become unregistered from the assigned call controller/manager
Solution:
• An ‘IP Phone unregistered’ incident is generated
• Drill down from the incident to the phone and launch the control path, which shows that the IP phone is unregistered because the connected L3-switch interface is not reachable
• One can also look at the CLAN to which the phone is associated and the IP Server Interface that connects the Port Network/G650 gateway to the Avaya CM for further troubleshooting
48
Find the actual path taken by the voice call
Voice Quality Issues Tracking
Challenge Encountered:
• User wishes to investigate voice quality issues between certain phones
Solution:
• Launch the Voice path between the extensions facing the voice quality issues
• For the same voice path, one can launch the NNMi path health report as well
49
Between two sites of different QoS/codec requirements
Voice Quality Degraded
Challenge Encountered:
• User wants to troubleshoot voice quality issues such as calls not reaching certain sites, breaks in the voice, or calls being abruptly disconnected
Analysis:
• The connection between two network regions (QoS-enabled) is Failed/Down
Solution:
• A ‘Network region connection in Failed state’ incident is generated
• Drill down to the network region connection and find out the participating regions in the connection
• Open the ‘Source object’ network region detail form to understand the issues and troubleshoot them, for example over-utilized DSP MedPro resources
50
Track the Media processor resources
Calls That Require Codec Translations Don’t Go Through
Challenge Encountered:
• Certain calls that come through the PSTN cannot be carried to the end users
Analysis:
• Avaya media gateway (Port Network) Media Processor moves to the ‘Unknown’ state
Solution:
• An ‘Avaya MedPro in Unknown state’ incident is generated
• Drill down from the incident to the MedPro detail form to look at the port network it is part of and the network region it belongs to
• The MedPro Control Link and Ethernet Link state, along with the DSP channel usage state, can also be observed here
• The DSP channel usage state can help in capacity planning as well
51
Track the DSP MedPro resources
Calls Carried via Gateway Face Voice Quality Issues
Challenge Encountered:
• Certain calls that are routed via gateways face voice quality degradation
• Voice gets broken in the middle of the call
Analysis:
• Avaya IP MedPro DSP stats are breaching the configured Codec/DSP Usage thresholds
Solution:
• An Avaya IP MedPro DSP Codec stats threshold violation incident is generated
• Drill down from the incident to look at the measurement values for codecs like G.723/G.729 and the DSP Usage for a particular network region
• Useful in capacity planning and optimization of Codec/DSP resources for the particular network region
52
‘Busy tone’ / ‘Recorder tone’ / ‘network-not-available tone’
Certain Calls Not Getting Processed
Challenge Encountered:
• Certain calls that are processed through a particular CM are always getting ‘busy tone’
Analysis:
• Avaya media gateway (Port Network) IP Server Interface is ‘Out-of-service’
Solution:
• An Avaya Media Gateway ‘IPSI Service state is Out’ incident is generated
• Drill down from the incident and browse to the IPSI detail form to find the Port Network/G650 on which the service is running
• Is it IPSI A (primary) or IPSI B (secondary)?
• Which call controller does it connect the PN/G650 to?
• What is the ‘State of health’ for this IPSI?
53
Route pattern queue usage monitoring
Call Toll Free: 18001208080…Can’t Reach?
Challenge Encountered:
• Calling the 1800 toll free numbers does not result in a call
Analysis:
• The route pattern that carries this toll free call is facing queue overflow, and hence the call is getting dropped
Solution:
• An incident for ‘Queue overflow for the route pattern’ is generated
• Drill down to the Route Pattern from the incident and look at the threshold violations and Queue stats
• Look at the Trunk groups for the Route pattern and their usage details
• For a particular Trunk Group, look at the member service state and the overall Trunk group ATB% details
• This helps in knowing whether the Trunk Groups are under/over utilized
54
Signaling group service state monitoring
Calling the Branch Office…Can’t Reach ?
Challenge Encountered:
• Calls to the branch office do not go through
Analysis:
• The FAS Signaling group goes out-of-service, so no calls succeed
Solution:
• A Signaling group ‘out-of-service’ incident is generated
• Drill down to the Signaling Group detail form, see if it is FAS (Facility Associated Signaling), and also check the associated Trunk group member state
• These values help in allocating more signaling resources if required
55
H.248 media gateways registration monitoring
Cannot Call Anyone Outside of Office?
Challenge Encountered:
• Calls from the office to outside the office are not possible
Analysis:
• H.248 Media Gateway unregisters from the call server
• Active faults in VoIP engine
Solution:
• A ‘Gateway Unregistered’ incident is generated
• Drill down to the Gateway detail form and look at the H.248 Link State and Registration faults, if any
• Also, look at the VoIP engines for the H.248 MGw and look out for the ‘fault’ state
• This enables the user to know the reason why such calls are not carried
56
Capacity planning and optimization for gateway resources
Huge Bills from the Provider
Challenge Encountered:
• My VoIP/Media module resources cannot keep up with the usage
Analysis:
• VoIP engine Hyperactivity/5-min avg occupancy going high
• Faults are active
Solution:
• A VoIP Engine ‘Fault Active’ incident is generated
• Drill down to the VoIP engine detail form to look at the Hyperactivity and 5-min avg occupancy stats and the fault values
• Look at the corresponding DSP cores to see their usage as well
• This helps in optimization of available gateway resources and capacity planning
57
Local survivability tracking
Cannot Call up the Head Office?
Challenge Encountered:
• User wants to know why branch office staff are not able to call the central office
Analysis:
• The LSP server seems to be active due to an IP WAN link failure between the central office and the branch office
Solution:
• An ‘LSP active’ incident is generated
• Drill down from this Critical incident and look at the extensions that are registered to the LSP
• This shows the impact of the failure
58
Top-N call report
Analyze Termination Reasons for the Calls
Challenge Encountered:
• User wants to analyze the Top-10 termination/failure reasons for phone calls in the last day
Solution:
• Launch the Avaya CDR Top-N report and generate the reports grouped by Calling party number, Called party number, Call manager IP, Termination reason
59
Trend analysis of call carrying issues
Analyze Calls Through Certain Trunk Groups
Challenge Encountered:
• User wants to analyze the issues in carrying calls via a certain Trunk group over the last day
Solution:
• Launch the Avaya CDR Top-N report and generate the reports grouped by Calling party number, Called party number, Call manager IP, Termination reason, Outbound Trunk Group name
• For a designated Trunk group, a topology filter can also be set
60
Heat chart report
Analyze the Overall Heat in the VoIP Network
Challenge Encountered:
• User wants to analyze the number of calls and their duration for a period of one week
Solution:
• Launch the Avaya CDR Heat chart report to see the behavior of the overall VoIP network
61
Chart details report
Analyze Call Patterns for a Month
Challenge Encountered:
• User wants to analyze the voice call patterns over a month’s period
Solution:
• Launch the Avaya CDR Chart report
62
RTT threshold monitoring
Slow Network Connectivity Between Two Sites
Challenge Encountered:
• Network traffic is too slow between two sites
Analysis:
• Network latency (for UDP services) seems very high between the two sites
Solution:
• An incident for the ‘High Round Trip Time’ threshold violation is generated
• Drill down from this incident to the ‘QA Probe’, which shows the exact measurement value violating the configured threshold
• Look at the Threshold state tab for threshold configuration details
• Look at the jitter configuration for VoIP and UDP-Jitter tests
• Overall, probe status changes to Major
63
MOS threshold monitoring
Find Voice Quality Issues Between Sites
Challenge Encountered:
• User wants to know the reason for voice quality issues between certain sites
Analysis:
• The MOS (Mean Opinion Score) value across these sites is lower than expected
Solution:
• The ‘Low MOS’ incident shows the measurement time
• Drill down from the incident to the Source object probe, where the threshold state indicates the state change
• Corresponding probe status is ‘Major’
• One can also look at the codecs used for voice to troubleshoot further
64
ICMP Echo tests
Detect Network Reachability Issues
Challenge Encountered:
• Network reachability is too slow/broken across the sites
Analysis:
• Latency reported in the ICMP Echo probe/test measurement is quite high
Solution:
• The ‘High RTT’ incident shows the measurement time and value for the RTT threshold violation
• Drill down from the incident to the Source object probe, where the threshold state indicates the state change
• Corresponding probe status is ‘Major’
65
Like SQL, Exchange and Web services etc
Degradation in Performance of Services
Challenge Encountered:
• Degradation in performance while connecting to SQL servers or the Exchange Server
Analysis:
• High latency reported in the TCP-connect test/probe across the sites
Solution:
• The ‘High RTT’ incident shows the measurement time and value for the RTT threshold violation
• Drill down from the incident to the Source object probe, where the threshold state indicates the state change
• Corresponding probe status is ‘Major’
66
IP SLA test state monitoring
QA probe/IP SLA Test Run Failed
Challenge Encountered:
• User wants to know whether a particular test run failed, and why
Analysis:
• A probe/test can fail if the oper state of the test is set to ‘NotConnected’, ‘Disabled’ or ‘TimeOut’
Solution:
• An incident for ‘Test failed’ or ‘Test returned error’ is generated with the reason for it
• Overall test/probe status goes to ‘Critical’
67
Chart details for threshold exceptions
Analyze a Test-specific Trend
Challenge Encountered:
• User wants to know the hourly trend for RTT/Jitter for a particular test/probe
Solution:
• Select a test/probe from the inventory
• Look at the Chart reports for the last hour to see ‘RTT’ vs ‘One way +ve Jitter’
68
Top-N report per metric
Analyze a Trend in Jitter/PacketLoss/latency
Challenge Encountered:
• User wants to know the trend of overall latency/jitter/packet loss over the last 2 days
Solution:
• Launch the report for Top-N probes for a particular RTT/Jitter/PL metric for the last day
69
MOS-based chart reports
Voice Quality Issues Trend
Challenge Encountered:
• User wants to know the trend of overall voice quality issues over the last 2 days
Solution:
• Launch the report for Top-N probes for a particular MOS metric for the last day
70
Heat chart report
Overall Heat of Network Availability
Challenge Encountered:
• User wants to know the trend of overall network reachability issues
Solution:
• Launch the Heat chart for RTT/Jitter/PL exceptions
71
Tree launch for a particular stream / flow
Geographies from Which Subscribers Are Registered
Challenge Encountered:
• User wishes to see the sites where subscribers are registered for the stream
Solution:
• Select a particular flow/stream
• Launch the IP MC forwarding tree for that stream
72
Webcast/IPTV channel video quality monitoring
Video Quality Degraded in a Specific Region
Challenge Encountered:
• During the live Webcast, subscribers at a certain site are seeing degraded video quality
Analysis:
• Multicast stream/flow rate seems degraded while going to a certain site
Solution:
• Find out from the ‘Flow inventory’ which stream passes through the site that has problems
• Launch the ‘Forwarding tree’ for this stream/flow
• Look at the ‘Red’ lines indicating the ‘flow rate drop’ of the video streams
• Mouse over the header message to see detailed information about the rate drop
73
Troubleshoot and fix
User in Sydney Having Issues in Voice/Video
Challenge Encountered:
• For a particular Webcast/stream, a Sydney region subscriber is facing video quality issues
Analysis:
• The edge router multicasting the video from the source seems to be receiving the flow/stream at a lower ingress rate than expected
Solution:
• From the flow inventory, launch the Reverse path for the stream/flow on which the Sydney user is facing issues
• Look at the ingress flow/stream rate for the edge router that multicasts the stream to the Sydney region
• Look at this particular flow/stream’s Top-10 report to know which node is showing a drop in the flow rate in the last hour
74
Top-N interface report
Analyze IP Multicast Interface Bandwidth Utilization
Challenge Encountered:
• User wishes to analyze the Multicast traffic utilization across the network for a period of time
Solution:
• Launch the ‘Top-N’ report for IP Multicast interfaces
• Look at the ‘Utilization In (avg)’ metrics to know the avg bandwidth utilization for Ingress traffic at different PIM interfaces in the network
• Helps in capacity planning of PIM interfaces
• Also, the ifAlias gives the user a quick understanding of the link
75
Top-N flow report
Analyze the Video Quality Issues Trend
Challenge Encountered:
• User wants to analyze the trend of video quality issues in the network
Solution:
• Launch the ‘Top-N’ IP Multicast flow reports and see for which flows data drops/bursts are observed
• Within the problematic flow, launch the Top-N report for nodes on which the drop has been observed in Ingress traffic
76
Heat chart report
Analyze Overall Heat of the Multicast Network
Challenge Encountered:
• User wishes to look at the overall Multicast network health
Solution:
• Launch the IP Multicast flow ‘Heat chart’ report to look at overall network health for a period of time