Application Monitoring Best Practices

o Scripts

o Script objectives – Monitoring scripts need not be complex to achieve their goals. When a script

becomes too complex, the risk of script failures is greater. An effective script primarily needs to

answer the following questions.

o What is the performance of the application?

o What is the availability of the application?

A successful monitoring script must also be tuned to minimize alerts that are not actual
failures, so that the alerts sent to recipients are useful.

o Transactions – A transaction should test the primary functionality within the application.
Transactions should measure performance time and availability by validating an expected
checkpoint. Transaction checkpoints should always use visual analysis, not native objects.

o Timing transactions - It is important that a transaction is placed in its own group, directly
after the click that initiates the measured action. If an additional step occurs between the
click and the transaction checkpoint, the timers will not be accurate and may show a value of
zero. Think of the transaction as a stopwatch: the checkpoint stops the watch when it succeeds.
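The stopwatch idea can be sketched in plain Python. This is a minimal illustration, not Perfecto's implementation: `time_transaction`, `click`, and `checkpoint` are hypothetical names, and the checkpoint is assumed to be any callable that returns True once the expected content is visible.

```python
import time
from typing import Callable


def time_transaction(click: Callable[[], None],
                     checkpoint: Callable[[], bool],
                     timeout_s: float = 30.0,
                     poll_s: float = 0.01) -> float:
    """Return the seconds between the click and the first successful checkpoint."""
    click()                          # the action being measured
    start = time.monotonic()         # stopwatch starts directly after the click
    while time.monotonic() - start < timeout_s:
        if checkpoint():             # a successful checkpoint stops the watch
            return time.monotonic() - start
        time.sleep(poll_s)           # re-check the screen instead of waiting blindly
    raise TimeoutError("checkpoint did not succeed within the timeout")
```

Nothing runs between `click()` and the polling loop, which mirrors the rule that no additional step should sit between the click and the transaction checkpoint.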

Part 1: Basic Monitoring Tenets

o Proper Script Flow

Script Stability Achieved by Checkpoints – Whether or not a checkpoint is used as a

transaction, it is crucial to perform text checkpoints throughout the script to be sure

the actions are unfolding as they should. Anytime that the screen changes, a

checkpoint should be used to be sure the device is ready for the next action.

Timeouts can be set on these checkpoints to values appropriate to the specific

action. Wait commands should not be used for this purpose as a wait command

does not check the status of the screen.

Unique Text Checkpoints - Checkpoints should consist of new text so they can prove that the
screen has changed. In other words, the script should sync on text that is not shown on the
previous screen.

Visual vs Native Objects - Checkpoints should always be completed using visual

analysis, not native objects. The idea of a checkpoint is to assess what is visible on a

screen. Native objects are not always visible and the presence of a native object

does not guarantee that the page has correctly rendered. Non-checkpoint actions,

such as page navigation, should employ native objects when possible as this

improves script stability.
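The unique-text rule can be illustrated with a small helper that lists which words on the new screen did not appear on the previous one. This is only a sketch: `unique_checkpoint_candidates` is a hypothetical name, and the screen contents are assumed to be available as plain strings.

```python
from typing import List


def unique_checkpoint_candidates(previous_screen: str, next_screen: str) -> List[str]:
    """Return words visible on the next screen that were absent from the previous one."""
    seen_before = {word.lower() for word in previous_screen.split()}
    candidates: List[str] = []
    for word in next_screen.split():
        # Only text that is new can prove that the screen actually changed.
        if word.lower() not in seen_before and word not in candidates:
            candidates.append(word)
    return candidates
```

Any word returned by the helper is a reasonable checkpoint candidate; a word that appears on both screens is not, because finding it proves nothing about the transition.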

o Script reusability across devices

User Functions – User functions allow the scripter to write many subscripts that are

compatible with various devices. This allows the user to simply call a user function

from a parent script. Ultimately, this approach allows a single script to be used for

many devices which means that maintenance is easier and scripts look cleaner.

Text Checkpoints are Common – Script reusability works best when leveraging text checkpoints,
which tend to be common across devices and operating systems.
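The user-function approach resembles a dispatch table: the parent script calls one name, and a device-specific subscript does the work. The sketch below is an analogy in Python, not Perfecto syntax; the function names and the navigation steps inside each subscript are invented for illustration.

```python
from typing import Callable, Dict, List


def navigate_to_social_ios() -> List[str]:
    # Hypothetical steps for one device family.
    return ["tap 'More'", "tap 'Social'"]


def navigate_to_social_android() -> List[str]:
    # Hypothetical steps for another device family.
    return ["open navigation drawer", "tap 'Social'"]


# One entry per supported device family; adding a device means adding a subscript here.
USER_FUNCTIONS: Dict[str, Callable[[], List[str]]] = {
    "iOS": navigate_to_social_ios,
    "Android": navigate_to_social_android,
}


def navigate_to_social(device_os: str) -> List[str]:
    """Parent-script entry point: pick the subscript that matches the device."""
    if device_os not in USER_FUNCTIONS:
        raise ValueError(f"no subscript registered for {device_os!r}")
    return USER_FUNCTIONS[device_os]()
```

The parent script stays identical across devices; only the table grows, which is where the maintenance savings come from.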

o Devices

Redundant Pairs – A device that runs 24 hours a day, constantly performing application
transactions, sending and receiving communications, and playing video, is likely to accumulate
an unwieldy cache and consume memory. This can impact the device's readiness to execute the
monitoring script and lead to failures.

A script can also be interrupted by an incoming SMS, an OS update notification, an AMBER alert,
or a similar event on a specific device.

A proven way to handle these scenarios and reduce false alerts is to provide device

redundancy. Perfecto monitoring uses pairs of devices, set up identically, and a

script that automatically tries again upon failure.

Device pairs are defined using the Description field. The monitoring tool will call the

device using that field and the monitoring system script will run the script on either

one or both of the devices as needed.
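The retry-on-the-redundant-device behavior can be sketched as follows. This is a simplified model under stated assumptions: `run_on_pair` is a hypothetical name, the script is any callable that returns True on success, and the device identifiers are placeholders rather than real Description values.

```python
from typing import Callable, List, Sequence, Tuple


def run_on_pair(script: Callable[[str], bool],
                pair: Sequence[str]) -> Tuple[bool, List[str]]:
    """Run the script on the primary device; try the redundant one only on failure."""
    tried: List[str] = []
    for device in pair:
        tried.append(device)
        if script(device):
            return True, tried       # success: the second device is never touched
    return False, tried              # both devices failed: this is worth an alert
```

An alert is raised only when the returned flag is False, so a single interrupted run on one device does not reach the recipients.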

Self-healing monitoring environment - Completing regular power cycles on devices

will help devices to be more stable. It is a good practice to have a script that runs

once a day to restart devices and their cradles. Many application failures will be

related to a device that needs to be refreshed rather than an application that is

malfunctioning. Therefore it is important to dynamically reboot the device when an

application fails to load. Perfecto allows us to create subscripts. A maintenance

subscript that automatically restarts the device and its cradle can be added to a

monitoring script.

A conditional statement will tell the script when the device has failed and

automatically run the maintenance script.

A maintenance script should be set to ‘Async’ so that the monitoring script will continue while
the maintenance script runs.

Immediately after the maintenance script is called, the monitoring script should exit with the
status ‘error’. This way, the device will already have been restarted the next time a script
calls it.
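The conditional, the asynchronous maintenance call, and the immediate exit can be sketched together. This is an analogy only: a Python thread stands in for the ‘Async’ setting, and `monitor_with_self_healing` and its three callables are hypothetical names.

```python
import threading
from typing import Callable, Optional, Tuple


def monitor_with_self_healing(
    app_started: Callable[[], bool],
    maintenance: Callable[[], None],
    run_transactions: Callable[[], None],
) -> Tuple[str, Optional[threading.Thread]]:
    """Return the exit status and, on failure, the background maintenance thread."""
    if not app_started():
        # Conditional statement: the start checkpoint failed, so heal the device.
        worker = threading.Thread(target=maintenance)
        worker.start()               # 'Async': do not wait for the restart to finish
        return "error", worker       # exit right away with status 'error'
    run_transactions()
    return "success", None
```

Because the function returns before the maintenance work completes, the monitoring run ends quickly while the restart continues in the background.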

Handling Common Errors – It is a good practice to include subscripts that handle common device,
application, or OS-related issues. Good examples are a subscript that dismisses popups that can
cause scripts to fail, and one that turns off Wi-Fi when the test is meant for wireless data
testing.

o Scheduling

When monitoring an application, it is best to run that application as many times as

possible thus giving the script more chances to detect a failure. However, since we

know that overstressing devices may cause false alarms, we want to allow the script

enough time to run on two devices.

Ideally, a script should run in less than four minutes. Some scripts can take up to five or even
six minutes to run, depending on their complexity.

HP BSM and the VuGen templates that Perfecto uses allow us to run scripts

concurrently. This allows us greater scheduling flexibility, allowing for “rest time” on

the devices before the next script runs. There is also less of a risk that a device will

still be in use when the next script is scheduled to run.

AlertSite does not currently support running devices concurrently so the following

type of schedule is recommended.

In the case of AlertSite, where scripts need to be run one after the other, it is best to

allow 15 minutes per scheduled run. A good monitoring framework should always

employ a well-maintained script device schedule.

Step                                                                           Time allowed
Run script on primary device                                                   Up to 6 minutes
Run script on redundant device and run maintenance script on the first device  Up to 6 minutes
Allow time for monitoring tool overhead                                        Up to 3 minutes
Total                                                                          15 minutes
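The arithmetic behind the 15-minute slot, and what it implies for a sequential schedule, can be checked with a few lines of Python. The dictionary keys and function names are illustrative; only the minute values come from the schedule above.

```python
from typing import Dict, List

SLOT_MINUTES: Dict[str, int] = {
    "primary device run": 6,
    "redundant device run and maintenance on first device": 6,
    "monitoring tool overhead": 3,
}


def slot_length(slots: Dict[str, int]) -> int:
    """Total minutes to reserve for one scheduled run."""
    return sum(slots.values())


def start_offsets(script_count: int, slot_minutes: int) -> List[int]:
    """Minute offsets at which back-to-back scripts can start without overlapping."""
    return [index * slot_minutes for index in range(script_count)]


slot = slot_length(SLOT_MINUTES)      # 6 + 6 + 3 = 15 minutes
slots_per_day = (24 * 60) // slot     # 1440 // 15 = 96 slots in a day
```

For example, three scripts sharing one device pair would start at minutes 0, 15, and 30, and the pair has 96 such slots available per day.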

2. Types of Errors

Device – Device related errors occur because of a problem with the device itself.

Perhaps the device has crashed and needs a power cycle and recovery. Perhaps

there is a persistent popup that needs to be cleared outside of the script. These

errors need to be triaged and corrected as quickly as possible. Most can be

corrected through the cloud interface, but some require hands-on attention at the

data center.

Monitoring – These are errors related to the monitoring software or scheduling and

also should be remedied as quickly as possible. If a device cannot be called by the

script because it is already in use, the scheduling should be checked. If there is a

technical issue with the monitoring software, a case should be opened immediately

with Perfecto Support.

Real Errors – The goal of monitoring is finding issues related to a failure of the

service being tested. These should be tracked and reported immediately.
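The three error types and their follow-up actions can be captured in a small lookup. This is a sketch: the `classify` rules are an assumed triage order for illustration, not a documented procedure, and the two boolean inputs are hypothetical signals.

```python
from enum import Enum


class ErrorType(Enum):
    DEVICE = "device"
    MONITORING = "monitoring"
    REAL = "real"


# Follow-up actions paraphrased from the descriptions above.
NEXT_STEP = {
    ErrorType.DEVICE: "triage and correct via the cloud interface or at the data center",
    ErrorType.MONITORING: "check scheduling, or open a case with Perfecto Support",
    ErrorType.REAL: "track and report immediately",
}


def classify(device_available: bool, device_healthy: bool) -> ErrorType:
    """Assumed triage order: rule out monitoring and device causes first."""
    if not device_available:         # e.g. the device was already in use
        return ErrorType.MONITORING
    if not device_healthy:           # e.g. crashed device or persistent popup
        return ErrorType.DEVICE
    return ErrorType.REAL            # what remains is a failure of the service itself
```

Ruling out the first two categories before reporting keeps device and scheduling problems from being escalated as application outages.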

This tutorial uses a script that tests the Verizon Wireless Indycar application. The script
defines three transactions:

o Check Indycar Started

o Social Open

o Tweets Loaded

The Script:

1. Prepare device

a. Open Device – This function is found in the devices category. It can be set for any device
assigned to a device variable. Open Device allocates the device to the user.

b. Home – The Home command ensures that the device is on the home screen.

2. Prepare applications

a. Close Application – Make certain that the application is closed at the beginning of the script

run to ensure consistent results when the application is launched.

b. Launch Application - Use the native Start Application function. In order for some
applications to function as expected, it is best to first completely close the application on
the device. In the Start Application function parameters, enable the timeout and set it to 0.
This will force the script to move directly to the checkpoint and more accurately measure the
time to open the application.

Part 2: Writing a Monitoring Script

3. Transaction 1 – Check Indycar Started

Groups can be defined as transactions in the Parameters tab,

accessed by double-clicking the group header.

a. Text Checkpoint – Use a visual text checkpoint to ensure that the application has properly
opened. Be sure that the text you search for is not displayed on the device home screen, as
this may result in a false positive.

b. Similarly, do not choose text that appears on the application splash screen

since apps can get “stuck” on a splash screen. The application name is not a

good text checkpoint for these reasons.

c. Maintenance Script on Failure – The Start application checkpoint should

always be followed by a conditional statement. If the checkpoint fails, the

script should move directly to a subscript that reboots and recovers the

device.

d. This maintenance script should be run in the “Async” mode so that the script will proceed to
the next step while the device restarts. That next step should be an exit function with status
set to Error. By following this protocol, the device which failed to start the application will
be in a better state when the next script is executed.

4. Perform test actions

a. Whatever actions are needed to complete the measured transaction should happen at this time.
These script actions should be completed using native objects via XPath when possible. Actions
using XPath perform far better than actions using visual objects.

b. In the case of the Indycar-Social script, the test actions consist of navigating to the
“Social” page so it is possible to test the appearance of page elements. For this, we employ a
user function, NavigateToSocial. The steps within the subscript are listed below.

c. Now that we have accessed the Social page in the Indycar app, we can

validate 2 checkpoints.

5. Transaction 2 – Social Open, and Transaction 3 – Tweets Loaded

a. As mentioned earlier, checkpoints for transactions within a monitoring script should be done
via visual analysis of elements that can be seen on the screen. The following two checkpoints
provide two validation points: the first verifies that we have opened the Social page, and the
second verifies that the tweets are loading.

b. The only element on the page that appears with or without a data connection is the Twitter
icon.

c. Once we have seen that, we can look for the word “Retweets”, which only shows if a tweet has
been loaded on the page. It also shows regardless of the content of that tweet.

d. Grid label – In the checkpoint configuration, add a grid label with a meaningful

name so that your script report will display errors on the grid tab. This makes

error triage fast and easy.

6. End

a. Close apps – Again, use the Close application function when possible.

b. Home

c. Close device - Always close the device at the end of each script to avoid device

allocation errors.
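The six steps above can be summarized as a runnable skeleton. This is a simulation, not a Perfecto script: `FakeDevice` is a stand-in that only records commands, the command strings are labels rather than real function names, and the startup checkpoint text "Schedule" is a placeholder because the tutorial does not name the text used.

```python
from typing import Dict, List, Set


class FakeDevice:
    """Stand-in for a real device: records commands and tracks what is visible."""

    def __init__(self, after_launch: Set[str], on_social: Set[str]) -> None:
        self.log: List[str] = []
        self.visible: Set[str] = set()
        self._after_launch = after_launch
        self._on_social = on_social

    def do(self, command: str) -> None:
        self.log.append(command)
        if command == "start application":
            self.visible = self._after_launch
        elif command == "navigate to social":
            self.visible = self._on_social


def run_monitoring_script(device: FakeDevice,
                          start_text: str = "Schedule") -> Dict[str, bool]:
    """Walk through steps 1 to 6 and return one pass/fail result per transaction."""
    results: Dict[str, bool] = {}
    # Steps 1 and 2: prepare the device, then close and relaunch the application.
    for command in ("open device", "home", "close application", "start application"):
        device.do(command)
    # Step 3: Transaction 1, a visual text checkpoint on the opened application.
    results["Check Indycar Started"] = start_text in device.visible
    if not results["Check Indycar Started"]:
        device.do("maintenance subscript (async)")
        return results               # exit with status Error; later steps are skipped
    # Step 4: test actions (the NavigateToSocial user function).
    device.do("navigate to social")
    # Step 5: Transactions 2 and 3, checked against what is visible on screen.
    results["Social Open"] = "twitter icon" in device.visible
    results["Tweets Loaded"] = results["Social Open"] and "Retweets" in device.visible
    # Step 6: close the application, go home, and release the device.
    for command in ("close application", "home", "close device"):
        device.do(command)
    return results
```

With a device whose Social page shows only the Twitter icon, the sketch reports Social Open as passed and Tweets Loaded as failed, which is the distinction the two checkpoints are designed to make.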
