Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Jim Prins's presentation on Passive Monitoring with Nagios. The presentation was given during the Nagios World Conference North America held Oct 13th - Oct 16th, 2014 in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/conference

Passive Monitoring with Nagios

Jim Prins

jprins1229@gmail.com

Introduction

• Sr. Manager – Web Technologies, Harman International
• Web Application & Server Monitoring
    • 180 Hosts
    • 1100+ Services
    • Goal: All Green Lights

Agenda – pt. 1

• Active vs. Passive Checks
• Enabling Passive Checking in Nagios
    • Enabling on Nagios Core & Nagios XI
    • Configuring NRDP Server & Client
• Customizing Passive Checks
    • Volatility
    • State Stalking
    • Freshness Checking

Agenda – pt. 2

• Example 1 – Airline Call Button
• Example 2 – Backup Monitoring
• Other Passive Examples
• Summary
• Questions/Answers

Active vs. Passive Checks

• Active Checks
    • Active from the perspective of the Nagios application
    • Request initiated by the server
    • Server authenticated by the client
    • Client decides whether to respond
• Passive Checks
    • Passive from the perspective of the Nagios application
    • Request initiated by the client
    • Client authenticated by the server
    • Server decides whether to accept the message

Use cases

Good reasons for passive checks

• Detect and respond each time an event happens
    • Passive Check w/ Volatility & State Stalking
• Detect and respond when something has stopped happening
    • Passive Check w/ Freshness

Enabling Passive Checks

Passive host and service checks are enabled in Nagios Core via the main configuration file (nagios.cfg).

Default location: /usr/local/nagios/etc/nagios.cfg

    accept_passive_host_checks=1
    accept_passive_service_checks=1

NRDP – Server Side

• Nagios Remote Data Processor (NRDP)
• Server: Usually runs on the Nagios server at http://<ip_address>/nrdp
• Tokens and other server-side configuration are maintained at /usr/local/nrdp/server/config.inc.php
• $cfg['authorized_tokens'] = array("0vn53mbj3lk4", "0vn53mbj3lk6");

Installation Guide and Overview: http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NRDP – Client Side

Client: Installed into /usr/local/nrdp

Sample Script, i.e. backup_complete.sh:

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=backup-server
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    # Quote every value: the service name and output contain spaces.
    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

State   Meaning
0       OK (GREEN)
1       WARNING (YELLOW)
2       CRITICAL (RED)
3       UNKNOWN (GREY)
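
The state table above can be expressed as a small lookup, so that a script derives the label in its output text from the numeric state instead of typing both by hand. This is a minimal sketch; `state_label` is a hypothetical helper name and is not part of NRDP or Nagios.

```shell
#!/bin/bash
# Map a Nagios state code (0-3) to the label used in the state table.
# state_label is a hypothetical helper, not an NRDP or Nagios command.
state_label() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    3) echo "UNKNOWN" ;;
    *) echo "invalid state: $1" >&2; return 1 ;;
  esac
}

# Build the output string from the state so the two cannot disagree.
state=0
output="$(state_label "$state") - Backup Completed Successfully"
echo "$output"
```

With `state=0` this prints `OK - Backup Completed Successfully`; any code outside 0 to 3 is rejected with a non-zero return value.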

Volatility – pt. 1

• For non-volatile services, the error state is maintained during each subsequent check until the symptom is resolved and the check returns OK
    • 10:00 Storage: C Drive 98% Full
    • 10:05 Storage: C Drive 98% Full
    • 10:10 Storage: C Drive 98% Full
    • 10:15 Storage: C Drive 60% Full
• A service is volatile if every alert indicates a unique issue and warrants a response, event, or notification
    • 10:12 Security: Heartbleed vulnerability scan from 759613212
    • 10:18 Security: Port scan detected from 1959613212

Note: A volatile service generally has no “good news” response

Volatility – pt. 2

• Volatility is enabled by setting is_volatile 1 in the service definition
• Enabling volatility causes the following to happen in response to EACH non-OK alert:
    • Event handler is executed (if defined)
    • Alerts are sent, if appropriate
• Note: For volatile services, notification intervals are ignored

State Stalking – pt. 1

• By default, Nagios will log the output of a service check whenever the service’s STATUS changes

Time    Status    Message             Logged
09:55   OK        Disk C: 79% Full    Not Logged
10:00   WARNING   Disk C: 80% Full    Logged
10:05   WARNING   Disk C: 80% Full    Not Logged
10:10   OK        Disk C: 65% Full    Logged
10:15   OK        Disk C: 66% Full    Not Logged

State Stalking – pt. 2

• With state stalking enabled, Nagios will log the output of a service check whenever the service’s OUTPUT changes

Time    Status    Message             Non-Volatile   Volatile
09:55   OK        Disk C: 79% Full    Not Logged     Logged
10:00   WARNING   Disk C: 80% Full    Logged         Logged
10:05   WARNING   Disk C: 80% Full    Not Logged     Not Logged
10:10   OK        Disk C: 65% Full    Logged         Logged
10:15   OK        Disk C: 66% Full    Not Logged     Logged
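
The logging rule in the two tables can be sketched as a single decision function. This is a simplified model written for illustration, not Nagios code: `should_log` and its arguments are hypothetical, and it treats stalking as one on/off switch, whereas the real `stalking_options` directive selects which states are stalked.

```shell
#!/bin/bash
# Decide whether a check result would be written to the log.
# Args: previous status, previous output, new status, new output, stalking (0 or 1).
# should_log is a hypothetical helper modelling the rule, not part of Nagios.
should_log() {
  local prev_status=$1 prev_output=$2 status=$3 output=$4 stalking=$5
  if [ "$status" != "$prev_status" ]; then
    echo "Logged"        # a status change is logged with or without stalking
  elif [ "$stalking" = "1" ] && [ "$output" != "$prev_output" ]; then
    echo "Logged"        # stalking: same status, but the output text changed
  else
    echo "Not Logged"
  fi
}

# Replay the last three rows of the table with stalking enabled,
# starting from the 10:00 WARNING result.
prev_status="WARNING"
prev_output="Disk C: 80% Full"
while IFS='|' read -r time status output; do
  echo "$time $status $(should_log "$prev_status" "$prev_output" "$status" "$output" 1)"
  prev_status=$status
  prev_output=$output
done <<'EOF'
10:05|WARNING|Disk C: 80% Full
10:10|OK|Disk C: 65% Full
10:15|OK|Disk C: 66% Full
EOF
```

Passing `0` as the last argument reproduces the default behaviour, where the 10:15 result is not logged because only the output changed.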

State Stalking – pt. 3

• Useful when monitoring volatile services, as each unique event is useful to record
    • 10:00:02 – CRITICAL Port Scan from 751001231
    • 10:00:12 – CRITICAL Port Scan from 751001231
    • 10:04:03 – CRITICAL Heartbleed Vulnerability Scan from 751001231
    • 13:12:41 – CRITICAL SQL Injection Attempt on index.php from 881241

Enabled by setting the stalking_options directive for a host or service. See: http://nagios.sourceforge.net/docs/3_0/stalking.html

Freshness – pt. 1

• Monitoring passive checks for “freshness” is a great way to determine when something has STOPPED happening
    • Ex: Backup hasn’t checked in for the past 24 hours (or 86400 seconds)

    check_freshness         1
    freshness_threshold     86400
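
Because the freshness threshold is given in seconds, it helps to derive it from hours rather than type it by hand. The helper below is a minimal sketch; `hours_to_seconds` is a hypothetical name, not a Nagios command.

```shell
#!/bin/bash
# Convert a whole number of hours to seconds for freshness_threshold.
# hours_to_seconds is a hypothetical helper, not part of Nagios.
hours_to_seconds() {
  echo $(( $1 * 60 * 60 ))
}

echo "freshness_threshold $(hours_to_seconds 24)"   # prints: freshness_threshold 86400
echo "freshness_threshold $(hours_to_seconds 26)"   # prints: freshness_threshold 93600
```

The 26-hour value is the one used for the backup example later in the talk; it leaves two hours of slack for a job that runs once a day.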

Freshness – pt. 2

When the freshness threshold (in seconds) is exceeded, the check_command will be executed.

    check_command           stale_critical
    check_period            24x7

Note: A service is checked for freshness only during the check period above.

Command Definition (within /usr/local/nagios/etc/commands.cfg):

    define command {
        command_name    stale_critical
        command_line    $USER1$/check_dummy 2 "Passive service has not checked in"
    }

Ex 1 – Airline Call Button

Requirement

Define call button status as a service with the ability to toggle on and off using passive checks.

Solution

• Call button ON should cause status WARNING
• Call button OFF should cause status OK

Ex 1 – Airline Call Button

Step 1: Define Passive Service Check

    define service {
        host_name               airplane1.carrier.com
        service_description     Call Button - 1A
        is_volatile             1
        active_checks_enabled   0
        passive_checks_enabled  1
        …other options…
    }

Ex 1 – Airline Call Button

Step 2: Configure Script For Call Button Pressed

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=airplane1.carrier.com
    service="Call Button - 1A"
    state=1
    output="WARNING - Call Button Pressed"

    # Quote every value: the service name and output contain spaces.
    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

Ex 1 – Airline Call Button

Step 3: Configure Script For Call Answered

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=airplane1.carrier.com
    service="Call Button - 1A"
    state=0
    output="OK - Call Answered"

    # Quote every value: the service name and output contain spaces.
    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

Ex 2 – Backup Monitoring

Requirement

– DB backup should complete successfully at least 1 time per day. Let someone know if it doesn’t.

Solution

– Send a passive acknowledgement upon successful backup completion
– Use freshness to alert us any time the service has not checked in within 26 hours

Ex 2 – Backup Monitoring

Step 1: Define Passive Service Check

    define service {
        host_name               backup-server
        service_description     Oracle DB Backup
        active_checks_enabled   0
        passive_checks_enabled  1
        check_freshness         1
        freshness_threshold     93600
        check_command           no-backup-report
        …other options…
    }

Ex 2 – Backup Monitoring

Step 2: Define Check Command

File: /usr/local/nagios/etc/commands.cfg

    define command {
        command_name    no-backup-report
        command_line    /usr/local/nagios/libexec/check_dummy 2 "Results of backup job were not reported"
    }

Note: check_dummy does nothing but exit 2 (critical) and display the message in “quotes”

Ex 2 – Backup Monitoring

Step 3: Configure Client to Send Acknowledgement

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=backup-server
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    # Quote every value: the service name and output contain spaces.
    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"
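
One way to tie the acknowledgement to the backup itself is a wrapper that sends the passive result only when the backup command exits successfully, so a failed job leaves the service silent and the freshness check raises the alert. This is a sketch under assumptions: `run_and_report` and the `SEND` variable are hypothetical names, and the backup command passed to it is whatever job the site actually runs.

```shell
#!/bin/bash
# Path to the NRDP client script used in the slides. SEND is a hypothetical
# variable name; it can be overridden from the environment for dry runs.
SEND=${SEND:-/usr/local/nrdp/clients/send_nrdp.sh}

# run_and_report URL TOKEN COMMAND [ARGS...]
# Runs COMMAND; on success, submits an OK result for the backup service.
# On failure, sends nothing and returns non-zero, leaving the alert to freshness.
run_and_report() {
  local url=$1 token=$2
  shift 2
  if "$@"; then
    "$SEND" -u "$url" -t "$token" -H "backup-server" \
      -s "Oracle DB Backup" -S 0 -o "OK - Backup Completed Successfully"
  else
    echo "backup failed; not reporting, the freshness check will alert" >&2
    return 1
  fi
}
```

A call would look like `run_and_report "$url" "$token" /path/to/backup_job`, where the job path is a placeholder. Staying silent on failure is a design choice that matches the slides; a site could instead send state 2 immediately if it wants the alert sooner than the freshness threshold.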

Other Passive Use Cases

• Inaccessible
    • Device is behind a firewall and cannot be reached by Nagios
• Unpredictable
    • Device is mobile and its IP address changes often
• Scalability
    • Aggregate multiple Nagios server statuses to a central server (distributed configuration)

Conclusion

Passive Checks
• Supported in both Nagios Core and Nagios XI
• Initiated by the client; authenticated and validated by the server
• Customizable with volatility, state stalking, and freshness checking
• Useful for detecting when events happen (e.g. Security Alerts) as well as when events STOP happening (e.g. Backup Monitoring)

Conclusion

NRDP – Nagios Remote Data Processor

• Server component runs on the Nagios server
• Collects passive updates from clients and submits updates to Nagios Core
• Uses shared tokens for client/server authentication

http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NSCA can be used as an alternative, especially for Windows clients

Other Passive Examples

Function                      Volatile   State Stalking   Freshness   Freshness Threshold
“Lost” magic number entry     Disabled   Disabled         Enabled     108 Minutes
Team Member Status Reports    Enabled    Disabled         Enabled     1 Month
Security Event                Enabled    Enabled          Disabled    N/A
Backup Success                Disabled   Disabled         Enabled     26 Hours

Questions

Any questions?

Thanks!

The End

Jim Prins

jprins1229@gmail.com

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 2: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Introduction

bull Sr Manager ndash Web Technologies Harman International

bull Web Application amp Server Monitoringbull 180 Hostsbull 1100+ Servicesbull Goal All Green Lights

Agenda ndash pt 1

bull Active vs Passive Checks

bull Enabling Passive Checking in Nagiosbull Enabling on Nagios Core amp Nagios XIbull Configuring NRDP Server amp Client

bull Customizing Passive Checksbull Volatilitybull State Stalkingbull Freshness Checking

Agenda ndash pt 2

bull Example 1 ndash Airline Call Button

bull Example 2 ndash Backup Monitoring

bull Other Passive Examples

bull Summary

bull QuestionsAnswers

Active vs Passive Checks

bull Active Checksbull Active from perspective of the Nagios applicationbull Request initiated by the serverbull Server authenticated by the clientbull Client decides whether to respond

bull Passive Checksbull Passive from perspective of the Nagios

applicationbull Request initiated by the clientbull Client authenticated by the serverbull Server decides whether to accept message

Use cases

Good reasons for passive checks

bull Detect and respond each time event happensbull Passive Check w Volatility amp State Stalking

bull Detect and respond when something has stopped happeningbull Passive Check w Freshness

Enabling Passive Checks

Passive host and service checks are enabled in Nagios Core via configcfg

Default location usrlocalnagiosetcnagioscfg

accept_passive_host_checks=1accept_passive_service_checks=1

NRDP ndash Server Side

bull Nagios Remote Data Processor (NRDP)

bull Server Usually runs on the Nagios server at httpltip_addressgtnrdp

bull Tokens and other server side configuration maintained at usrlocalnrdpserverconfigincphp

bull $cfg[authorized_tokens]=array(0vn53mbj3lk4ldquo ldquo0vn53mbj3lk6rdquo)

Installation Guide and Overview httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NRDP ndash Client Side

Client Installed into usrlocalnrdp

Sample Script ie backup_completesh

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Volatility ndash pt 1bull For non-volatile services error state is maintained during

each subsequent check until the symptom is resolved and the check returns OKbull 1000 Storage C Drive 98 Fullbull 1005 Storage C Drive 98 Fullbull 1010 Storage C Drive 98 Fullbull 1015 Storage C Drive 60 Full

bull A service is volatile if every alert indicates a unique issue and warrants a response event or notificationbull 1012 Security Heartbleed vulnerability scan from

759613212bull 1018 Security Port scan detected from 1959613212

Note A volatile service generally has no ldquogood newsrdquo response

Volatility ndash pt 2

bull Volatility is enabled by setting is_volatile 1 in host or service configuration

bull Enabling volatility causes the following to happen in response to EACH non-OK alertbull Event Handler is Executed (if defined)bull Alerts are sent if appropriate

bull Note For volatile services notification intervals are ignored

State Stalking ndash pt 1

bull By default Nagios will log the output of a service check whenever the servicersquos STATUS changes

Time Status Message Logged

0955 OK Disk C 79 Full Not Logged

1000 WARNING Disk C 80 Full Logged

1005 WARNING Disk C 80 Full Not Logged

1010 OK Disk C 65 Full Logged

1015 OK Disk C 66 Full Not Logged

State Stalking ndash pt 2

bull With state stalking enabled Nagios will log the output of a service check whenever the servicersquos OUTPUT changes

Time Status Message Non-Volatile Volatile

0955 OK Disk C 79 Full Not Logged Logged

1000 WARNING Disk C 80 Full Logged Logged

1005 WARNING Disk C 80 Full Not Logged Not Logged

1010 OK Disk C 65 Full Logged Logged

1015 OK Disk C 66 Full Not Logged Logged

State Stalking ndash pt 3

bull Useful when monitoring Volatile services as each unique event is useful to record

bull 100002 ndash CRITICAL Port Scan from 751001231bull 100012 ndash CRITICAL Port Scan from 751001231bull 100403 ndash CRITICAL Heartbleed Vulnerability Scan from 751001231bull 131241 ndash CRITICAL SQL Injection Attempt on indexphp from 881241

Enabled by setting stalking_options directive for host or service scan httpnagiossourceforgenetdocs3_0stalkinghtml

Freshness ndash pt 1

bull Monitoring passive checks for ldquofreshnessrdquo is a great way to determine when something has STOPPED happeningbull Ex Backup hasnrsquot checked in for the past 24 hours

(or 86400 seconds)

check_freshness 1freshness_threshold 86400

Freshness ndash pt 2

When the freshness threshold (in seconds) is exceeded the check_command will be executed

check_command stale_criticalcheck_period 24x7

Note Only during the check period above will a service be checked for freshness

Command Definition (within usrlocalnagiosetccommandscfg)

define command command_name stale_critical command_line $USER1$check_dummy 2 Passive service has not checked in

Ex 1 ndash Airline Call Button

Requirement

Define call button status as service withability to toggle on and off using passivechecks

Solution

Call button ON should cause statusWARNINGCall button OFF should cause statusOK

Ex 1 ndash Airline Call Button

Step 1 Define Passive Service Check

define servicehost_name airplane1carriercomservice_description Call Button ndash 1Ais_volatile 1active_checks_enabled 0passive_checks_enabled 1hellipother optionshellip

Ex 1 ndash Airline Call Button

Step 2 Configure Script For Call Button Pressed

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=1output=ldquoWARNING ndash Call Button Pressed

$nrdp -u $url -t $token -H $host -s $service -S $state -o $outputldquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 1 ndash Airline Call Button

Step 3 Configure Script For Call Answered

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=0output=ldquoOK ndash Call Answered

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Requirement

ndash DB backup should complete successfully at least 1 time per day Let someone know if it doesnrsquot

Solution

ndash Send passive acknowledgement upon successful backup completion

ndash Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 3: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Agenda ndash pt 1

bull Active vs Passive Checks

bull Enabling Passive Checking in Nagiosbull Enabling on Nagios Core amp Nagios XIbull Configuring NRDP Server amp Client

bull Customizing Passive Checksbull Volatilitybull State Stalkingbull Freshness Checking

Agenda ndash pt 2

bull Example 1 ndash Airline Call Button

bull Example 2 ndash Backup Monitoring

bull Other Passive Examples

bull Summary

bull QuestionsAnswers

Active vs Passive Checks

bull Active Checksbull Active from perspective of the Nagios applicationbull Request initiated by the serverbull Server authenticated by the clientbull Client decides whether to respond

bull Passive Checksbull Passive from perspective of the Nagios

applicationbull Request initiated by the clientbull Client authenticated by the serverbull Server decides whether to accept message

Use cases

Good reasons for passive checks

bull Detect and respond each time event happensbull Passive Check w Volatility amp State Stalking

bull Detect and respond when something has stopped happeningbull Passive Check w Freshness

Enabling Passive Checks

Passive host and service checks are enabled in Nagios Core via configcfg

Default location usrlocalnagiosetcnagioscfg

accept_passive_host_checks=1accept_passive_service_checks=1

NRDP ndash Server Side

bull Nagios Remote Data Processor (NRDP)

bull Server Usually runs on the Nagios server at httpltip_addressgtnrdp

bull Tokens and other server side configuration maintained at usrlocalnrdpserverconfigincphp

bull $cfg[authorized_tokens]=array(0vn53mbj3lk4ldquo ldquo0vn53mbj3lk6rdquo)

Installation Guide and Overview httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NRDP ndash Client Side

Client Installed into usrlocalnrdp

Sample Script ie backup_completesh

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Volatility ndash pt 1bull For non-volatile services error state is maintained during

each subsequent check until the symptom is resolved and the check returns OKbull 1000 Storage C Drive 98 Fullbull 1005 Storage C Drive 98 Fullbull 1010 Storage C Drive 98 Fullbull 1015 Storage C Drive 60 Full

bull A service is volatile if every alert indicates a unique issue and warrants a response event or notificationbull 1012 Security Heartbleed vulnerability scan from

759613212bull 1018 Security Port scan detected from 1959613212

Note A volatile service generally has no ldquogood newsrdquo response

Volatility ndash pt 2

bull Volatility is enabled by setting is_volatile 1 in host or service configuration

bull Enabling volatility causes the following to happen in response to EACH non-OK alertbull Event Handler is Executed (if defined)bull Alerts are sent if appropriate

bull Note For volatile services notification intervals are ignored

State Stalking ndash pt 1

bull By default Nagios will log the output of a service check whenever the servicersquos STATUS changes

Time Status Message Logged

0955 OK Disk C 79 Full Not Logged

1000 WARNING Disk C 80 Full Logged

1005 WARNING Disk C 80 Full Not Logged

1010 OK Disk C 65 Full Logged

1015 OK Disk C 66 Full Not Logged

State Stalking ndash pt 2

bull With state stalking enabled Nagios will log the output of a service check whenever the servicersquos OUTPUT changes

Time Status Message Non-Volatile Volatile

0955 OK Disk C 79 Full Not Logged Logged

1000 WARNING Disk C 80 Full Logged Logged

1005 WARNING Disk C 80 Full Not Logged Not Logged

1010 OK Disk C 65 Full Logged Logged

1015 OK Disk C 66 Full Not Logged Logged

State Stalking ndash pt 3

bull Useful when monitoring Volatile services as each unique event is useful to record

bull 100002 ndash CRITICAL Port Scan from 751001231bull 100012 ndash CRITICAL Port Scan from 751001231bull 100403 ndash CRITICAL Heartbleed Vulnerability Scan from 751001231bull 131241 ndash CRITICAL SQL Injection Attempt on indexphp from 881241

Enabled by setting stalking_options directive for host or service scan httpnagiossourceforgenetdocs3_0stalkinghtml

Freshness ndash pt 1

bull Monitoring passive checks for ldquofreshnessrdquo is a great way to determine when something has STOPPED happeningbull Ex Backup hasnrsquot checked in for the past 24 hours

(or 86400 seconds)

check_freshness 1freshness_threshold 86400

Freshness ndash pt 2

When the freshness threshold (in seconds) is exceeded the check_command will be executed

check_command stale_criticalcheck_period 24x7

Note Only during the check period above will a service be checked for freshness

Command Definition (within usrlocalnagiosetccommandscfg)

define command command_name stale_critical command_line $USER1$check_dummy 2 Passive service has not checked in

Ex 1 ndash Airline Call Button

Requirement

Define call button status as service withability to toggle on and off using passivechecks

Solution

Call button ON should cause statusWARNINGCall button OFF should cause statusOK

Ex 1 ndash Airline Call Button

Step 1 Define Passive Service Check

define servicehost_name airplane1carriercomservice_description Call Button ndash 1Ais_volatile 1active_checks_enabled 0passive_checks_enabled 1hellipother optionshellip

Ex 1 ndash Airline Call Button

Step 2 Configure Script For Call Button Pressed

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=1output=ldquoWARNING ndash Call Button Pressed

$nrdp -u $url -t $token -H $host -s $service -S $state -o $outputldquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 1 ndash Airline Call Button

Step 3 Configure Script For Call Answered

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=0output=ldquoOK ndash Call Answered

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Requirement

ndash DB backup should complete successfully at least 1 time per day Let someone know if it doesnrsquot

Solution

ndash Send passive acknowledgement upon successful backup completion

ndash Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 4: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Agenda ndash pt 2

bull Example 1 ndash Airline Call Button

bull Example 2 ndash Backup Monitoring

bull Other Passive Examples

bull Summary

bull QuestionsAnswers

Active vs Passive Checks

bull Active Checksbull Active from perspective of the Nagios applicationbull Request initiated by the serverbull Server authenticated by the clientbull Client decides whether to respond

bull Passive Checksbull Passive from perspective of the Nagios

applicationbull Request initiated by the clientbull Client authenticated by the serverbull Server decides whether to accept message

Use cases

Good reasons for passive checks

bull Detect and respond each time event happensbull Passive Check w Volatility amp State Stalking

bull Detect and respond when something has stopped happeningbull Passive Check w Freshness

Enabling Passive Checks

Passive host and service checks are enabled in Nagios Core via configcfg

Default location usrlocalnagiosetcnagioscfg

accept_passive_host_checks=1accept_passive_service_checks=1

NRDP - Server Side

• Nagios Remote Data Processor (NRDP)
• Server: Usually runs on the Nagios server at http://<ip_address>/nrdp
• Tokens and other server-side configuration maintained at /usr/local/nrdp/server/config.inc.php
• $cfg['authorized_tokens'] = array("0vn53mbj3lk4", "0vn53mbj3lk6");

Installation Guide and Overview: http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf
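
The token list works as a shared-secret allow list: a submission is accepted only if its token matches one of the configured entries. The sketch below models that comparison in simplified form; token_authorized is a hypothetical helper and is not part of NRDP itself.

```shell
#!/bin/sh
# Return success when the first argument matches one of the remaining arguments.
# Simplified model of comparing a client token against authorized_tokens.
token_authorized() {
    candidate=$1
    shift
    for allowed in "$@"; do
        [ "$candidate" = "$allowed" ] && return 0
    done
    return 1
}

if token_authorized "0vn53mbj3lk4" "0vn53mbj3lk4" "0vn53mbj3lk6"; then
    echo "token accepted"
else
    echo "token rejected"
fi
```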

NRDP - Client Side

Client: Installed into /usr/local/nrdp

Sample Script, e.g. backup_complete.sh

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp
    token=0vn53mbj3lk4

    host="backup-server"
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    # Quote the variables so values containing spaces are passed as single arguments
    $nrdp -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

    State  Meaning
    0      OK (GREEN)
    1      WARNING (YELLOW)
    2      CRITICAL (RED)
    3      UNKNOWN (GREY)
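
Since the state is sent as a bare number, a small lookup can help a client script validate the value before submitting it. This is a minimal sketch; state_name is a hypothetical helper that mirrors the state table above.

```shell
#!/bin/sh
# Map a Nagios state code to its name; unrecognized codes return a non-zero status.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) return 1 ;;
    esac
}

# Refuse to send anything that is not a known state.
state=2
if name=$(state_name "$state"); then
    echo "state $state is $name"
else
    echo "refusing to send invalid state: $state"
fi
```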

Volatility - pt 1

• For non-volatile services, error state is maintained during each subsequent check until the symptom is resolved and the check returns OK
    • 10:00 Storage: C Drive 98% Full
    • 10:05 Storage: C Drive 98% Full
    • 10:10 Storage: C Drive 98% Full
    • 10:15 Storage: C Drive 60% Full

• A service is volatile if every alert indicates a unique issue and warrants a response event or notification
    • 10:12 Security: Heartbleed vulnerability scan from 759613212
    • 10:18 Security: Port scan detected from 1959613212

Note: A volatile service generally has no "good news" response

Volatility - pt 2

• Volatility is enabled by setting is_volatile 1 in the service configuration

• Enabling volatility causes the following to happen in response to EACH non-OK alert
    • Event Handler is Executed (if defined)
    • Alerts are sent if appropriate

• Note: For volatile services, notification intervals are ignored
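
The difference can be expressed as a small decision rule. The sketch below is a deliberately simplified model, not Nagios's actual logic: it ignores soft/hard states, recovery notifications, and renotification, and should_notify is a hypothetical name.

```shell
#!/bin/sh
# Simplified model: decide whether a check result triggers a problem notification.
# Arguments: previous_state new_state is_volatile (0 or 1)
should_notify() {
    # OK results never raise a problem notification in this model.
    [ "$2" -eq 0 ] && return 1
    # Volatile services react to every non-OK result.
    [ "$3" -eq 1 ] && return 0
    # Non-volatile services react only when the state changes.
    [ "$1" -ne "$2" ]
}

# Replay three CRITICAL results followed by an OK for a non-volatile service.
prev=0
for state in 2 2 2 0; do
    if should_notify "$prev" "$state" 0; then
        echo "state $state: notify"
    else
        echo "state $state: quiet"
    fi
    prev=$state
done
```

Passing 1 as the third argument makes every non-OK result notify, which is the behaviour the slide describes for volatile services.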

State Stalking - pt 1

• By default, Nagios will log the output of a service check whenever the service's STATUS changes

    Time   Status   Message           Logged
    09:55  OK       Disk C: 79% Full  Not Logged
    10:00  WARNING  Disk C: 80% Full  Logged
    10:05  WARNING  Disk C: 80% Full  Not Logged
    10:10  OK       Disk C: 65% Full  Logged
    10:15  OK       Disk C: 66% Full  Not Logged

State Stalking - pt 2

• With state stalking enabled, Nagios will log the output of a service check whenever the service's OUTPUT changes

    Time   Status   Message           Non-Volatile  Volatile
    09:55  OK       Disk C: 79% Full  Not Logged    Logged
    10:00  WARNING  Disk C: 80% Full  Logged        Logged
    10:05  WARNING  Disk C: 80% Full  Not Logged    Not Logged
    10:10  OK       Disk C: 65% Full  Logged        Logged
    10:15  OK       Disk C: 66% Full  Not Logged    Logged

State Stalking - pt 3

• Useful when monitoring Volatile services, as each unique event is useful to record
    • 10:00:02 - CRITICAL: Port Scan from 751001231
    • 10:00:12 - CRITICAL: Port Scan from 751001231
    • 10:04:03 - CRITICAL: Heartbleed Vulnerability Scan from 751001231
    • 13:12:41 - CRITICAL: SQL Injection Attempt on index.php from 881241

Enabled by setting the stalking_options directive for a host or service; see http://nagios.sourceforge.net/docs/3_0/stalking.html
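
The logging rule from the state stalking slides can be sketched as a single predicate: a status change is always logged, and with stalking enabled an output change is logged too. This is an illustrative model only; should_log is a hypothetical name, and the model assumes stalking applies to every state.

```shell
#!/bin/sh
# Arguments: prev_status prev_output new_status new_output stalking (0 or 1)
should_log() {
    # A status change is always logged.
    [ "$1" != "$3" ] && return 0
    # With stalking enabled, a change in output is logged as well.
    [ "$5" -eq 1 ] && [ "$2" != "$4" ]
}

# Replay the disk example with stalking enabled.
prev_status="OK"
prev_output=""
while IFS='|' read -r status output; do
    if should_log "$prev_status" "$prev_output" "$status" "$output" 1; then
        echo "$status $output: Logged"
    else
        echo "$status $output: Not Logged"
    fi
    prev_status=$status
    prev_output=$output
done <<'EOF'
OK|Disk C: 79% Full
WARNING|Disk C: 80% Full
WARNING|Disk C: 80% Full
OK|Disk C: 65% Full
OK|Disk C: 66% Full
EOF
```

Changing the last argument of should_log to 0 gives the default behaviour, where only the two status changes are logged.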

Freshness - pt 1

• Monitoring passive checks for "freshness" is a great way to determine when something has STOPPED happening
    • Ex: Backup hasn't checked in for the past 24 hours (or 86400 seconds)

    check_freshness      1
    freshness_threshold  86400

Freshness - pt 2

When the freshness threshold (in seconds) is exceeded, the check_command will be executed

    check_command  stale_critical
    check_period   24x7

Note: Only during the check period above will a service be checked for freshness

Command Definition (within /usr/local/nagios/etc/commands.cfg)

    define command {
        command_name  stale_critical
        command_line  $USER1$/check_dummy 2 "Passive service has not checked in"
    }
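
The freshness mechanism boils down to comparing the age of the last result against the threshold and, if it is too old, running a command that always reports CRITICAL. The sketch below models both halves; is_stale and the shell function stale_critical are hypothetical stand-ins for what Nagios and check_dummy do, not their actual implementations.

```shell
#!/bin/sh
# Return success when the last result is older than the freshness threshold.
# Arguments: last_check_epoch now_epoch threshold_seconds
is_stale() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

# Stand-in for the stale_critical command: print a message and return CRITICAL (2).
stale_critical() {
    echo "CRITICAL: Passive service has not checked in"
    return 2
}

# A result that is 25 hours old exceeds a 24 hour (86400 second) threshold.
now=$(date +%s)
last=$(( now - 90000 ))
if is_stale "$last" "$now" 86400; then
    stale_critical || echo "exit code: $?"
fi
```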

Ex 1 - Airline Call Button

Requirement

Define call button status as service with ability to toggle on and off using passive checks

Solution

Call button ON should cause status WARNING
Call button OFF should cause status OK

Ex 1 - Airline Call Button

Step 1: Define Passive Service Check

    define service {
        host_name               airplane1.carrier.com
        service_description     Call Button - 1A
        is_volatile             1
        active_checks_enabled   0
        passive_checks_enabled  1
        ...other options...
    }

Ex 1 - Airline Call Button

Step 2: Configure Script For Call Button Pressed

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp
    token=0vn53mbj3lk4

    host="airplane1.carrier.com"
    service="Call Button - 1A"
    state=1
    output="WARNING - Call Button Pressed"

    # Quote the variables so values containing spaces are passed as single arguments
    $nrdp -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

Ex 1 - Airline Call Button

Step 3: Configure Script For Call Answered

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp
    token=0vn53mbj3lk4

    host="airplane1.carrier.com"
    service="Call Button - 1A"
    state=0
    output="OK - Call Answered"

    # Quote the variables so values containing spaces are passed as single arguments
    $nrdp -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

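
The two call button scripts differ only in state and output, so they could be folded into one helper keyed on the event. The sketch below shows that idea under stated assumptions: call_button_result is a hypothetical function, and it only prints the values instead of calling send_nrdp.sh.

```shell
#!/bin/sh
# Translate a call-button event into "state|output" for a passive check.
# Argument: pressed | answered
call_button_result() {
    case "$1" in
        pressed)  echo "1|WARNING - Call Button Pressed" ;;
        answered) echo "0|OK - Call Answered" ;;
        *)        echo "3|UNKNOWN - Unrecognized event: $1" ;;
    esac
}

# Split the pair back into the two values a send script would need.
result=$(call_button_result pressed)
state=${result%%|*}
output=${result#*|}
echo "state=$state output=$output"
```

Unrecognized events map to UNKNOWN so that a typo in the caller shows up in Nagios rather than being silently dropped.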

Ex 2 - Backup Monitoring

Requirement

- DB backup should complete successfully at least 1 time per day. Let someone know if it doesn't

Solution

- Send passive acknowledgement upon successful backup completion
- Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 - Backup Monitoring

Step 1: Define Passive Service Check

    define service {
        host_name               backup-server
        service_description     Oracle DB Backup
        active_checks_enabled   0
        passive_checks_enabled  1
        check_freshness         1
        freshness_threshold     93600
        check_command           no-backup-report
        ...other options...
    }
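
The freshness_threshold value is the 26 hour window expressed in seconds. A one-line conversion makes the arithmetic explicit; hours_to_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a window in hours to the seconds value freshness_threshold expects.
hours_to_seconds() {
    echo $(( $1 * 60 * 60 ))
}

hours_to_seconds 26   # prints 93600
```

The extra two hours beyond a full day leave slack for a backup that starts or finishes later than usual.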

Ex 2 - Backup Monitoring

Step 2: Define Check Command

File: /usr/local/nagios/etc/commands.cfg

    define command {
        command_name  no-backup-report
        command_line  /usr/local/nagios/libexec/check_dummy 2 "Results of backup job were not reported"
    }

Note: check_dummy does nothing but exit 2 (critical) and display the message in "quotes"

Ex 2 - Backup Monitoring

Step 3: Configure Client to Send Acknowledgement

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp
    token=0vn53mbj3lk4

    host="backup-server"
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    # Quote the variables so values containing spaces are passed as single arguments
    $nrdp -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

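
For the freshness approach to work, the acknowledgement must be sent only when the backup actually succeeds. The sketch below wraps an arbitrary backup command that way. Both function names are hypothetical, and report_result only prints what would be sent; in practice it would call the send script from this step.

```shell
#!/bin/sh
# Hypothetical hook: prints the result instead of submitting it to NRDP.
report_result() {
    echo "would send: state=$1 output=$2"
}

# Run a backup command and report OK only when it exits successfully.
# Arguments: the backup command and its arguments.
run_and_report() {
    if "$@"; then
        report_result 0 "OK - Backup Completed Successfully"
    else
        # Stay silent on failure; the freshness check raises the alert.
        return 1
    fi
}

# "true" stands in for a backup job that succeeds.
run_and_report true
```

A failed or missing backup therefore looks the same to Nagios: no check-in arrives, and the service goes CRITICAL once the threshold passes.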

Other Passive Use Cases

• Inaccessible
    • Device is behind a firewall and cannot be reached by Nagios

• Unpredictable
    • Device is mobile and IP address changes often

• Scalability
    • Aggregate multiple Nagios server statuses to a central server (Distributed configuration)

Conclusion

Passive Checks
• Supported in both Nagios Core and Nagios XI
• Initiated by the client; authenticated and validated by the server
• Customizable with volatility, state stalking, and freshness checking
• Useful for detecting when events happen (e.g. Security Alerts) as well as when events STOP happening (e.g. Backup Monitoring)

Conclusion

NRDP - Nagios Remote Data Processor

• Server Component: Runs on Nagios Server
• Collects passive updates from clients and submits updates to Nagios Core
• Uses shared tokens for client/server authentication

http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NSCA can be used as an alternative, especially for Windows clients

Other Passive Examples

    Function                    Volatile  State Stalking  Freshness  Freshness Threshold
    "Lost" Magic number entry   Disabled  Disabled        Enabled    108 Minutes
    Team Member Status Reports  Enabled   Disabled        Enabled    1 Month
    Security Event              Enabled   Enabled         Disabled   N/A
    Backup Success              Disabled  Disabled        Enabled    26 Hours

Questions

Any questions?

Thanks

The End

Jim Prins

jprins1229gmailcom

Enabling Passive Checks

Passive host and service checks are enabled in Nagios Core via configcfg

Default location usrlocalnagiosetcnagioscfg

accept_passive_host_checks=1accept_passive_service_checks=1

NRDP ndash Server Side

bull Nagios Remote Data Processor (NRDP)

bull Server Usually runs on the Nagios server at httpltip_addressgtnrdp

bull Tokens and other server side configuration maintained at usrlocalnrdpserverconfigincphp

bull $cfg[authorized_tokens]=array(0vn53mbj3lk4ldquo ldquo0vn53mbj3lk6rdquo)

Installation Guide and Overview httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NRDP ndash Client Side

Client Installed into usrlocalnrdp

Sample Script ie backup_completesh

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Volatility ndash pt 1bull For non-volatile services error state is maintained during

each subsequent check until the symptom is resolved and the check returns OKbull 1000 Storage C Drive 98 Fullbull 1005 Storage C Drive 98 Fullbull 1010 Storage C Drive 98 Fullbull 1015 Storage C Drive 60 Full

bull A service is volatile if every alert indicates a unique issue and warrants a response event or notificationbull 1012 Security Heartbleed vulnerability scan from

759613212bull 1018 Security Port scan detected from 1959613212

Note A volatile service generally has no ldquogood newsrdquo response

Volatility ndash pt 2

bull Volatility is enabled by setting is_volatile 1 in host or service configuration

bull Enabling volatility causes the following to happen in response to EACH non-OK alertbull Event Handler is Executed (if defined)bull Alerts are sent if appropriate

bull Note For volatile services notification intervals are ignored

State Stalking ndash pt 1

bull By default Nagios will log the output of a service check whenever the servicersquos STATUS changes

Time Status Message Logged

0955 OK Disk C 79 Full Not Logged

1000 WARNING Disk C 80 Full Logged

1005 WARNING Disk C 80 Full Not Logged

1010 OK Disk C 65 Full Logged

1015 OK Disk C 66 Full Not Logged

State Stalking ndash pt 2

bull With state stalking enabled Nagios will log the output of a service check whenever the servicersquos OUTPUT changes

Time Status Message Non-Volatile Volatile

0955 OK Disk C 79 Full Not Logged Logged

1000 WARNING Disk C 80 Full Logged Logged

1005 WARNING Disk C 80 Full Not Logged Not Logged

1010 OK Disk C 65 Full Logged Logged

1015 OK Disk C 66 Full Not Logged Logged

State Stalking ndash pt 3

bull Useful when monitoring Volatile services as each unique event is useful to record

bull 100002 ndash CRITICAL Port Scan from 751001231bull 100012 ndash CRITICAL Port Scan from 751001231bull 100403 ndash CRITICAL Heartbleed Vulnerability Scan from 751001231bull 131241 ndash CRITICAL SQL Injection Attempt on indexphp from 881241

Enabled by setting stalking_options directive for host or service scan httpnagiossourceforgenetdocs3_0stalkinghtml

Freshness ndash pt 1

bull Monitoring passive checks for ldquofreshnessrdquo is a great way to determine when something has STOPPED happeningbull Ex Backup hasnrsquot checked in for the past 24 hours

(or 86400 seconds)

check_freshness 1freshness_threshold 86400

Freshness ndash pt 2

When the freshness threshold (in seconds) is exceeded the check_command will be executed

check_command stale_criticalcheck_period 24x7

Note Only during the check period above will a service be checked for freshness

Command Definition (within usrlocalnagiosetccommandscfg)

define command command_name stale_critical command_line $USER1$check_dummy 2 Passive service has not checked in

Ex 1 ndash Airline Call Button

Requirement

Define call button status as service withability to toggle on and off using passivechecks

Solution

Call button ON should cause statusWARNINGCall button OFF should cause statusOK

Ex 1 ndash Airline Call Button

Step 1 Define Passive Service Check

define servicehost_name airplane1carriercomservice_description Call Button ndash 1Ais_volatile 1active_checks_enabled 0passive_checks_enabled 1hellipother optionshellip

Ex 1 ndash Airline Call Button

Step 2 Configure Script For Call Button Pressed

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=1output=ldquoWARNING ndash Call Button Pressed

$nrdp -u $url -t $token -H $host -s $service -S $state -o $outputldquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 1 - Airline Call Button

Step 3: Configure Script For Call Answered

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=airplane1.carrier.com
    service="Call Button - 1A"
    state=0
    output="OK - Call Answered"

    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)
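The two scripts in Steps 2 and 3 differ only in the state code and the output text, so they could be folded into one. A minimal sketch, assuming a hypothetical helper named call_button_result that is not part of the presentation or of NRDP:

```shell
#!/bin/bash
# Hypothetical helper: map the button position to the state code and
# output text used in Steps 2 and 3, joined as "state|output".
call_button_result() {
    case "$1" in
        on)  echo "1|WARNING - Call Button Pressed" ;;
        off) echo "0|OK - Call Answered" ;;
        *)   echo "3|UNKNOWN - Unrecognized button position: $1" ;;
    esac
}

# Split the result and show the arguments that would go to send_nrdp.sh.
result=$(call_button_result on)
state=${result%%|*}
output=${result#*|}
echo "would send: -S $state -o \"$output\""
```

The sketch only prints the arguments; a real script would pass them to send_nrdp.sh together with the URL, token, host, and service shown above.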

Ex 2 - Backup Monitoring

Requirement

- DB backup should complete successfully at least 1 time per day. Let someone know if it doesn't.

Solution

- Send passive acknowledgement upon successful backup completion
- Use freshness to alert us any time the service has not checked in within 26 hours

Ex 2 - Backup Monitoring

Step 1: Define Passive Service Check

    define service {
        host_name                backup-server
        service_description      Oracle DB Backup
        active_checks_enabled    0
        passive_checks_enabled   1
        check_freshness          1
        freshness_threshold      93600
        check_command            no-backup-report
        ...other options...
    }
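The freshness_threshold in Step 1 is the 26-hour window expressed in seconds. A small sketch of the conversion, assuming a hypothetical helper named hours_to_seconds:

```shell
#!/bin/bash
# Hypothetical helper: convert a freshness window in hours to the number
# of seconds expected by freshness_threshold.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

hours_to_seconds 26   # prints 93600
```

For a backup that runs once a day, a 26-hour window presumably leaves about two hours of slack before the service is treated as stale.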

Ex 2 - Backup Monitoring

Step 2: Define Check Command

File: /usr/local/nagios/etc/commands.cfg

    define command {
        command_name   no-backup-report
        command_line   /usr/local/nagios/libexec/check_dummy 2 "Results of backup job were not reported"
    }

Note: check_dummy does nothing but exit 2 (critical) and display the message in "quotes"

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 - Backup Monitoring

Step 3: Configure Client to Send Acknowledgement

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=backup-server
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)
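The acknowledgement is only meaningful if it is sent after the backup has actually succeeded. One way to tie the two together is a small wrapper; the sketch below assumes a hypothetical function named run_and_report and passes both the backup command and the reporting command as arguments, so it can be tried without a real backup job or NRDP server:

```shell
#!/bin/bash
# Hypothetical wrapper: run a backup command and report to Nagios only
# if it succeeds.
# Usage: run_and_report <report_command> <backup_command> [args...]
run_and_report() {
    report=$1
    shift
    if "$@"; then
        "$report" 0 "OK - Backup Completed Successfully"
    else
        # Stay silent on failure; the freshness check raises the alert.
        return 1
    fi
}

# Dry run: "true" stands in for the backup job, and a stub prints what
# would be sent instead of calling send_nrdp.sh.
print_report() { echo "state=$1 output=$2"; }
run_and_report print_report true
```

Staying silent on failure follows the freshness approach in this example; sending a CRITICAL result immediately would be an alternative design if faster alerting is needed.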

Other Passive Use Cases

• Inaccessible
  • Device is behind a firewall and cannot be reached by Nagios

• Unpredictable
  • Device is mobile and IP address changes often

• Scalability
  • Aggregate multiple Nagios server statuses to a central server (distributed configuration)

Conclusion

Passive Checks
• Supported in both Nagios Core and Nagios XI
• Initiated by the client, authenticated and validated by the server
• Customizable with volatility, state stalking, and freshness checking
• Useful for detecting when events happen (e.g. security alerts) as well as when events STOP happening (e.g. backup monitoring)

Conclusion

NRDP - Nagios Remote Data Processor

• Server component runs on the Nagios server
• Collects passive updates from clients and submits updates to Nagios Core
• Uses shared tokens for client/server authentication

http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NSCA can be used as an alternative, especially for Windows clients

Other Passive Examples

Function                      Volatile   State Stalking   Freshness   Freshness Threshold
"Lost" Magic number entry     Disabled   Disabled         Enabled     108 Minutes
Team Member Status Reports    Enabled    Disabled         Enabled     1 Month
Security Event                Enabled    Enabled          Disabled    N/A
Backup Success                Disabled   Disabled         Enabled     26 Hours

Questions

Any questions?

Thanks

The End

Jim Prins

jprins1229@gmail.com

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda - pt 1
  • Agenda - pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP - Server Side
  • NRDP - Client Side
  • Volatility - pt 1
  • Volatility - pt 2
  • State Stalking - pt 1
  • State Stalking - pt 2
  • State Stalking - pt 3
  • Freshness - pt 1
  • Freshness - pt 2
  • Ex 1 - Airline Call Button
  • Ex 1 - Airline Call Button (2)
  • Ex 1 - Airline Call Button (3)
  • Ex 1 - Airline Call Button (4)
  • Ex 2 - Backup Monitoring
  • Ex 2 - Backup Monitoring (2)
  • Ex 2 - Backup Monitoring (3)
  • Ex 2 - Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
NRDP - Server Side

• Nagios Remote Data Processor (NRDP)

• Server: Usually runs on the Nagios server at http://<ip_address>/nrdp

• Tokens and other server-side configuration are maintained at /usr/local/nrdp/server/config.inc.php

• $cfg['authorized_tokens'] = array("0vn53mbj3lk4", "0vn53mbj3lk6");

Installation Guide and Overview: http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NRDP - Client Side

Client: Installed into /usr/local/nrdp

Sample script, e.g. backup_complete.sh:

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url="http://<ip_address>/nrdp"
    token=0vn53mbj3lk4

    host=backup-server
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Volatility - pt 1

• For non-volatile services, the error state is maintained during each subsequent check until the symptom is resolved and the check returns OK
  • 10:00 Storage: C Drive 98% Full
  • 10:05 Storage: C Drive 98% Full
  • 10:10 Storage: C Drive 98% Full
  • 10:15 Storage: C Drive 60% Full

• A service is volatile if every alert indicates a unique issue and warrants a response event or notification
  • 10:12 Security: Heartbleed vulnerability scan from 759613212
  • 10:18 Security: Port scan detected from 1959613212

Note: A volatile service generally has no "good news" response

Volatility - pt 2

• Volatility is enabled by setting is_volatile 1 in the service definition

• Enabling volatility causes the following to happen in response to EACH non-OK alert:
  • Event handler is executed (if defined)
  • Alerts are sent, if appropriate

• Note: For volatile services, notification intervals are ignored
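The difference between volatile and non-volatile handling can be written down as a small decision rule. The sketch below is a simplified model, not Nagios source code; the function name should_respond is hypothetical, and the model deliberately ignores soft/hard states and notification intervals:

```shell
#!/bin/bash
# Simplified model: decide whether a check result triggers the event
# handler and a notification.
# Arguments: previous state code, current state code, is_volatile (0 or 1).
should_respond() {
    prev=$1; cur=$2; volatile=$3
    if [ "$volatile" -eq 1 ] && [ "$cur" -ne 0 ]; then
        return 0            # volatile: every non-OK result gets a response
    fi
    [ "$prev" -ne "$cur" ]  # otherwise: respond only on a state change
}

# Two CRITICAL results in a row (state 2 followed by state 2).
should_respond 2 2 1 && echo "volatile: respond again"
should_respond 2 2 0 || echo "non-volatile: no repeat response"
```

Under this model a second port-scan alert on a volatile security service triggers a fresh response, while a disk that stays at 98% full does not.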

State Stalking - pt 1

• By default, Nagios will log the output of a service check whenever the service's STATUS changes

Time    Status    Message            Logged
09:55   OK        Disk C: 79% Full   Not Logged
10:00   WARNING   Disk C: 80% Full   Logged
10:05   WARNING   Disk C: 80% Full   Not Logged
10:10   OK        Disk C: 65% Full   Logged
10:15   OK        Disk C: 66% Full   Not Logged

State Stalking - pt 2

• With state stalking enabled, Nagios will log the output of a service check whenever the service's OUTPUT changes

Time    Status    Message            Non-Volatile   Volatile
09:55   OK        Disk C: 79% Full   Not Logged     Logged
10:00   WARNING   Disk C: 80% Full   Logged         Logged
10:05   WARNING   Disk C: 80% Full   Not Logged     Not Logged
10:10   OK        Disk C: 65% Full   Logged         Logged
10:15   OK        Disk C: 66% Full   Not Logged     Logged
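The logging behaviour with and without state stalking can be summarized as a rule over two consecutive check results. The sketch below is a simplified model, not Nagios source code, and the function name should_log is hypothetical:

```shell
#!/bin/bash
# Simplified model of when a check result is written to the log.
# Arguments: previous status, current status, previous output,
# current output, stalking enabled (0 or 1).
should_log() {
    if [ "$1" != "$2" ]; then
        return 0                        # status changed: always logged
    fi
    [ "$5" -eq 1 ] && [ "$3" != "$4" ]  # stalking: log when output changes
}

# 10:10 to 10:15: status stays OK, output goes from 65% to 66% full.
should_log OK OK "Disk C: 65% Full" "Disk C: 66% Full" 0 || echo "default: not logged"
should_log OK OK "Disk C: 65% Full" "Disk C: 66% Full" 1 && echo "stalking: logged"
```

The 10:05 row follows from the same rule: neither status nor output changed, so the result is not logged in either mode.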

State Stalking - pt 3

• Useful when monitoring volatile services, as each unique event is useful to record

• 10:00:02 - CRITICAL Port Scan from 751001231
• 10:00:12 - CRITICAL Port Scan from 751001231
• 10:04:03 - CRITICAL Heartbleed Vulnerability Scan from 751001231
• 13:12:41 - CRITICAL SQL Injection Attempt on index.php from 881241

Enabled by setting the stalking_options directive for a host or service; see http://nagios.sourceforge.net/docs/3_0/stalking.html

Volatility ndash pt 2

bull Volatility is enabled by setting is_volatile 1 in host or service configuration

bull Enabling volatility causes the following to happen in response to EACH non-OK alertbull Event Handler is Executed (if defined)bull Alerts are sent if appropriate

bull Note For volatile services notification intervals are ignored

State Stalking ndash pt 1

bull By default Nagios will log the output of a service check whenever the servicersquos STATUS changes

Time Status Message Logged

0955 OK Disk C 79 Full Not Logged

1000 WARNING Disk C 80 Full Logged

1005 WARNING Disk C 80 Full Not Logged

1010 OK Disk C 65 Full Logged

1015 OK Disk C 66 Full Not Logged

State Stalking ndash pt 2

bull With state stalking enabled Nagios will log the output of a service check whenever the servicersquos OUTPUT changes

Time Status Message Non-Volatile Volatile

0955 OK Disk C 79 Full Not Logged Logged

1000 WARNING Disk C 80 Full Logged Logged

1005 WARNING Disk C 80 Full Not Logged Not Logged

1010 OK Disk C 65 Full Logged Logged

1015 OK Disk C 66 Full Not Logged Logged

State Stalking ndash pt 3

bull Useful when monitoring Volatile services as each unique event is useful to record

bull 100002 ndash CRITICAL Port Scan from 751001231bull 100012 ndash CRITICAL Port Scan from 751001231bull 100403 ndash CRITICAL Heartbleed Vulnerability Scan from 751001231bull 131241 ndash CRITICAL SQL Injection Attempt on indexphp from 881241

Page 12: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

State Stalking - pt 1

• By default, Nagios will log the output of a service check whenever the service's STATUS changes

Time   Status   Message           Logged
09:55  OK       Disk C: 79% Full  Not Logged
10:00  WARNING  Disk C: 80% Full  Logged
10:05  WARNING  Disk C: 80% Full  Not Logged
10:10  OK       Disk C: 65% Full  Logged
10:15  OK       Disk C: 66% Full  Not Logged

Page 13: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

State Stalking - pt 2

• With state stalking enabled, Nagios will log the output of a service check whenever the service's OUTPUT changes

Time   Status   Message           Non-Volatile  Volatile
09:55  OK       Disk C: 79% Full  Not Logged    Logged
10:00  WARNING  Disk C: 80% Full  Logged        Logged
10:05  WARNING  Disk C: 80% Full  Not Logged    Not Logged
10:10  OK       Disk C: 65% Full  Logged        Logged
10:15  OK       Disk C: 66% Full  Not Logged    Logged
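
The logging rule in the table above can be sketched as a small shell function. This is only an illustration of the rule as the slide states it, not Nagios source code: the function name should_log, its argument order, and the assumed starting point (status OK, output "Disk C: 78% Full") are all made up for the example. Under those assumptions the first result column corresponds to stalking disabled and the second to stalking enabled.

```shell
#!/usr/bin/env bash
# Illustrative sketch only (hypothetical helper, not part of Nagios).
# should_log STALKING PREV_STATUS PREV_OUTPUT STATUS OUTPUT
#   STALKING is 1 when state stalking is enabled, 0 otherwise.
#   Prints "Logged" or "Not Logged".
should_log() {
    stalking="$1"; prev_status="$2"; prev_output="$3"
    status="$4"; output="$5"

    if [ "$status" != "$prev_status" ]; then
        # Default behaviour: a status change is always logged.
        echo "Logged"
    elif [ "$stalking" -eq 1 ] && [ "$output" != "$prev_output" ]; then
        # With stalking: a change in output text is logged as well.
        echo "Logged"
    else
        echo "Not Logged"
    fi
}

# Replay the rows of the table. The initial state below is an assumption.
last_status="OK"
last_output="Disk C: 78% Full"
while IFS='|' read -r when row_status row_output; do
    without=$(should_log 0 "$last_status" "$last_output" "$row_status" "$row_output")
    with=$(should_log 1 "$last_status" "$last_output" "$row_status" "$row_output")
    printf '%s  %-8s %-17s %-11s %s\n' \
        "$when" "$row_status" "$row_output" "$without" "$with"
    last_status="$row_status"
    last_output="$row_output"
done <<'EOF'
09:55|OK|Disk C: 79% Full
10:00|WARNING|Disk C: 80% Full
10:05|WARNING|Disk C: 80% Full
10:10|OK|Disk C: 65% Full
10:15|OK|Disk C: 66% Full
EOF
```

The only difference between the two modes is the second branch: with stalking, a result whose status is unchanged is still logged if its text differs from the previous result.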

Page 14: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

State Stalking - pt 3

• Useful when monitoring Volatile services, as each unique event is useful to record

  • 10:00:02 - CRITICAL Port Scan from 751001231
  • 10:00:12 - CRITICAL Port Scan from 751001231
  • 10:04:03 - CRITICAL Heartbleed Vulnerability Scan from 751001231
  • 13:12:41 - CRITICAL SQL Injection Attempt on index.php from 881241

Enabled by setting the stalking_options directive for the host or service. See http://nagios.sourceforge.net/docs/3_0/stalking.html
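
Events like these reach Nagios as passive results. As a rough sketch, a detection script could report each one through the NRDP client used elsewhere in this deck (send_nrdp.sh with the -u, -t, -H, -s, -S and -o options). The URL, token, host name and service name below are placeholders, and report_event with its DRY_RUN switch is a hypothetical helper that by default only prints the command it would run.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. All values below are placeholders (assumptions).
NRDP_CLIENT="/usr/local/nrdp/clients/send_nrdp.sh"
NRDP_URL="http://nagios.example.com/nrdp/"
NRDP_TOKEN="changeme"

# report_event HOST SERVICE STATE OUTPUT
#   STATE: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN
#   With DRY_RUN=1 (the default here) the command is printed, not executed.
report_event() {
    ev_host="$1"; ev_service="$2"; ev_state="$3"; ev_output="$4"
    if [ "${DRY_RUN:-1}" -eq 1 ]; then
        printf '%s -u %s -t %s -H %s -s "%s" -S %s -o "%s"\n' \
            "$NRDP_CLIENT" "$NRDP_URL" "$NRDP_TOKEN" \
            "$ev_host" "$ev_service" "$ev_state" "$ev_output"
    else
        "$NRDP_CLIENT" -u "$NRDP_URL" -t "$NRDP_TOKEN" \
            -H "$ev_host" -s "$ev_service" -S "$ev_state" -o "$ev_output"
    fi
}

# Each event carries its own output text, so stalking would log every one.
report_event "web01" "Security Event" 2 "CRITICAL Port Scan detected"
```

Setting DRY_RUN=0 would execute the client for real, which requires the NRDP client to be installed at the assumed path.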

Page 15: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Freshness - pt 1

• Monitoring passive checks for "freshness" is a great way to determine when something has STOPPED happening
  • Ex: Backup hasn't checked in for the past 24 hours (or 86400 seconds)

check_freshness      1
freshness_threshold  86400
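
The threshold arithmetic and the idea of a stale result can be sketched in shell. This is a simplified illustration, not how Nagios computes freshness internally (the real calculation takes further inputs into account); hours_to_seconds and is_stale are hypothetical helper names, and the 25-hour-old result is made up for the demonstration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only (hypothetical helpers, simplified logic).

# hours_to_seconds HOURS: convert a threshold in hours to seconds.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

# is_stale LAST_CHECK NOW THRESHOLD: all values in seconds.
# Succeeds (exit 0) when the last result is older than the threshold.
is_stale() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

threshold=$(hours_to_seconds 24)      # 24 hours = 86400 seconds
now=$(date +%s)
last_check=$(( now - 90000 ))         # assume the last result is 25 hours old

if is_stale "$last_check" "$now" "$threshold"; then
    echo "stale: no result for more than ${threshold}s"
else
    echo "fresh"
fi
```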

Page 16: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Freshness - pt 2

When the freshness threshold (in seconds) is exceeded, the check_command will be executed

check_command  stale_critical
check_period   24x7

Note: Only during the check period above will a service be checked for freshness

Command Definition (within /usr/local/nagios/etc/commands.cfg)

define command {
    command_name  stale_critical
    command_line  $USER1$/check_dummy 2 "Passive service has not checked in"
}
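
To see what the stale_critical command hands back to Nagios, the shell function below approximates the shape of a check_dummy result: a status label, the supplied text, and a matching exit code. It is a stand-in written for illustration, not the plugin itself, and dummy_check is a hypothetical name.

```shell
#!/usr/bin/env bash
# Illustrative stand-in for check_dummy (hypothetical function).
# dummy_check STATE TEXT
#   Prints "<LABEL>: TEXT" and returns STATE (0 OK, 1 WARNING,
#   2 CRITICAL, anything else is treated as 3 UNKNOWN).
dummy_check() {
    dc_state="$1"; dc_text="$2"
    case "$dc_state" in
        0) dc_label="OK" ;;
        1) dc_label="WARNING" ;;
        2) dc_label="CRITICAL" ;;
        *) dc_label="UNKNOWN"; dc_state=3 ;;
    esac
    echo "$dc_label: $dc_text"
    return "$dc_state"
}

# What the freshness check would report for a stale service.
dummy_check 2 "Passive service has not checked in" || echo "exit code: $?"
```

Nagios reads the exit code as the service state, so a stale service turns CRITICAL with the message as its status text.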

Ex 1 ndash Airline Call Button

Requirement

Define call button status as service withability to toggle on and off using passivechecks

Solution

Call button ON should cause statusWARNINGCall button OFF should cause statusOK

Ex 1 ndash Airline Call Button

Step 1 Define Passive Service Check

define servicehost_name airplane1carriercomservice_description Call Button ndash 1Ais_volatile 1active_checks_enabled 0passive_checks_enabled 1hellipother optionshellip

Ex 1 ndash Airline Call Button

Step 2 Configure Script For Call Button Pressed

Page 17: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 1 - Airline Call Button

Requirement

Define call button status as service with ability to toggle on and off using passive checks

Solution

Call button ON should cause status WARNING
Call button OFF should cause status OK

Ex 1 - Airline Call Button

Step 1: Define Passive Service Check

    define service {
        host_name                 airplane1.carrier.com
        service_description       Call Button - 1A
        is_volatile               1
        active_checks_enabled     0
        passive_checks_enabled    1
        ...other options...
    }

Ex 1 - Airline Call Button

Step 2: Configure Script For Call Button Pressed

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp/
    token=0vn53mbj3lk4

    host=airplane1.carrier.com
    service="Call Button - 1A"
    state=1
    output="WARNING - Call Button Pressed"

    # Quote the values that contain spaces so each reaches send_nrdp.sh as one argument
    $nrdp -u $url -t $token -H $host -s "$service" -S $state -o "$output"

State   Meaning
0       OK (GREEN)
1       WARNING (YELLOW)
2       CRITICAL (RED)
3       UNKNOWN (GREY)

Ex 1 - Airline Call Button

Step 3: Configure Script For Call Answered

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp/
    token=0vn53mbj3lk4

    host=airplane1.carrier.com
    service="Call Button - 1A"
    state=0
    output="OK - Call Answered"

    # Quote the values that contain spaces so each reaches send_nrdp.sh as one argument
    $nrdp -u $url -t $token -H $host -s "$service" -S $state -o "$output"
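The scripts in Steps 2 and 3 differ only in state and output, so they could be folded into one script that takes the event as an argument. The sketch below is one possible way to do that and is not part of the original deck: the function names `call_button_state` and `call_button_output` and the `pressed`/`answered` argument values are assumptions made for illustration, and the final `send_nrdp.sh` call is left commented out so the block has no side effects.

```shell
#!/bin/bash
# Hypothetical helper: map a call-button event to a Nagios state code.
call_button_state() {
    case "$1" in
        pressed)  echo 1 ;;   # WARNING
        answered) echo 0 ;;   # OK
        *)        echo 3 ;;   # UNKNOWN for anything unexpected
    esac
}

# Hypothetical helper: map the same event to the check output text.
call_button_output() {
    case "$1" in
        pressed)  echo "WARNING - Call Button Pressed" ;;
        answered) echo "OK - Call Answered" ;;
        *)        echo "UNKNOWN - Unrecognized event: $1" ;;
    esac
}

# Default to "pressed" when no argument is given.
event="${1:-pressed}"
state=$(call_button_state "$event")
output=$(call_button_output "$event")
echo "would send: state=$state output=$output"

# With nrdp, url, token, host and service set as in the scripts above:
# "$nrdp" -u "$url" -t "$token" -H "$host" -s "$service" -S "$state" -o "$output"
```

Keeping the mapping in one place means the service name and token only need to be maintained in a single script.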


Ex 2 - Backup Monitoring

Requirement

- DB backup should complete successfully at least 1 time per day. Let someone know if it doesn't.

Solution

- Send passive acknowledgement upon successful backup completion

- Use freshness checking to alert us any time the service has not checked in within 26 hours

Ex 2 - Backup Monitoring

Step 1: Define Passive Service Check

    define service {
        host_name                 backup-server
        service_description       Oracle DB Backup
        active_checks_enabled     0
        passive_checks_enabled    1
        check_freshness           1
        freshness_threshold       93600
        check_command             no-backup-report
        ...other options...
    }
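The `freshness_threshold` value is expressed in seconds, so the 26-hour window from the requirement becomes 26 x 3600 = 93600. A small sketch of that conversion follows; the `hours_to_seconds` helper is a hypothetical name used only for this illustration and is not part of Nagios.

```shell
#!/bin/bash
# Hypothetical helper: convert whole hours to seconds for freshness_threshold.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

# The 26-hour window presumably allows a daily backup plus some slack.
threshold=$(hours_to_seconds 26)
echo "freshness_threshold $threshold"   # prints: freshness_threshold 93600
```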

Ex 2 - Backup Monitoring

Step 2: Define Check Command

File: /usr/local/nagios/etc/commands.cfg

    define command {
        command_name    no-backup-report
        command_line    /usr/local/nagios/libexec/check_dummy 2 "Results of backup job were not reported"
    }

Note: check_dummy does nothing but exit with the given state, here 2 (CRITICAL), and display the message in "quotes"
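To make the freshness fallback concrete, the stand-in below mimics the behaviour the note describes: print the message and return the given state as the exit status. The function `dummy_check` is written for this sketch only and is not the actual `check_dummy` plugin.

```shell
#!/bin/bash
# Stand-in for illustration only; not the real check_dummy plugin.
# Prints the message and returns the requested state as its exit status.
dummy_check() {
    local state="$1"
    local message="$2"
    echo "$message"
    return "$state"
}

# Roughly what happens when the backup result goes stale:
# the command runs and its exit status becomes the service state.
status=0
dummy_check 2 "Results of backup job were not reported" || status=$?
echo "exit status: $status"
```

Because the exit status is fixed at 2, the service turns CRITICAL whenever Nagios falls back to this command.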


Ex 2 - Backup Monitoring

Step 3: Configure Client to Send Acknowledgement

    #!/bin/bash
    nrdp=/usr/local/nrdp/clients/send_nrdp.sh
    url=http://1044469/nrdp/
    token=0vn53mbj3lk4

    host=backup-server
    service="Oracle DB Backup"
    state=0
    output="OK - Backup Completed Successfully"

    # Quote the values that contain spaces so each reaches send_nrdp.sh as one argument
    $nrdp -u $url -t $token -H $host -s "$service" -S $state -o "$output"
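One way to wire Step 3 into the backup job is to send the passive OK only when the backup command exits successfully, so that a failed or missing run simply lets the freshness threshold expire. The sketch below assumes a hypothetical `run_backup` command and a `send_ok` function standing in for the `send_nrdp.sh` call; both names are placeholders and not part of the original deck.

```shell
#!/bin/bash
# Placeholder for the real backup job; replace with the actual command.
run_backup() {
    true
}

# Placeholder for the send_nrdp.sh call from the script above.
send_ok() {
    echo "state=0 output=OK - Backup Completed Successfully"
}

# Report success only if the given command succeeded; otherwise stay
# silent and let Nagios raise the freshness alert once the window passes.
report_backup() {
    if "$@"; then
        send_ok
    else
        echo "backup failed; no passive result sent" >&2
        return 1
    fi
}

report_backup run_backup
```

Staying silent on failure keeps the client simple: the server-side freshness check is the only alerting path, which matches the solution described for this example.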


Other Passive Use Cases

• Inaccessible
  • Device is behind a firewall and cannot be reached by Nagios

• Unpredictable
  • Device is mobile and IP address changes often

• Scalability
  • Aggregate multiple Nagios server statuses to a central server (distributed configuration)

Conclusion

Passive Checks
• Supported in both Nagios Core and Nagios XI
• Initiated by the client; authenticated and validated by the server
• Customizable with volatility, state stalking, and freshness checking
• Useful for detecting when events happen (e.g., security alerts) as well as when events STOP happening (e.g., backup monitoring)

Conclusion

NRDP - Nagios Remote Data Processor

• Server component: runs on the Nagios server
• Collects passive updates from clients and submits updates to Nagios Core
• Uses shared tokens for client/server authentication

http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NSCA can be used as an alternative, especially for Windows clients

Other Passive Examples

Function                      Volatile   State Stalking   Freshness   Freshness Threshold
"Lost" magic number entry     Disabled   Disabled         Enabled     108 minutes
Team member status reports    Enabled    Disabled         Enabled     1 month
Security event                Enabled    Enabled          Disabled    N/A
Backup success                Disabled   Disabled         Enabled     26 hours

Questions

Any questions?

Thanks!

The End

Jim Prins

jprins1229@gmail.com

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 18: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 1 ndash Airline Call Button

Step 1 Define Passive Service Check

define servicehost_name airplane1carriercomservice_description Call Button ndash 1Ais_volatile 1active_checks_enabled 0passive_checks_enabled 1hellipother optionshellip

Ex 1 ndash Airline Call Button

Step 2 Configure Script For Call Button Pressed

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=1output=ldquoWARNING ndash Call Button Pressed

$nrdp -u $url -t $token -H $host -s $service -S $state -o $outputldquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 1 ndash Airline Call Button

Step 3 Configure Script For Call Answered

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=0output=ldquoOK ndash Call Answered

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Requirement

ndash DB backup should complete successfully at least 1 time per day Let someone know if it doesnrsquot

Solution

ndash Send passive acknowledgement upon successful backup completion

ndash Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 19: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 1 ndash Airline Call Button

Step 2 Configure Script For Call Button Pressed

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=1output=ldquoWARNING ndash Call Button Pressed

$nrdp -u $url -t $token -H $host -s $service -S $state -o $outputldquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 1 ndash Airline Call Button

Step 3 Configure Script For Call Answered

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=0output=ldquoOK ndash Call Answered

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Requirement

ndash DB backup should complete successfully at least 1 time per day Let someone know if it doesnrsquot

Solution

ndash Send passive acknowledgement upon successful backup completion

ndash Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 20: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 1 ndash Airline Call Button

Step 3 Configure Script For Call Answered

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=airplane1carriercomservice=Call Button ndash 1Astate=0output=ldquoOK ndash Call Answered

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Requirement

ndash DB backup should complete successfully at least 1 time per day Let someone know if it doesnrsquot

Solution

ndash Send passive acknowledgement upon successful backup completion

ndash Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 21: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 2 ndash Backup Monitoring

Requirement

ndash DB backup should complete successfully at least 1 time per day Let someone know if it doesnrsquot

Solution

ndash Send passive acknowledgement upon successful backup completion

ndash Use freshness to alert us any time service has not checked in within 26 hours

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 22: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 2 ndash Backup Monitoring

Step 1 Define Passive Service Check

define servicehost_name backup-serverservice_description Oracle DB Backupactive_checks_enabled 0passive_checks_enabled 1check_freshness 1freshness_threshold 93600check_command no-backup-reporthellipother optionshellip

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 23: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 2 ndash Backup Monitoring

Step 2 Define Check Command

File usrlocalnagiosetccommandscfg

define commandcommand_name no-backup-reportcommand_line usrlocalnagioslibexeccheck_dummy 2 Results of

backup job were not reported

Note check_dummy does nothing but exit 2 (critical) and display the message in ldquoquotesrdquo

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP ndash Nagios Remote Data Processor

bull Server Component Runs on Nagios Serverbull Collects passive updates from clients and submits

updates to Nagios Corebull Uses shared tokens for clientserver authentication

httpassetsnagioscomdownloadsnrdpdocsNRDP_Overviewpdf

NSCA can be used as an alternative especially for Windows clients

Other Passive ExamplesFunction Volatile State Stalking Freshness Freshness

Threshold

ldquoLostrdquo Magic number entry

Disabled Disabled Enabled 108 Minutes

Team Member Status Reports

Enabled Disabled Enabled 1 Month

Security Event Enabled Enabled Disabled NA

Backup Success Disabled Disabled Enabled 26 Hours

Questions

Any questions

Thanks

The End

Jim Prins

jprins1229gmailcom

  • Passive Monitoring with Nagios
  • Introduction
  • Agenda ndash pt 1
  • Agenda ndash pt 2
  • Active vs Passive Checks
  • Use cases
  • Enabling Passive Checks
  • NRDP ndash Server Side
  • NRDP ndash Client Side
  • Volatility ndash pt 1
  • Volatility ndash pt 2
  • State Stalking ndash pt 1
  • State Stalking ndash pt 2
  • State Stalking ndash pt 3
  • Freshness ndash pt 1
  • Freshness ndash pt 2
  • Ex 1 ndash Airline Call Button
  • Ex 1 ndash Airline Call Button (2)
  • Ex 1 ndash Airline Call Button (3)
  • Ex 1 ndash Airline Call Button (4)
  • Ex 2 ndash Backup Monitoring
  • Ex 2 ndash Backup Monitoring (2)
  • Ex 2 ndash Backup Monitoring (3)
  • Ex 2 ndash Backup Monitoring (4)
  • Other Passive Use Cases
  • Conclusion
  • Conclusion (2)
  • Other Passive Examples
  • Questions
  • The End
Page 24: Nagios Conference 2014 - Jim Prins - Passive Monitoring with Nagios

Ex 2 ndash Backup Monitoring

Step 3 Configure Client to Send Acknowledgement

binbashnrdp=usrlocalnrdpclientssend_nrdpshurl=http1044469nrdptoken=0vn53mbj3lk4

host=backup-serverservice=Oracle DB Backupstate=0output=OK ndash Backup Completed Successfully

$nrdp -u $url -t $token -H $host -s $service -S $state -o $output

State Meaning

0 OK (GREEN)

1 WARNING (YELLOW)

2 CRITICAL (RED)

3 UNKNOWN (GREY)

Other Passive Use Cases

bull Inaccessiblebull Device is behind a firewall and cannot be

reached by Nagios

bull Unpredictablebull Device is mobile and IP address changes

often

bull Scalabilitybull Aggregate multiple Nagios server statuses to

a central server (Distributed configuration)

Conclusion

Passive Checksbull Supported in both Nagios Core and Nagios XIbull Initiated by the client authenticated and

validated by the serverbull Customizable with volatility state stalking and

freshness checkingbull Useful for detecting when events happen (ie

Security Alerts) as well as when events STOP happening (ie Backup Monitoring)

Conclusion

NRDP - Nagios Remote Data Processor

• Server Component: Runs on Nagios Server
• Collects passive updates from clients and submits updates to Nagios Core
• Uses shared tokens for client/server authentication

http://assets.nagios.com/downloads/nrdp/docs/NRDP_Overview.pdf

NSCA can be used as an alternative, especially for Windows clients
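
The token-authenticated submission that NRDP performs can also be sketched without `send_nrdp.sh`. The following is a rough illustration under stated assumptions, not the NRDP client itself: `xml_escape`, `nrdp_xml`, and `nrdp_submit` are hypothetical names, and the `cmd=submitcheck` and `XMLDATA` form fields follow the NRDP overview document as I understand it, so they should be checked against the installed NRDP version.

```shell
#!/bin/bash

# Escape the characters that are special inside XML text nodes.
xml_escape() {
  printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Build a check-result document for a single service.
# Arguments: host, service, state (0-3), output text.
nrdp_xml() {
  local host service state output
  host=$(xml_escape "$1")
  service=$(xml_escape "$2")
  state=$3
  output=$(xml_escape "$4")
  printf "<?xml version='1.0'?>\n"
  printf "<checkresults>\n"
  printf "<checkresult type='service'>\n"
  printf "<hostname>%s</hostname>\n" "$host"
  printf "<servicename>%s</servicename>\n" "$service"
  printf "<state>%s</state>\n" "$state"
  printf "<output>%s</output>\n" "$output"
  printf "</checkresult>\n"
  printf "</checkresults>\n"
}

# POST the document to the NRDP endpoint with the shared token.
# Arguments: url, token, xml. Assumed field names: token, cmd, XMLDATA.
nrdp_submit() {
  curl -fsS \
    --data-urlencode "token=$2" \
    --data-urlencode "cmd=submitcheck" \
    --data-urlencode "XMLDATA=$3" \
    "$1"
}
```

A call such as `nrdp_submit "$url" "$token" "$(nrdp_xml backup-server "Oracle DB Backup" 0 "OK - Backup Completed Successfully")"` would then stand in for the `send_nrdp.sh` invocation; the server decides whether to accept the message based on the token.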

Other Passive Examples

Function                     Volatile  State Stalking  Freshness  Freshness Threshold
"Lost" Magic number entry    Disabled  Disabled        Enabled    108 Minutes
Team Member Status Reports   Enabled   Disabled        Enabled    1 Month
Security Event               Enabled   Enabled         Disabled   N/A
Backup Success               Disabled  Disabled        Enabled    26 Hours
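
Nagios takes `freshness_threshold` in seconds, so the thresholds in the table need converting before they go into a service definition. A minimal sketch of that arithmetic follows; `to_seconds` is a hypothetical helper, and treating "1 Month" as 30 days is an assumption.

```shell
#!/bin/bash

# Hypothetical helper: convert "<number> <unit>" into seconds.
to_seconds() {
  local n=$1 unit=$2
  case "$unit" in
    minute|minutes) echo $(( n * 60 )) ;;
    hour|hours)     echo $(( n * 3600 )) ;;
    day|days)       echo $(( n * 86400 )) ;;
    *) echo "unknown unit: $unit" >&2; return 1 ;;
  esac
}

to_seconds 108 minutes   # prints 6480    ("Lost" magic number entry)
to_seconds 26 hours      # prints 93600   (Backup Success)
to_seconds 30 days       # prints 2592000 (Team Member Status Reports, assuming a 30-day month)
```

The 26-hour value for a daily backup leaves two hours of slack beyond the 24-hour cycle, so a backup that merely runs late does not trip the freshness check.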

Questions

Any questions?

Thanks

The End

Jim Prins

jprins1229@gmail.com
