Proactive monitoring with Monit
Developer Toolbox Series
Rafael Luque (OSOCO)
September 2015

Barking at daemons

A small open source utility to monitor Unix systems, with automatic error recovery capabilities.

What Monit can monitor

Files, Dirs and Filesystems

Monitor these items for changes, such as timestamp, checksum or size changes.

Hosts

Monitor network connections to various servers, either on localhost or on remote hosts. TCP, UDP and Unix Domain Sockets are supported. Network tests can be performed on a protocol level.

System

General system resources on localhost such as overall CPU usage, Memory and Load Average.

Processes

Daemon processes or similar programs running on localhost, such as those started at system boot time from /etc/init.d/

Programs and scripts

Run programs or scripts at certain times, much like cron. In addition, you can test the exit value of a program and perform an action or send an alert if it indicates an error.
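
The examples later in this deck cover every item type above except System. As a minimal sketch, assuming a service named myhost (a hypothetical name) and illustrative thresholds that are not taken from the talk, a system check could look like this:

```
check system myhost                         # "myhost" is a hypothetical service name
    if loadavg (1min) > 4 then alert        # thresholds are illustrative only
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
```

The right thresholds depend on the machine, so treat these values as placeholders to tune.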

1. Global configuration

Configuration (i)

◉ Global configuration file at /etc/monitrc.

◉ Sample global configuration:

  ○ Check services at 30-second intervals:

    set daemon 30
    #   with start delay 240  # optional: delay the first check by 4 minutes (by
    #                         # default Monit checks immediately after it starts)

Configuration (ii)

◉ Set Monit’s logfile:

    set logfile /var/log/monit.log

◉ Mail configuration:

    set mailserver localhost

    # By default Monit will drop alert events if no mail servers are available.
    # If you want to keep the alerts for later delivery retry, you can use the
    # EVENTQUEUE statement.
    set eventqueue
        basedir /var/monit  # set the base directory where events will be stored
        slots 100           # optionally limit the queue size

Configuration (iii)

## Alert email recipient:
set alert [email protected]

## Alert email format:
set mail-format {
    from: monit@$HOST
    subject: monit alert -- $EVENT $SERVICE
    message: $EVENT Service $SERVICE
             Date:        $DATE
             Action:      $ACTION
             Host:        $HOST
             Description: $DESCRIPTION

             Your faithful employee,
             Monit
}
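
A recipient set this way receives every event. As a hedged sketch of narrowing that, assuming a placeholder address ops@example.com, the event filter below skips the alerts raised when Monit itself starts or stops, or when a user triggers an action by hand:

```
# ops@example.com is a placeholder address, not one used in the talk
set alert ops@example.com not on { instance, action }
```

This kind of filter can reduce mail noise during routine maintenance. Whether to suppress those events is a judgment call for each team.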

Configuration (iv)

◉ HTTP interface:

    set httpd port 2812 and
        allow admin:monit  # require user 'admin' with password 'monit'

◉ Additional configuration files:

    include /etc/monit.d/*
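
The HTTP interface statement above protects access with a password but does not restrict which hosts may connect. A minimal sketch of a tighter setup, assuming the interface only needs to be reached from the machine Monit runs on:

```
set httpd port 2812 and
    use address localhost   # listen on the loopback interface only
    allow localhost         # accept connections from localhost only
    allow admin:monit       # still require user 'admin' with password 'monit'
```

With this variant, remote access would need something like an SSH tunnel, which is an assumption about the deployment rather than part of the talk.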

2. Basic usage

Basic commands (i)

Controlled from the command line with the monit command:

◉ Start Monit daemon:

    $ monit

◉ Exit Monit:

    $ monit quit

◉ Status summary:

    $ monit summary

◉ Disable monitoring of a named service or all services:

    $ monit unmonitor name
    $ monit unmonitor all

◉ Enable monitoring:

    $ monit monitor name
    $ monit monitor all

Basic commands (ii)

◉ Start named service or all services:

    $ monit start name
    $ monit start all

◉ Stop named service or all services:

    $ monit stop name
    $ monit stop all

◉ Restart named service or all services:

    $ monit restart name
    $ monit restart all

3. Monitoring examples

Simple process monitoring

check process tomcat-8 with pidfile /var/run/tomcat-8.pid

Proactive process monitoring

check process tomcat-8 with pidfile /var/run/tomcat-8.pid
    start program = "/etc/init.d/tomcat-8 start"
    stop program = "/etc/init.d/tomcat-8 stop"

Restart process if it has stopped accepting connections

check process tomcat-8 with pidfile /var/run/tomcat-8.pid
    start program = "/etc/init.d/tomcat-8 start"
    stop program = "/etc/init.d/tomcat-8 stop"
    restart program = "/etc/init.d/tomcat-8 restart"
    if failed port 8080 protocol http then restart

Restart process if it has stopped accepting connections, avoiding false positives

check process tomcat-8 with pidfile /var/run/tomcat-8.pid
    start program = "/etc/init.d/tomcat-8 start"
    stop program = "/etc/init.d/tomcat-8 stop"
    restart program = "/etc/init.d/tomcat-8 restart"
    if failed port 8080 protocol http for 2 cycles then restart

Check process response to requests

check process apache with pidfile /usr/local/apache/logs/httpd.pid
    start program = "/etc/init.d/httpd start"
    stop program = "/etc/init.d/httpd stop"
    if failed host www.tildeslash.com port 80 protocol http
        and request "/somefile.html"
        then restart
    if failed port 443 type tcpssl protocol http
        with timeout 15 seconds
        then restart

Avoid noisy alarms

check process apache with pidfile /usr/local/apache/logs/httpd.pid
    start program = "/etc/init.d/httpd start"
    stop program = "/etc/init.d/httpd stop"
    if failed host www.tildeslash.com port 80 protocol http
        and request "/somefile.html"
        then restart
    if failed port 443 type tcpssl protocol http
        with timeout 15 seconds
        then restart
    if 3 restarts within 5 cycles then unmonitor

Check resources used by process (e.g. DoS attacks)

check process apache with pidfile /usr/local/apache/logs/httpd.pid
    start program = "/etc/init.d/httpd start" with timeout 60 seconds
    stop program = "/etc/init.d/httpd stop"
    if cpu > 60% for 2 cycles then alert
    if cpu > 80% for 5 cycles then restart
    if totalmem > 200.0 MB for 5 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 8 cycles then stop
    if failed host www.tildeslash.com port 80 protocol http
        and request "/somefile.html"
        then restart
    if failed port 443 type tcpssl protocol http
        with timeout 15 seconds
        then restart
    if 3 restarts within 5 cycles then unmonitor

Monitor filesystem space and inode usage

check filesystem datafs with path /dev/sdb1
    start program = "/bin/mount /data"
    stop program = "/bin/umount /data"
    if space usage > 80% for 5 times within 15 cycles then alert
    if space usage > 99% then stop
    if inode usage > 30000 then alert
    if inode usage > 99% then stop

Monitor file checksum (e.g. rootkits)

check file apache with path /usr/sbin/httpd
    if failed checksum then alert
    if failed uid root then alert
    if failed gid root then alert
    if failed permission 755 then alert

Monitor a directory that should change

check directory incoming with path /var/data/ftp
    if timestamp > 1 hour then alert

Check network interface status

check network eth0 with interface eth0
    start program = '/etc/init.d/net.eth0 start'
    stop program = '/etc/init.d/net.eth0 stop'
    if failed link then restart

Check network link capacity changes

check network eth0 with interface eth0
    if changed link capacity then alert

Check network link usage (saturation, bandwidth)

check network eth0 with interface eth0
    if saturation > 90% then alert
    if upload > 500 kB/s then alert
    if total download > 1 GB in last 2 hours then alert
    if total download > 10 GB in last day then alert

Check remote host availability by issuing a ping test

check host osoco.es with address osoco.es
    if failed ping then alert

Check the content of a response from a web server

check host myserver with address 192.168.1.1
    if failed port 80 protocol http
        and request /some/path with content = "a string"
        then alert

Check connection with custom protocol (MySQL)

check host databaserver with address 192.168.1.1
    if failed ping then alert
    if failed
        port 3306
        protocol mysql username foo password bar
        then alert

Check custom program status output

check program myscript with path /usr/local/bin/myscript.sh
    if status != 0 then alert

Check custom program every workday during the 8 AM hour

check program checkOracleDatabase
    with path /var/monit/programs/checkoracle.pl
    every "* 8 * * 1-5"
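
Besides the cron-style schedule above, Monit's every statement has two other forms that may be useful. The sketch below is illustrative only: the tomcat-8 service is reused from the earlier examples, and the mysqld service and its pidfile path are assumptions.

```
# Poll on every second cycle instead of every cycle
check process tomcat-8 with pidfile /var/run/tomcat-8.pid
    every 2 cycles

# Skip checks during a maintenance window (Sundays, 00:00 to 03:59)
check process mysqld with pidfile /var/run/mysqld.pid   # hypothetical pidfile path
    not every "* 0-3 * * 0"
```

The negated form is handy for backup windows, where a service is expected to be slow or unavailable and alerts would only add noise.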

Check service dependencies before start/stop/monitor/unmonitor

check process apache
    with pidfile "/usr/local/apache/logs/httpd.pid"
    ...
    depends on httpd

check file httpd with path /usr/local/apache/bin/httpd
    if failed checksum then unmonitor

Hierarchy of dependencies

check process apache
    ...
    depends on tomcat

check process tomcat
    ...
    depends on mysql

check process mysql
    ...
    depends on datafs

check filesystem datafs with path /dev/sdb1
    start program = "/bin/mount /data"
    stop program = "/bin/umount /data"

4. Web interface

Monit web interface

One interface to rule them all

◉ M/Monit:

  ○ Monitoring and management of all your Monit hosts.

  ○ Also works on mobile devices.

  ○ A one-time payment and the license is perpetual.

One interface to rule them all

◉ Monittr:

  ○ https://github.com/karmi/monittr

  ○ Free and very basic option.

Demo time