
GSR - TP3

Monitoring with NAGIOS and MRTG using SNMP and RMON

José Neto - 2001113213

Flávio Guilherme - 2007183058

DEI-FCTUC [email protected] - [email protected]


Machine Configuration

For this assignment we used three virtual machines: two running Fedora 13 and one running Windows XP.

One of the Linux servers was configured as the monitoring station for some services and equipment of Laboratory G5.1 and the DEI network. On this server we configured MRTG first and then NAGIOS.

A Windows server and a Linux server (Fedora 13) were also configured to support SNMP. The requested scenario also included the laboratory's 3Com switch and the DEI outbound gateway, both of which already support SNMP.

Finally, we used a DEI service that sends SMS messages from emails.

Monitoring Tools: MRTG

MRTG was set up exactly as described in the course manual.

After running those commands, SNMP had to be installed on the router.

yum -y install net-snmp net-snmp-utils

To start the SNMP service on the router gsrtp3.dei.uc.pt, we used the following command.

service snmpd start

Next, we had to create an HTML index page that aggregates the graphs MRTG produces for the router's interfaces. First, a configuration file for MRTG was generated:

/usr/local/mrtg-2/bin/cfgmaker --global 'WorkDir: /var/www/html/mrtg' --global 'Options[_]: bits,growright' --output=/etc/mrtg.cfg [email protected]

Then the index page that aggregates the graphs produced by MRTG was built:

/usr/local/mrtg-2/bin/indexmaker --output=/var/www/html/mrtg/index.html /etc/mrtg.cfg

Access to the MRTG pages was required to ask for a login and password, so we had to configure authentication, using the following commands.


make install-webconf

/usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios/nagios.conf

yum install dbmmanage

mkdir users

htdbm -c /var/www/users/.dbmpasswd mrtgadmin

The following lines were then added to the httpd.conf file:

<Location /mrtg>
AuthType basic
AuthName "mrtg"
AuthBasicProvider dbm
AuthDBMType SDBM
AuthDBMUserFile /var/www/users/.dbmpasswd
Require valid-user
</Location>

Finally, it was enough to open the browser and log in with the password chosen in the previous step to see the seven interfaces available on gtDEI.dei.uc.pt. Since only interfaces 3 and 5 matter in this case, we removed the HTML code for the remaining interfaces.

Monitoring Tools: Nagios

NAGIOS was set up exactly as described in the quickstart guide hosted at the following address:

http://nagios.sourceforge.net/docs/3_0/quickstart-fedora.html

The commands we considered most important are shown below.

yum install php gcc glibc glibc-common gd gd-devel openssl-devel

useradd -m nagios

passwd nagios

groupadd nagcmd

usermod -a -G nagcmd nagios

usermod -a -G nagcmd apache

mkdir ~/downloads

cd ~/downloads/

wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.2.3.tar.gz

wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz


tar xzf nagios-3.2.3.tar.gz

cd nagios-3.2.3

./configure --with-command-group=nagcmd

make all

make install

make install-init

make install-config

make install-commandmode

Next, the contact details of the Nagios administrator had to be defined.

gedit /usr/local/nagios/etc/objects/contacts.cfg


In the next step, as with MRTG, a login and password had to be set, in this case for the NAGIOS configuration in Apache.

make install-webconf

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

service httpd restart

Installation of the additional NAGIOS plugins that were requested.

tar xzf nagios-plugins-1.4.11.tar.gz

cd nagios-plugins-1.4.11

./configure --with-nagios-user=nagios --with-nagios-group=nagios

make

make install

chkconfig --add nagios

chkconfig nagios on

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

service nagios start


Monitoring Tools: MIB Browser

Although it is not a monitoring tool in itself, it was an essential instrument for this work.

As the previous figure shows, it is an extremely useful tool for exploring the various MIBs used in this work. We discuss them in more detail further on; here we only note that we found its interface more intuitive than that of snmpwalk. It can also emulate the generation of traps, as the following figure shows:


Machine configuration: Linux

We started by installing SNMP:

yum install snmp

We then edited the snmpd.conf file and changed the following parameters:

#traps

# warn below 10% free

includeAllDisks 10%

# warn when less than 1 GB is free

disk / 1000000

# give access to localhost and to the internal network

# sec.name source community

com2sec local localhost public

com2sec mynetwork 172.16.168.0/24 public

# two groups were needed: one for localhost and one for the network

# groupName securityModel securityName

group MyRWGroup v1 local

group MyRWGroup v2c local

group MyRWGroup usm local

group MyROGroup v1 mynetwork

group MyROGroup v2c mynetwork

group MyROGroup usm mynetwork

# grant access to everything when the connection comes from the local network

# group context sec.model sec.level prefix read write notif

access MyROGroup "" any noauth exact all none none

access MyRWGroup "" any noauth exact all all none

# create an "all" view that is then used to grant access to the groups

## incl/excl subtree mask

view all included .1 80

#view rwview included system.sysLocation

syslocation Numa vm, provavelmente ali ao lado (a correr Fedora)

syscontact david <[email protected]>
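The access rules above can be read as a chain: com2sec maps a source address and community to a security name, group maps that name to a group, and access gives the group its read and write views. The following Python sketch mirrors that chain for the values in this file. It is only an illustration under assumptions, not how snmpd is implemented: the names COM2SEC, GROUPS, ACCESS, security_name and can_write are hypothetical, and "localhost" is assumed to mean 127.0.0.1.

```python
import ipaddress

# (security name, source network, community), copied from the com2sec lines.
# Assumption: "localhost" is represented here as 127.0.0.1/32.
COM2SEC = [
    ("local", "127.0.0.1/32", "public"),
    ("mynetwork", "172.16.168.0/24", "public"),
]

# security name -> group, from the group lines
GROUPS = {"local": "MyRWGroup", "mynetwork": "MyROGroup"}

# group -> (read view, write view), from the access lines
ACCESS = {"MyROGroup": ("all", "none"), "MyRWGroup": ("all", "all")}


def security_name(source_ip, community):
    """Return the security name of the first matching com2sec entry, or None."""
    addr = ipaddress.ip_address(source_ip)
    for name, network, expected_community in COM2SEC:
        if community == expected_community and addr in ipaddress.ip_network(network):
            return name
    return None


def can_write(source_ip, community):
    """True when the request maps to a group whose write view is not 'none'."""
    name = security_name(source_ip, community)
    if name is None:
        return False
    _read_view, write_view = ACCESS[GROUPS[name]]
    return write_view != "none"
```

With these tables, a request from 172.16.168.128 using the community public maps to the read-only group, while a request from localhost may also write.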


Since we were going to monitor the HTTP/HTTPS, IMAP/IMAPS, POP3/POP3S and SNMP services, we configured this machine as described in SSC assignment 1 and GSR assignment 2, so we do not go into detail here.

Machine configuration: Windows

To monitor a Windows machine over SNMP, the Simple Network Management Protocol (SNMP) component must be installed, which enables the SNMP Service.

While configuring Nagios we noticed that some values were not available in the standard MIB (SNMPv2-MIB), so we downloaded SNMP Informant, which gives access to information that was not available before, as the following figure shows:


Specific configuration: snmptrapd

The only configuration needed to use snmptrapd is to tell it which program handles the requests received by the machine, in this case snmptt. This is done by adding the following line to the configuration file /etc/snmp/snmptrapd.conf:

traphandle default /usr/sbin/snmptthandler

To run snmptrapd as a service, the -On option had to be added in the file /etc/rc.d/init.d/snmptrapd:

OPTIONS="-Lsd -p /var/run/snmptrapd.pid -On"

Specific configuration: snmptt

We started by installing snmptt, downloading version 1.3. The files come precompiled, and the three executables (snmptt, snmptthandler and snmpttconvertmib) were copied to the /usr/sbin folder.

To use snmptt we had to install several Perl libraries. Because we installed them with MCPAN, we no longer know exactly which ones they were (while any are still missing, snmptt reports which ones it needs).

To run these files as a service we had to execute the following command:

cp snmptt-init.d /etc/rc.d/init.d/snmptt

We also placed the snmptt.ini file in the /etc/snmp folder, editing the following line:

# Set to either 'standalone' or 'daemon'

# standalone: snmptt called from snmptrapd.conf

# daemon: snmptrapd.conf calls snmptthandler

# Ignored by Windows. See documentation

mode = daemon


We then used snmpttconvertmib to extract all the traps defined in the MIBs we intended to use:

snmpttconvertmib --in=/usr/share/snmp/mibs/SNMPv2-MIB.txt --out=/etc/snmp/snmptt.conf

snmpttconvertmib --in=/usr/share/snmp/mibs/UCD-SNMP-MIB.txt --out=/etc/snmp/snmptt.conf

As a result, we obtained a file listing the traps that could occur, in the following format:

MIB: SNMPv2-MIB (file:/usr/share/snmp/mibs/SNMPv2-MIB.txt) converted on Mon Dec 19 17:03:00 2011 using snmpttconvertmib v1.3

#

#

#

EVENT coldStart .1.3.6.1.6.3.1.1.5.1 "Status Events" Normal

FORMAT A coldStart trap signifies that the SNMP entity, $*

SDESC

A coldStart trap signifies that the SNMP entity,

supporting a notification originator application, is

reinitializing itself and that its configuration may

have been altered.

Variables:

EDESC

EXEC /usr/local/nagios/libexec/eventhandlers/submit_check_result $r TRAP 2 "Problem at $H!! coldStart"

The "EXEC" line was added by us. It is a command that makes Nagios register a problem (through the nagios.cmd file) whenever a trap is generated on one of the monitored servers. In this case we submit it with level 2 (Critical), together with the host name and the trap that was generated.

Like the coldStart entry above, a separate action was created for each trap.
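To make the EXEC mechanism more concrete, the sketch below shows the kind of line that ends up in the Nagios command file when a passive result is submitted. The line format is that of the Nagios PROCESS_SERVICE_CHECK_RESULT external command. The function names are hypothetical, and the command file path is an assumption (the usual default for a source install; this report only refers to the file as nagios.cmd).

```python
import time

# Assumed default location of the external command file for a source install;
# the report only refers to it as nagios.cmd.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def passive_check_line(host, service, return_code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}".format(
        timestamp, host, service, return_code, output
    )


def submit(line, command_file=COMMAND_FILE):
    """Write the command to the Nagios command file (normally a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line + "\n")
```

For the coldStart entry, the host would be the value snmptt substitutes for $r, the service would be TRAP, and the return code 2 corresponds to Critical.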

define service{

host_name 3com

service_description TRAP

is_volatile 1

check_command check-host-alive

max_check_attempts 1

normal_check_interval 1

retry_check_interval 1

active_checks_enabled 0

passive_checks_enabled 1

check_period 24x7

notification_interval 1

notification_period 24x7


notification_options u,c,w

notifications_enabled 1

contact_groups admins

}

The example above shows the Nagios configuration for the 3com switch.

Although the assignment asked us to monitor traps only from the 3com switch, it was not easy to make the switch generate traps, so we chose to replicate the configuration for the Windows machine (results are presented in the testing section further on).

Specific configuration: Nagios

Templates:

Most of the Nagios configuration inherits from one of these templates.

###############################################################################

###############################################################################

#

# CONTACT TEMPLATES

#

###############################################################################

###############################################################################

# Generic contact definition template - This is NOT a real contact, just a template!

define contact{

name generic-contact ; The name of this contact template

service_notification_period 24x7 ; service notifications can be sent anytime

host_notification_period 24x7 ; host notifications can be sent anytime

service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events

host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events

service_notification_commands notify-service-by-email ; send service notifications via email

host_notification_commands notify-host-by-email ; send host notifications via email

register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!

}


define contact{

name generic-sms ; The name of this contact template

service_notification_period 24x7 ; service notifications can be sent anytime

host_notification_period 24x7 ; host notifications can be sent anytime

host_notification_options d,u,r ; send notifications for down, unreachable, and recovery states

service_notification_commands notify-service-by-email ; send service notifications via email

host_notification_commands notify-host-by-sms ; send host notifications via SMS

register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!

}

###############################################################################

###############################################################################

#

# HOST TEMPLATES

#

###############################################################################

###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{

name generic-host ; The name of this host template

notifications_enabled 1 ; Host notifications are enabled

event_handler_enabled 1 ; Host event handler is enabled

flap_detection_enabled 1 ; Flap detection is enabled

failure_prediction_enabled 1 ; Failure prediction is enabled

process_perf_data 1 ; Process performance data

retain_status_information 1 ; Retain status information across program restarts

retain_nonstatus_information 1 ; Retain non-status information across program restarts

notification_period 24x7 ; Send host notifications at any time

register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!

}

# Linux host definition template - This is NOT a real host, just a template!


define host{

name linux-server ; The name of this host template

use generic-host ; This template inherits other values from the generic-host template

check_period 24x7 ; By default, Linux hosts are checked round the clock

check_interval 5 ; Actively check the host every 5 minutes

retry_interval 2 ; Schedule host check retries at 2 minute intervals

max_check_attempts 10 ; Check each Linux host 10 times (max)

check_command check-host-alive ; Default command to check Linux hosts

notification_period 24x7 ; Send notification out at any time - day or night

notification_interval 1 ; Resend notifications every minute

notification_options d,r ; Only send notifications for specific host states

contact_groups admins,sms ; Notifications get sent to the admins by default

hostgroups linux-servers

register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!

}

# Windows host definition template - This is NOT a real host, just a template!

define host{

name windows-server ; The name of this host template

use generic-host ; Inherit default values from the generic-host template

check_period 24x7 ; By default, Windows servers are monitored round the clock

check_interval 5 ; Actively check the server every 5 minutes

retry_interval 2 ; Schedule host check retries at 2 minute intervals

max_check_attempts 10 ; Check each server 10 times (max)

check_command check-host-alive ; Default command to check if servers are "alive"

notification_period 24x7 ; Send notification out at any time - day or night

notification_interval 1 ; Resend notifications every minute

notification_options d,r ; Only send notifications for specific host states

contact_groups admins,sms ; Notifications get sent to the admins by default

hostgroups windows-servers ; Host groups that Windows servers should be a member of

register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE

}


# We define a generic printer template that can be used for most printers we monitor

define host{

name generic-printer ; The name of this host template

use generic-host ; Inherit default values from the generic-host template

check_period 24x7 ; By default, printers are monitored round the clock

check_interval 5 ; Actively check the printer every 5 minutes

retry_interval 1 ; Schedule host check retries at 1 minute intervals

max_check_attempts 10 ; Check each printer 10 times (max)

check_command check-host-alive ; Default command to check if printers are "alive"

notification_period workhours ; Printers are only used during the workday

notification_interval 30 ; Resend notifications every 30 minutes

notification_options d,r ; Only send notifications for specific host states

contact_groups admins ; Notifications get sent to the admins by default

register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE

}

# Define a template for switches that we can reuse

define host{

name generic-switch ; The name of this host template

use generic-host ; Inherit default values from the generic-host template

check_period 24x7 ; By default, switches are monitored round the clock

check_interval 5 ; Switches are checked every 5 minutes

retry_interval 1 ; Schedule host check retries at 1 minute intervals

max_check_attempts 10 ; Check each switch 10 times (max)

check_command check-host-alive ; Default command to check if routers are "alive"

notification_period 24x7 ; Send notifications at any time

notification_interval 5 ; Resend notifications every 5 minutes

notification_options d,r ; Only send notifications for specific host states

contact_groups admins,sms ; Notifications get sent to the admins by default

register 0 ; DONT REGISTER THIS - ITS JUST A TEMPLATE

}


###############################################################################

###############################################################################

#

# SERVICE TEMPLATES

#

###############################################################################

###############################################################################

# Generic service definition template - This is NOT a real service, just a template!

define service{

name generic-service ; The 'name' of this service template

active_checks_enabled 1 ; Active service checks are enabled

passive_checks_enabled 1 ; Passive service checks are enabled/accepted

parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)

obsess_over_service 1 ; We should obsess over this service (if necessary)

check_freshness 0 ; Default is to NOT check service 'freshness'

notifications_enabled 1 ; Service notifications are enabled

event_handler_enabled 1 ; Service event handler is enabled

flap_detection_enabled 1 ; Flap detection is enabled

failure_prediction_enabled 1 ; Failure prediction is enabled

process_perf_data 1 ; Process performance data

retain_status_information 1 ; Retain status information across program restarts

retain_nonstatus_information 1 ; Retain non-status information across program restarts

is_volatile 0 ; The service is not volatile

check_period 24x7 ; The service can be checked at any time of the day

max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state

normal_check_interval 10 ; Check the service every 10 minutes under normal conditions

retry_check_interval 2 ; Re-check the service every two minutes until a hard state can be determined

contact_groups admins ; Notifications get sent out to everyone in the 'admins' group

notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events

notification_interval 60 ; Re-notify about service problems every hour

notification_period 24x7 ; Notifications can be sent out at any time


register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!

}

# Local service definition template - This is NOT a real service, just a template!

define service{

name local-service ; The name of this service template

use generic-service ; Inherit default values from the generic-service definition

max_check_attempts 4 ; Re-check the service up to 4 times in order to determine its final (hard) state

normal_check_interval 5 ; Check the service every 5 minutes under normal conditions

retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined

register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!

}
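The templates above rely on the use directive: a definition takes its values from the named template, and any value it sets itself overrides the inherited one. The Python sketch below illustrates that resolution for a few directives copied from generic-service and local-service. It is a simplification under assumptions (a single parent per definition, and name and register treated as not inherited); TEMPLATES, NOT_INHERITED and resolve are hypothetical names, not part of Nagios.

```python
# A few directives copied from the service templates above.
TEMPLATES = {
    "generic-service": {
        "max_check_attempts": "3",
        "normal_check_interval": "10",
        "retry_check_interval": "2",
        "contact_groups": "admins",
        "register": "0",
    },
    "local-service": {
        "use": "generic-service",
        "max_check_attempts": "4",
        "normal_check_interval": "5",
        "retry_check_interval": "1",
        "register": "0",
    },
}

# Directives that describe the template itself and are not passed on to children.
NOT_INHERITED = {"name", "use", "register"}


def resolve(name, templates=TEMPLATES):
    """Return the effective directives of a definition, following its 'use' chain."""
    definition = templates[name]
    parent = definition.get("use")
    inherited = resolve(parent, templates) if parent else {}
    effective = {k: v for k, v in inherited.items() if k not in NOT_INHERITED}
    # Values set locally take precedence over inherited ones.
    effective.update({k: v for k, v in definition.items() if k != "use"})
    return effective
```

Resolving local-service this way keeps contact_groups from generic-service while the three check settings come from local-service itself.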

Gateways:

To monitor the 3com switch and gtdei.dei.uc.pt we used the following configuration:

###########################################################################

###########################################################################

#

# HOST DEFINITIONS

#

###########################################################################

###########################################################################

# Define the switch that we'll be monitoring

define host{

use generic-switch ; Inherit default values from a template

host_name gtdei.dei.uc.pt ; The name we're giving to this switch

alias gtdei.dei.uc.pt Switch ; A longer name associated with the switch

address 193.137.203.254 ; IP address of the switch

hostgroups switcheandrouters ; Host groups this switch is associated with

}

define host{

use generic-switch ; Inherit default values from a template

host_name 3com ; The name we're giving to this switch

alias switch sala de redes Switch ; A longer name associated with the switch

address 10.254.0.200 ; IP address of the switch

hostgroups switcheandrouters ; Host groups this switch is associated with


}

###############################################################################

###############################################################################

#

# HOST GROUP DEFINITIONS

#

###############################################################################

###############################################################################

# Create a new hostgroup for switches

define hostgroup{

hostgroup_name switcheandrouters ; The name of the hostgroup

alias Network Switches and routers ; Long name of the group

}

###############################################################################

###############################################################################

#

# SERVICE DEFINITIONS

#

###############################################################################

###############################################################################

# Create a service to PING to switch

define service{

use generic-service ; Inherit values from a template

host_name gtdei.dei.uc.pt ; The name of the host the service is associated with

service_description PING ; The service description

check_command check_ping!200.0,20%!600.0,60% ; The command used to monitor the service

normal_check_interval 5 ; Check the service every 5 minutes under normal conditions

retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined

}

define service{

use generic-service ; Inherit values from a template

host_name 3com ; The name of the host the service is associated with

service_description PING ; The service description

check_command check_ping!200.0,20%!600.0,60% ; The command used to monitor the service

normal_check_interval 5 ; Check the service every 5 minutes under normal conditions

retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined

}


# Monitor uptime via SNMP

define service{

use generic-service ; Inherit values from a template

host_name gtdei.dei.uc.pt

service_description Uptime

check_command check_snmp!-C public -o sysUpTime.0

}

define service{

use generic-service ; Inherit values from a template

host_name 3com

service_description Uptime

check_command check_snmp!-C public -o sysUpTime.0

}

# Monitor Port 3 status via SNMP

define service{

use generic-service ; Inherit values from a template

host_name gtdei.dei.uc.pt

service_description Port 3 Link Status

check_command check_snmp!-C public -o ifOperStatus.5 -r 1 -m RFC1213-MIB

}

# Monitor Port 5 status via SNMP

define service{

use generic-service ; Inherit values from a template

host_name gtdei.dei.uc.pt

service_description Port 5 Link Status

check_command check_snmp!-C public -o ifOperStatus.7 -r 1 -m RFC1213-MIB

}

# Monitor Port 198 and 199 status via SNMP (3com)

define service{

use generic-service ; Inherit values from a template

host_name 3com

service_description Port 198 Link Status

check_command check_snmp!-C public -o ifOperStatus.198 -r 1 -m RFC1213-MIB

}

define service{

use generic-service ; Inherit values from a template


host_name 3com

service_description Port 199 Link Status

check_command check_snmp!-C public -o ifOperStatus.199 -r 1 -m RFC1213-MIB

}

# Monitor bandwidth via MRTG logs

define service{

use generic-service ; Inherit values from a template

host_name gtdei.dei.uc.pt

service_description Port 3 Bandwidth Usage

check_command check_local_mrtgtraf!/var/www/mrtg/gtdei.dei.uc.pt_3.log!AVG!1000000,1000000!5000000,5000000!10

}

define service{

use generic-service ; Inherit values from a template

host_name gtdei.dei.uc.pt

service_description Port 5 Bandwidth Usage

check_command check_local_mrtgtraf!/var/www/mrtg/gtdei.dei.uc.pt_5.log!AVG!1000000,1000000!5000000,5000000!10

}

define service{

host_name 3com

service_description TRAP

is_volatile 1

check_command check-host-alive

max_check_attempts 1

normal_check_interval 1

retry_check_interval 1

active_checks_enabled 0

passive_checks_enabled 1

check_period 24x7

notification_interval 1

notification_period 24x7

notification_options u,c,w

notifications_enabled 1

contact_groups admins

}

Note that the bandwidth checks use the commands specific to reading MRTG logs, while all the other checks use SNMP.

Note also that although the ports on the gateway are named 3 and 5, their corresponding interface indexes on the machine are 5 and 7.

We also added 2 ports of the switch (199 and 198), since we were using only virtual machines in this work.

Finally, the trap configuration is attached to the 3com host.
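The bandwidth services pass a log file, AVG, a warning pair and a critical pair (incoming, outgoing, in bytes per second) to the MRTG traffic check. The sketch below shows the idea behind such a check. It assumes the standard MRTG log layout, where the first line holds the raw counters and each following line holds a timestamp, average in, average out, maximum in and maximum out. It is a simplified illustration rather than the plugin's code: it ignores the expiry argument (the final 10), and the function name and sample data are invented.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def mrtg_traffic_state(log_text, warn, crit, aggregation="AVG"):
    """Classify the most recent MRTG sample against (in, out) thresholds in bytes/s."""
    lines = log_text.strip().splitlines()
    # Line 0 holds the raw counters; line 1 is the most recent rate sample.
    _timestamp, avg_in, avg_out, max_in, max_out = (
        int(x) for x in lines[1].split()[:5]
    )
    rate_in, rate_out = (max_in, max_out) if aggregation == "MAX" else (avg_in, avg_out)
    if rate_in > crit[0] or rate_out > crit[1]:
        return CRITICAL
    if rate_in > warn[0] or rate_out > warn[1]:
        return WARNING
    return OK


# Invented sample data, for illustration only.
SAMPLE_LOG = """1324314180 123456789 987654321
1324314180 1500000 200000 1600000 250000
1324313880 900000 150000 950000 180000
"""
```

With the thresholds used in the services above, (1000000, 1000000) and (5000000, 5000000), the sample's average incoming rate of 1500000 falls between the warning and critical limits.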


Windows

For the Windows machine we used the following configuration:

###############################################################################

###############################################################################

#

# HOST DEFINITIONS

#

###############################################################################

###############################################################################

# Define a host for the Windows machine we'll be monitoring

# Change the host_name, alias, and address to fit your situation

define host{

use windows-server ; Inherit default values from a template

host_name winserver ; The name we're giving to this host

alias My Windows Server ; A longer name associated with the host

address 172.16.168.128 ; IP address of the host

}

###############################################################################

###############################################################################

#

# HOST GROUP DEFINITIONS

#

###############################################################################

###############################################################################

# Define a hostgroup for Windows machines

# All hosts that use the windows-server template will automatically be a member of this group

define hostgroup{

hostgroup_name windows-servers ; The name of the hostgroup

alias Windows Servers ; Long name of the group

}

###############################################################################

###############################################################################

#

# SERVICE DEFINITIONS

#

###############################################################################

###############################################################################

# Create services for monitoring the Windows machine via SNMP

# Change the host_name to match the name of the host you defined above

define service{

use generic-service

host_name winserver


service_description C: disk space

check_command check_snmp!-C public -o .1.3.6.1.2.1.25.2.3.1.6.2 -w 404844 -c 704844 -u 'x4096_for_bytes'

}

define service{

use generic-service

host_name winserver

service_description CPU load

check_command check_snmp!-C public -o .1.3.6.1.2.1.25.3.3.1.2.1 -u '%' -w 50 -c 70

}

define service{

use generic-service

host_name winserver

service_description Memory Used bytes

check_command check_snmp!-C public -o .1.3.6.1.4.1.9600.1.1.2.4.0 -u 'Bytes'

}

define service{

use generic-service

host_name winserver

service_description Memory Free bytes

check_command check_snmp!-C public -o .1.3.6.1.4.1.9600.1.1.2.1.0 -u 'Bytes'

}

define service{

host_name winserver

service_description TRAP

is_volatile 1

check_command check-host-alive

max_check_attempts 1

normal_check_interval 1

retry_check_interval 1

active_checks_enabled 0

passive_checks_enabled 1

check_period 24x7

notification_interval 1

notification_period 24x7

notification_options u,c,w

notifications_enabled 1

contact_groups admins

}

Although Nagios allows NSClient++ to be used for checking service status, the assignment required SNMP. Since we could not find a value in the default MIB that faithfully represented the RAM in use, we installed an external MIB (****.9600.****), which gave us more accurate values.
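The thresholds of the "C: disk space" service above are not byte counts: hrStorageUsed (.1.3.6.1.2.1.25.2.3.1.6) is reported in allocation units, which is what the 'x4096_for_bytes' unit label hints at. The sketch below converts the two thresholds to bytes, assuming 4096-byte units as that label suggests; the real unit size can be read from hrStorageAllocationUnits (.1.3.6.1.2.1.25.2.3.1.4) for the same index. The function names are illustrative only.

```shell
#!/bin/sh
# Convert a count of storage allocation units into bytes.
# $1 = number of units (an hrStorageUsed value or a threshold)
# $2 = bytes per unit (assumed 4096 here, from the 'x4096_for_bytes' label)
units_to_bytes() {
    echo $(( $1 * $2 ))
}

# Express a byte count in whole MiB (integer division, rounds down).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

warn_bytes=$(units_to_bytes 404844 4096)   # -w threshold from the service
crit_bytes=$(units_to_bytes 704844 4096)   # -c threshold from the service
echo "warning above $warn_bytes bytes (about $(bytes_to_mib "$warn_bytes") MiB)"
echo "critical above $crit_bytes bytes (about $(bytes_to_mib "$crit_bytes") MiB)"
```

Under that assumption, the warning threshold corresponds to roughly 1.5 GiB used and the critical threshold to roughly 2.7 GiB used.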


Linux

The configuration of the Linux machine is very similar; only the MIB OIDs for the various features change (we used UCD-SNMP-MIB). Note that, besides resource monitoring, we also used several commands to check that services are running (a process explained further below).

###############################################################################

###############################################################################

#

# HOST DEFINITIONS

#

###############################################################################

###############################################################################

# Define a host for the Linux machine we'll be monitoring

# Change the host_name, alias, and address to fit your situation

define host{

use linux-server ; Inherit default values from a template

host_name lserver ; The name we're giving to this host

alias My Linux Server ; A longer name associated with the host

address 172.16.168.130 ; IP address of the host

}

###############################################################################

###############################################################################

#

# HOST GROUP DEFINITIONS

#

###############################################################################

###############################################################################

# Define a hostgroup for Linux machines

# All hosts that use the linux-server template will automatically be a member of this group

###############################################################################

###############################################################################

#

# SERVICE DEFINITIONS

#

###############################################################################

###############################################################################

define service{

use generic-service

host_name lserver

service_description disk space used on / percentage

check_command check_snmp!-C public -o .1.3.6.1.4.1.2021.9.1.9.1 -u '%'


}

define service{

use generic-service

host_name lserver

service_description disk space used on / kB

check_command check_snmp!-C public -o .1.3.6.1.4.1.2021.9.1.8.1 -u 'kB'

}

define service{

use generic-service

host_name lserver

service_description disk space free on / kB

check_command check_snmp!-C public -o .1.3.6.1.4.1.2021.9.1.7.1 -u 'kB'

}

define service{

use generic-service

host_name lserver

service_description CPU load

check_command check_snmp!-C public -o .1.3.6.1.4.1.2021.11.10.0 -u '%' -w 50 -c 70

}

define service{

use generic-service

host_name lserver

service_description Total Free Memory kbytes

check_command check_snmp!-C public -o .1.3.6.1.4.1.2021.4.11.0 -u 'kBytes'

}

define service{

use generic-service

host_name lserver

service_description Available Memory kbytes

check_command check_snmp!-C public -o .1.3.6.1.4.1.2021.4.6.0 -u 'kBytes'

}

define service{

use generic-service

host_name lserver

service_description POP sem SSL

check_command check_pop

}

define service{


use generic-service

host_name lserver

service_description POP via SSL

check_command check_pop!-S -p 995

}

define service{

use generic-service

host_name lserver

service_description IMAP sem SSL

check_command check_imap

}

define service{

use generic-service

host_name lserver

service_description IMAP via SSL

check_command check_imap!-S -p 993

}

define service{

use generic-service

host_name lserver

service_description SSH

check_command check_ssh

}

define service{

use generic-service

host_name lserver

service_description SMTP

check_command check_smtp

}

define service{

use generic-service

host_name lserver

service_description HTTP

check_command check_http

}

define service{

use generic-service

host_name lserver

service_description HTTPS

check_command check_http!-S

}
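A note on the "CPU load" service for lserver: in UCD-SNMP-MIB the object at .1.3.6.1.4.1.2021.11.10.0 is ssCpuSystem (percentage of time spent in system mode), so it covers only part of the load. One possible alternative, sketched below, is to read ssCpuIdle (.1.3.6.1.4.1.2021.11.11.0) and derive the busy percentage from it. The reading used here is hypothetical and the function names are illustrative; the comparison mimics the simple "-w 50 -c 70" thresholds, where a value above the threshold raises the corresponding state.

```shell
#!/bin/sh
# Busy CPU percentage derived from the idle percentage (ssCpuIdle).
cpu_busy_pct() {
    echo $(( 100 - $1 ))
}

# Classify a value against simple warning/critical thresholds,
# in the spirit of "-w 50 -c 70": above the threshold means alert.
# $1 = value, $2 = warning threshold, $3 = critical threshold
classify() {
    if [ "$1" -gt "$3" ]; then
        echo "CRITICAL"
    elif [ "$1" -gt "$2" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

idle=35   # hypothetical ssCpuIdle reading, not a measured value
busy=$(cpu_busy_pct "$idle")
echo "CPU busy: ${busy}% -> $(classify "$busy" 50 70)"
```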


The service status checks were first tested from the console using the Nagios plugins:

HTTPS:

./check_http -H 172.16.168.130 -S

SMTP:

./check_smtp -H 172.16.168.130

IMAP:

./check_imap -H 172.16.168.130

IMAPS:

./check_imap -H 172.16.168.130 -S -p993

POP3:

./check_pop -H 172.16.168.130

POP3S:

./check_pop -H 172.16.168.130 -S -p995

SSH:

./check_ssh -H 172.16.168.130

Once tested, these commands were translated into the service definitions that check the respective services.
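When testing plugins from the console as above, the result that Nagios acts on is the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The following is a minimal sketch of a wrapper that runs a check and prints the state name; the helper names are hypothetical, and mapping every other exit code to UNKNOWN is a simplification.

```shell
#!/bin/sh
# Map a plugin exit code to its Nagios state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;   # 3 and anything unexpected
    esac
}

# Run one check and print "<label>: <state>".
# $1 = label; remaining arguments = the command to run.
run_check() {
    label=$1
    shift
    "$@" >/dev/null 2>&1
    rc=$?
    echo "$label: $(state_name "$rc")"
}

# Intended use from the plugin directory (not executed here):
#   run_check SMTP  ./check_smtp -H 172.16.168.130
#   run_check IMAPS ./check_imap -H 172.16.168.130 -S -p 993
```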

Contacts

For the two types of contact used, two groups were created: one for SMS messages and one for regular e-mail.

###############################################################################

###############################################################################

#

# CONTACTS

#

###############################################################################

###############################################################################

define contact{

contact_name nagiosadmin ; Short name of user

use generic-contact ; Inherit default values from generic-contact template (defined above)

alias Nagios Admin ; Full name of user

email [email protected] ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******

}

define contact{

contact_name nagiossms

use generic-sms


alias SMS Warnings

email [email protected]

}

###############################################################################

###############################################################################

#

# CONTACT GROUPS

#

###############################################################################

###############################################################################

define contactgroup{

contactgroup_name admins

alias Nagios Administrators

members nagiosadmin

}

define contactgroup{

contactgroup_name sms

alias SMS-email

members nagiossms

}

Commands

Although we initially had some custom plugins, we ended up using only the default ones, so there is no need to include the complete configuration file here. The only change we made was to create a command for sending SMS messages:

define command{

command_name notify-host-by-sms

command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -r [email protected] -s "918551620" $CONTACTEMAIL$

}

In essence, this command sends an SMS with the host state from the e-mail address of one of us, with the destination phone number passed as the message subject (the -s option). The contact e-mail address was defined in the Contacts section.
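To preview the text produced by the notify-host-by-sms command without waiting for a real notification, the printf part of its command_line can be reproduced in a small function. This is only a sketch: the function name is hypothetical and the argument values stand in for the Nagios macros ($NOTIFICATIONTYPE$, $HOSTNAME$ and so on), which Nagios itself expands at notification time.

```shell
#!/bin/sh
# Build the notification text the same way the command_line does,
# with the Nagios macros replaced by function arguments.
# $1 type, $2 host, $3 state, $4 address, $5 plugin output, $6 date/time
build_sms_body() {
    printf "%b" "***** Nagios *****\n\nNotification Type: $1\nHost: $2\nState: $3\nAddress: $4\nInfo: $5\n\nDate/Time: $6\n"
}

# Hypothetical values standing in for the macros:
build_sms_body "PROBLEM" "winserver" "DOWN" "172.16.168.128" \
    "CRITICAL - Host Unreachable" "Mon Jan 10 12:00:00 WET 2011"
```

In the real command this text is piped to /bin/mail, which takes the sender from -r, the subject from -s and the recipient from $CONTACTEMAIL$.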


Tests

These two figures show that the first requirements of the assignment were met: the tools are reachable at http://gsrtp3.dei.uc.pt/nagios and http://gsrtp3.dei.uc.pt/mrtg, and a previously configured access password is requested.


These three images show the map and the hosts being monitored. The third goes into more detail on all the services of each host.


This image shows an overview of all the monitored services.

This image shows the 3Com switch in detail. Since we did not receive any trap, the trap entry is shown in grey.

Detail of the gtdei.dei.uc.pt router. The critical state on port 5 is caused by the threshold defined in the configuration files, which we then kept in place as a way of testing the service notifications.


This figure shows the state of the Linux server and of all its services. The Status Information column confirms that all the commands are being executed correctly.

This figure shows the Windows server with one service in "warning", a state deliberately triggered through the command thresholds for testing purposes.


The MIB browser allows traps to be sent through an extremely intuitive interface; in this case we sent one of type coldStart.

In this image we see snmptrapd processing the trap and passing it on to snmptt, which recognises it as one of the traps we handle. This generates an event of type EXTERNAL COMMAND, already explained above.

Next, Nagios checks the service and, because of the "Critical" state, sends a notification by e-mail to the admin.
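The EXTERNAL COMMAND event corresponds to a passive check result being handed to Nagios for the TRAP service. A minimal sketch of how such a line can be formatted is shown below, using the PROCESS_SERVICE_CHECK_RESULT external command syntax; the function name, the output text and the command file path are assumptions for illustration, not the configuration actually used with snmptt in this work.

```shell
#!/bin/sh
# Format one passive service check result in the Nagios external command syntax:
#   [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
# $1 timestamp (seconds since epoch), $2 host_name, $3 service_description,
# $4 return code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), $5 plugin output
passive_result_line() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example: report the TRAP service on winserver as CRITICAL.
passive_result_line "$(date +%s)" "winserver" "TRAP" 2 "coldStart trap received"

# To deliver it, the line would be appended to the Nagios command file, e.g.
#   passive_result_line ... >> /usr/local/nagios/var/rw/nagios.cmd
# (path assumed; the actual location is set by command_file in nagios.cfg).
```

Because the TRAP service is defined with is_volatile 1 and max_check_attempts 1, each result submitted this way is treated as a new event and can trigger a notification immediately.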


Finally, we see the coldStart trap displayed on the Nagios screen (a feature intended for the 3Com switch, but tested from the Windows machine because there we could send traps on demand).

Here is an example of the notifications page. Note that services trigger notifications by e-mail only, whereas the state of the hosts is also sent by SMS.


And finally, here are two examples of notifications received: those on the left by SMS and the one on the right by e-mail.
