
IBM Tivoli Agentless Monitoring

Page 1: IBM Tivoli Agentless Monitoring

Sara Moggi

IBM Tivoli Agentless Monitoring Overview

Page 2: IBM Tivoli Agentless Monitoring

© 2010 IBM Corporation

Agenda

Overview

Installation

Configuration

Workspaces Reference

Troubleshooting

– Log Files

– Problem determination

Known Problems/APARS

Q&A

Page 3: IBM Tivoli Agentless Monitoring

Agent and Agentless technology

Agent-based technology resides directly on a managed server

Agentless technology resides primarily on a management server and gets its data via a remote application programming interface (API)

Page 4: IBM Tivoli Agentless Monitoring

Agent and Agentless technology

The IBM Tivoli Monitoring product uses agent and agentless technology.

– Agent technology:

  – Database agents

  – Operating system agents, etc.

– Agentless technology:

  – Custom agent or agentless solution with Universal Agent/Agent Builder

  – ITM for Virtual Servers

  – ITM for Applications (SAP)

  – Operating systems (starting from ITM version 6.2.1)

Page 5: IBM Tivoli Agentless Monitoring

ITM Agentless

IBM Tivoli Monitoring Agentless provides a way to monitor the availability and performance of all the systems in your enterprise from one or several designated workstations

– Agentless Monitoring for Windows Operating Systems (r2)

– Agentless Monitoring for AIX Operating Systems (r3)

– Agentless Monitoring for Linux Operating Systems (r4)

– Agentless Monitoring for HP-UX Operating Systems (r5)

– Agentless Monitoring for Solaris Operating Systems (r6)

Page 6: IBM Tivoli Agentless Monitoring

Key Features (1/2)

It is possible to monitor an IT environment from a small set of workstations.

There is no need to install an agent on each system to be monitored.

Agentless provides a more flexible monitoring solution.

Page 7: IBM Tivoli Agentless Monitoring

Key Features (2/2)

Tivoli Monitoring Agentless can monitor multiple operating system nodes that do not have standard OS agents running on them.

An agentless monitor obtains data from the monitored nodes via:

– SNMP (Simple Network Management Protocol)

– CIM (Common Information Model)

– WMI (Windows Management Instrumentation)

Page 9: IBM Tivoli Agentless Monitoring

Agentless Monitoring Installation Supported Platform

– AIX 5.3 (32/64 bit) or AIX 6.1 (64 bit)

– Solaris 9 or higher

– HP-UX 11i or higher

– Windows:

– Windows 2003 Server SE (32/64 bit)

– Windows Server 2003 Datacenter Edition

– Windows Vista Enterprise, Business and Ultimate (32/64 bit)

– Windows Server 2008 SE (32/64 bit)

– Windows Server 2008 EE (32/64 bit)

– Windows Server 2008 Data Center

– Windows Server 2008 Data Center (64 bit)

– Linux

– Red Hat Enterprise Linux 4 or higher

– SUSE Linux Enterprise Server 9 or higher

Page 10: IBM Tivoli Agentless Monitoring

Agentless Monitoring Installation

When you install from the Agent DVD on a Windows system, select the Agentless features in the following screen:

Page 11: IBM Tivoli Agentless Monitoring

Agentless Monitoring Installation

On a UNIX system, when you run install.sh, select the agentless monitors from the following menu:

Page 13: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

We can perform agentless configuration by using different sources:

– Manage Tivoli Enterprise Monitoring Services (MTEMS)

– itmcmd command

– tacmd command

– TEP GUI: right-click the agentless icon, then click the Configure option

Page 14: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

The agentless monitors can be configured to collect all data from the monitored system using SNMP (SNMP Version 1, SNMP Version 2c or SNMP Version 3).

The Agentless for Solaris provides an additional possibility: CIM

The Agentless for Windows provides an additional possibility: WMI

Page 15: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

For each agentless, as soon as you start the configuration you’ll see the following window:

Page 16: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for AIX

Page 17: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for AIX

Page 18: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for AIX

Page 19: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Solaris: CIM

Page 20: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Solaris: CIM

Page 21: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Solaris: CIM

Page 22: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Windows: WMI

Page 23: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Windows: WMI

Page 24: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Windows: WMI

Page 25: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Agentless Monitoring for Windows: WMI

Page 26: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Relationship between Managed System Details and TEP gui tree

Page 27: IBM Tivoli Agentless Monitoring

Agentless Monitoring Configuration

Relationship between Managed System Details and TEP gui tree

Page 29: IBM Tivoli Agentless Monitoring

Workspaces Reference

Agentless Linux OS

– contains agent instance level workspaces.

SNMP Linux Systems: LNX subnode

– Each node is an individual server.

Page 30: IBM Tivoli Agentless Monitoring

Workspaces Reference

Agentless Navigator Item

– lists the collection status of the managed systems, and lists which systems are being monitored

– Agentless for Windows: two views that list the Windows systems that are monitored through the SNMP and the WMI subnode

– Agentless for Solaris: three views that list the Solaris systems that are monitored through the SNMP subnode (Sun Management Center or System Management Agent) and the CIM subnode.

Page 31: IBM Tivoli Agentless Monitoring

Workspaces Reference

Metrics collected by the Agentless Monitoring:

– Disk utilization

– Physical and Virtual Memory

– Network Interface

– Processes running

– Processor capacity of the system

– System level

Page 33: IBM Tivoli Agentless Monitoring

Log Files

RAS1 Logs

– Windows:

%CANDLE_HOME%\TMAITM6\logs\<hostname>_<pc>_k<pc>agent_<instance>_<timestamp>-nn.log

– Unix/Linux:

$CANDLE_HOME/logs/<hostname>_<pc>_<instance>_<timestamp>-nn.log

– where:

– pc is the product code of the specific agentless monitor (r2 to r6; for example, r4 for Linux)
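The naming rules above can be turned into a glob pattern for locating the logs of one instance. The following is a minimal sketch only: the function names, the sample host and instance names, and the use of `*` for the timestamp and sequence number are assumptions for illustration, not part of the product.

```python
import fnmatch
import ntpath
import posixpath


def ras1_log_pattern(candle_home: str, hostname: str, pc: str,
                     instance: str, windows: bool) -> str:
    """Return a glob pattern for the RAS1 logs of one agentless instance."""
    if windows:
        # <CANDLE_HOME>\TMAITM6\logs\<hostname>_<pc>_k<pc>agent_<instance>_<timestamp>-nn.log
        name = f"{hostname}_{pc}_k{pc}agent_{instance}_*-*.log"
        return ntpath.join(candle_home, "TMAITM6", "logs", name)
    # <CANDLE_HOME>/logs/<hostname>_<pc>_<instance>_<timestamp>-nn.log
    name = f"{hostname}_{pc}_{instance}_*-*.log"
    return posixpath.join(candle_home, "logs", name)


def matches(pattern: str, file_name: str) -> bool:
    """Check a bare file name against the file-name part of a pattern."""
    base = ntpath.basename(pattern)  # ntpath splits on both "/" and "\"
    return fnmatch.fnmatchcase(file_name, base)


if __name__ == "__main__":
    # Hypothetical installation directory, host and instance names.
    print(ras1_log_pattern("/opt/IBM/ITM", "host1", "r4", "snmp", windows=False))
```

The resulting pattern can be passed to `glob.glob` on the machine where the agentless monitor runs.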

Page 34: IBM Tivoli Agentless Monitoring

Trace Levels

Startup/Initialization problems

– ERROR (UNIT:query ALL) Windows

– ERROR (UNIT:ct_main ALL) Unix/Linux

WMI Data Provider

– ERROR (UNIT:WMI ALL)

Perfmon Data Provider

– ERROR (UNIT:QueryClass ALL)

SNMP Data Provider

– ERROR (UNIT:SNMP ALL)

Windows Event Log Data Provider

– ERROR (UNIT:EventLog ALL) (UNIT:WinLog ALL)

CIM-XML Data Provider

– ERROR (UNIT:CIM ALL)
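When several data providers need tracing at once, the unit filters listed above can be combined into a single trace string. The sketch below only assembles that string; the component keys and the function name are hypothetical, and where the string is applied (in standard ITM agents this is usually the KBB_RAS1 trace setting) is an assumption not covered by this slide.

```python
# Trace units per component, copied from the list above.
TRACE_UNITS = {
    "startup_windows": ["query"],
    "startup_unix": ["ct_main"],
    "wmi": ["WMI"],
    "perfmon": ["QueryClass"],
    "snmp": ["SNMP"],
    "eventlog": ["EventLog", "WinLog"],
    "cim": ["CIM"],
}


def trace_setting(components):
    """Build one RAS1 trace string covering all requested components."""
    units = []
    for component in components:
        for unit in TRACE_UNITS[component]:  # KeyError for unknown components
            if unit not in units:            # keep order, drop duplicates
                units.append(unit)
    return " ".join(["ERROR"] + [f"(UNIT:{unit} ALL)" for unit in units])


if __name__ == "__main__":
    print(trace_setting(["snmp", "eventlog"]))
```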

Page 35: IBM Tivoli Agentless Monitoring

Problem Scenario 1

Starting an agentless monitor with the itmcmd command returned the following error message:

KCIIN0201E Specified product is not configured.

The start command must include the -o option, which names the configured instance:

itmcmd agent -o SNMP start r4
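For scripted start and stop, the command line above can be assembled programmatically so that the instance name is never forgotten. This is a small sketch under stated assumptions: the helper name is hypothetical, and only the `start` and `stop` actions are handled.

```python
import shlex


def itmcmd_agent_command(action, product_code, instance=None):
    """Build the argument list for 'itmcmd agent [-o instance] <action> <pc>'."""
    if action not in ("start", "stop"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["itmcmd", "agent"]
    if instance:
        # Multi-instance agents such as the agentless monitors need -o.
        cmd += ["-o", instance]
    cmd += [action, product_code]
    return cmd


if __name__ == "__main__":
    # Prints: itmcmd agent -o SNMP start r4
    print(shlex.join(itmcmd_agent_command("start", "r4", instance="SNMP")))
```

The list form can be handed directly to `subprocess.run` on a system where itmcmd is on the PATH.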

Page 36: IBM Tivoli Agentless Monitoring

Problem Scenario 2

– Agentless Monitoring is configured to use SNMP

– Workspaces are blank

In the agent log you find error messages such as the following:

Check if the snmpd process is running

Page 37: IBM Tivoli Agentless Monitoring

Problem Scenario 3

Agentless monitoring for AIX

– Agentless is configured to collect the data using SNMP data provider

– The only blank workspaces are Disk and Process

By default on AIX 5.x and 6.x, the MIBs managed by the aixmibd daemon are excluded from the default SNMP view.

Modify the /etc/snmpdv3.conf file to comment out the line that excludes access:

# exclude aixmibd managed MIBs from the default view

#VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 -excluded-
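The edit to /etc/snmpdv3.conf can be scripted when many AIX systems need the same change. The sketch below works on the file content as a string and comments out any active VACM_VIEW line that excludes the 1.3.6.1.4.1.2.6.191 subtree; the function name is hypothetical, and the matching rule (first field VACM_VIEW, third field the OID, the word "excluded" on the line) is an assumption about the file layout.

```python
AIXMIBD_OID = "1.3.6.1.4.1.2.6.191"


def comment_out_aixmibd_exclusion(conf_text):
    """Comment out active VACM_VIEW lines that exclude the aixmibd subtree."""
    result = []
    for line in conf_text.splitlines():
        fields = line.split()
        is_exclusion = (
            len(fields) >= 3
            and fields[0] == "VACM_VIEW"   # already-commented lines do not match
            and fields[2] == AIXMIBD_OID
            and "excluded" in line
        )
        result.append("#" + line if is_exclusion else line)
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    sample = (
        "# exclude aixmibd managed MIBs from the default view\n"
        "VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - excluded -\n"
    )
    print(comment_out_aixmibd_exclusion(sample), end="")
```

Running the function twice gives the same result, so it is safe to apply to files that were already fixed by hand.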

Page 38: IBM Tivoli Agentless Monitoring

Problem Scenario 4 (1/2)

Agentless monitoring for Linux or Solaris

– Agentless is configured to collect the data using SNMP data provider

– All the workspaces are blank

– In the agentless trace logs we found messages such as the following:

Page 39: IBM Tivoli Agentless Monitoring

Problem Scenario 4 (2/2)

Check connectivity with the monitored box:

– Make sure you can ping the remote system

– Check for firewalls that block communication on the SNMP port (UDP 161)

– Check the community string and passwords specified in the Agentless configuration

– Check that the SNMP agent on the remote system is not restricting access to localhost (see the snmpd.conf file)

– Run the following command to check the connectivity with the SNMP system:

– snmpwalk -c public -v 1 <hostIP>

– Check the MIB branches are not restricted (see snmpd.conf file)
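The snmpwalk check from the list above can be wrapped in a small script to test many hosts in a row. This is a sketch, assuming the net-snmp `snmpwalk` utility is installed on the machine running it; the function names, the optional OID argument (used to limit the walk to the `system` subtree) and the 10-second limit are illustrative choices.

```python
import shutil
import subprocess

SYSTEM_SUBTREE = "1.3.6.1.2.1.1"  # MIB-II "system" group, small and quick to walk


def snmpwalk_command(host, community="public", version="1", oid=None):
    """Build the snmpwalk argument list for an SNMP v1 or v2c check."""
    if version not in ("1", "2c"):
        raise ValueError("this sketch only covers community-based SNMP (1, 2c)")
    cmd = ["snmpwalk", "-c", community, "-v", version, host]
    if oid:
        cmd.append(oid)
    return cmd


def snmp_reachable(host, community="public", version="1", timeout=10):
    """Return True if snmpwalk returns data for the system subtree."""
    if shutil.which("snmpwalk") is None:
        raise RuntimeError("snmpwalk not found on PATH")
    cmd = snmpwalk_command(host, community, version, SYSTEM_SUBTREE)
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and bool(result.stdout.strip())
```

A `False` result does not say which item on the checklist failed; it only tells you that the remaining checks (firewall, community string, access restrictions) are worth doing for that host.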

Page 40: IBM Tivoli Agentless Monitoring

Problem Scenario 5 (1/2)

Agentless Monitoring for AIX

– Agentless is configured to collect the data using SNMP data provider

– Some workspaces are blank

On AIX, the SNMP service is made up of four processes, each feeding different workspaces:

– snmpd: System workspaces

– aixmibd: Disk and File System Capacity, Volume Group, Logical Volume, Physical Volume, Page System, Process Availability, and User Account Information workspaces

– hostmibd: Memory and Processor workspaces

– snmpmibd: Network workspace
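The mapping between processes and workspaces makes it possible to go from "which workspaces are blank" to "which daemon to check first". The sketch below encodes the list above as a lookup table; the exact workspace names (in particular how the aixmibd list is split into individual names) are taken from the slide text and should be treated as assumptions.

```python
# Daemon -> workspaces it feeds, as listed on the slide.
AIX_SNMP_DAEMONS = {
    "snmpd": {"System"},
    "aixmibd": {
        "Disk and File System Capacity", "Volume Group", "Logical Volume",
        "Physical Volume", "Page System", "Process Availability",
        "User Account Information",
    },
    "hostmibd": {"Memory", "Processor"},
    "snmpmibd": {"Network"},
}


def suspect_daemons(blank_workspaces):
    """Return the daemons that feed at least one of the blank workspaces."""
    blank = set(blank_workspaces)
    return sorted(
        daemon for daemon, workspaces in AIX_SNMP_DAEMONS.items()
        if workspaces & blank
    )


if __name__ == "__main__":
    print(suspect_daemons(["Memory", "Volume Group"]))
```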

Page 41: IBM Tivoli Agentless Monitoring

Problem Scenario 5 (2/2)

Check that all four SNMP processes are running.

If the community string is not public, verify that the three SNMP processes aixmibd, hostmibd and snmpmibd are started with the -c <community> command line option.

Page 42: IBM Tivoli Agentless Monitoring

Problem Scenario 6

Agentless Monitoring for Linux

– Agentless is configured to collect the data using SNMP data provider

– In the TEP GUI, only the Network and System workspaces show data

For Red Hat operating systems, the /etc/snmpd.conf must be modified to allow the Host Resources MIB and ucdavis MIB to be viewed by all users.

Add the following system views to the SNMP configuration:

– view systemview included .1.3.6.1.2.1.25

– view systemview included .1.3.6.1.4.1.2021
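Adding the two view lines can be automated so that the change is applied only where it is missing. The following sketch operates on the configuration text as a string; the function name is hypothetical, and comparing lines after collapsing whitespace is an assumption about how the file may be formatted.

```python
REQUIRED_VIEWS = [
    "view systemview included .1.3.6.1.2.1.25",    # Host Resources MIB
    "view systemview included .1.3.6.1.4.1.2021",  # ucdavis MIB
]


def add_missing_views(conf_text):
    """Append the required view lines that are not already present."""
    # Normalise whitespace so "view   systemview ..." is recognised too.
    existing = {" ".join(line.split()) for line in conf_text.splitlines()}
    missing = [view for view in REQUIRED_VIEWS if view not in existing]
    if not missing:
        return conf_text
    separator = "" if conf_text == "" or conf_text.endswith("\n") else "\n"
    return conf_text + separator + "\n".join(missing) + "\n"


if __name__ == "__main__":
    print(add_missing_views("view systemview included .1.3.6.1.2.1.1\n"), end="")
```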

Page 43: IBM Tivoli Agentless Monitoring

Problem Scenario 7

Agentless Monitoring for Windows

(48E294C2.00011D28:queryclass.cpp,803,"internalCollectData") Authentication failed against host <host> as user <Domain>\<User>, return code = 1326

(48E294C3.000016EC:wmiqueryclass.cpp,757,"internalCollectData") ::collectData==>Could not connect. Error code = 0x80070005, subnode = <name>

These errors indicate an invalid password, invalid username, or username without Administrators group membership.
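The two codes in these messages are standard Windows values: 1326 is the Win32 error ERROR_LOGON_FAILURE and 0x80070005 is the HRESULT E_ACCESSDENIED. A short sketch for pulling such codes out of trace lines follows; the regular expression and function name are assumptions based only on the two message formats shown above.

```python
import re

KNOWN_CODES = {
    1326: "ERROR_LOGON_FAILURE: unknown user name or bad password",
    0x80070005: "E_ACCESSDENIED: access denied",
}

# Matches "return code = 1326" and "Error code = 0x80070005".
CODE_RE = re.compile(r"(?:return code|Error code)\s*=\s*(0x[0-9A-Fa-f]+|\d+)")


def explain(log_line):
    """Return a description of the error code in a trace line, if any."""
    match = CODE_RE.search(log_line)
    if match is None:
        return None
    text = match.group(1)
    code = int(text, 16) if text.lower().startswith("0x") else int(text)
    return KNOWN_CODES.get(code, f"unrecognized code {text}")


if __name__ == "__main__":
    line = ('(48E294C3.000016EC:wmiqueryclass.cpp,757,"internalCollectData") '
            '::collectData==>Could not connect. Error code = 0x80070005, '
            'subnode = <name>')
    print(explain(line))
```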

Page 44: IBM Tivoli Agentless Monitoring

Problem Scenario 8

Agentless Monitoring for Windows

(4891C694.0066-1558:queryclass.cpp,1006,"start") Error adding query for class PhysicalDisk.

(4891C694.0067-1558:queryclass.cpp,1007,"start") \\<hostname box>\PhysicalDisk(*)\% Disk Write Time - add returned C0000BB8

Check if the counter exists

Check if the Remote Registry service is enabled.

Check if the counter indexes are corrupted. You need to check in HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Perflib\009

Page 46: IBM Tivoli Agentless Monitoring

Known Problems/APARS 1 (1/2)

IZ80454: CPU metrics for Solaris SMA should be per interval

Recreation Steps:

– Install the Agentless Monitoring for Solaris (KR6) product.

– Configure it to use "SNMP (System Management Agent)" or SMA.

– In Tivoli Enterprise Portal (TEP), click on the Processor workspace.

– In "Overall CPU Utilization over Time", select to have data viewed in table format.

Symptom: for each polling interval the values of the following attributes increase over time:

– User CPU, System CPU, Nice CPU, Idle CPU

– Total CPU, CPU Used Pct, CPU Idle Pct

Page 47: IBM Tivoli Agentless Monitoring

Known Problems/APARS 1 (2/2)

The following CPU metrics are collected via SNMP and the values are cumulative since the machine was started:

– User CPU, System CPU, Nice CPU, Idle CPU

These values are used in the calculations of the following attributes:

– Total CPU, CPU Used Pct, CPU Idle Pct

The CPU metrics were changed to represent the values since the last polling interval instead of a cumulative value since the machine was last rebooted.

Fixed in 6.2.2-TIV-ITM-FP0003
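The difference between cumulative and per-interval values can be illustrated with two consecutive samples of the raw counters. This is only a sketch of the arithmetic: the formulas for Total CPU, CPU Used Pct and CPU Idle Pct (total as the sum of the four counters, used as everything except idle) are assumptions, not the agent's actual implementation, and counter wrap-around is ignored.

```python
COUNTERS = ("user", "system", "nice", "idle")


def interval_cpu(previous, current):
    """Convert two cumulative samples into per-interval CPU values."""
    delta = {name: current[name] - previous[name] for name in COUNTERS}
    total = sum(delta.values())
    if total <= 0:
        return None  # no ticks elapsed, or counters were reset by a reboot
    used = total - delta["idle"]
    return {
        **delta,
        "total": total,
        "used_pct": 100.0 * used / total,
        "idle_pct": 100.0 * delta["idle"] / total,
    }


if __name__ == "__main__":
    # Hypothetical counter values since boot at two polling times.
    first = {"user": 1000, "system": 500, "nice": 0, "idle": 8500}
    second = {"user": 1030, "system": 510, "nice": 0, "idle": 8560}
    print(interval_cpu(first, second))
```

With cumulative values, the percentages would drift toward the long-term average since boot; with deltas, each polling interval reflects only the most recent activity.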

Page 48: IBM Tivoli Agentless Monitoring

Known Problems/APARS 2

IZ71871: Add ability to monitor services that are down

The following functions have been added to the agent:

– a new query;

– a new workspace named Windows Services under the System navigator item;

– a new attribute group with several attributes that will provide information about the services. For example: Display Name, Description, Process ID, Status, State.

Fixed in 6.2.1-TIV-ITM-FP0002 and 6.2.2-TIV-ITM-FP0002

Page 49: IBM Tivoli Agentless Monitoring

Known Problems/APARS 3 (1/2)

IZ77565: Perfmon data stops collecting when other systems are down

When one or more remote hosts go down, the calls to perfmon for other remote hosts (which are up) fail with a timeout error:

(4BCF113E.0000-400:queryclass.cpp,1001,"internalCollectData") Error collecting query data for class Terminal Services host <host>. Error is 00000102.

(4BCF113E.0001-744:queryclass.cpp,1001,"internalCollectData") Error collecting query data for class Terminal Services host <host> Error is 00000102.

Page 50: IBM Tivoli Agentless Monitoring

Known Problems/APARS 3 (2/2)

The Microsoft code is single-threaded in the API call the agent calls. When a system is down, the call to collect the perfmon data will time-out for the system that is down. All other requests queued up at the time (for the same system or other systems) will also time-out.

The agent code was updated to ensure the call to the Microsoft API is also single-threaded.

Fixed in 6.2.1-TIV-ITM-FP0003

Page 51: IBM Tivoli Agentless Monitoring

Known Problems/APARS - Missing Operator

Example: R2 Agentless running on a Linux system and monitoring a remote Windows system

<instance>:<hostname>:R2

– R2:<WindowsHost>:WIN

Page 52: IBM Tivoli Agentless Monitoring

Known Problems/APARS - Missing Operator

Define a situation that monitors if a process is MISSING on the monitored Windows box

When the situation becomes true, the alert is raised on the <instance>:<hostname>:R2 node.

Enhancement Requests:

– MR1126096811

– MR0109086845

– MR0204095945

DCF http://www-01.ibm.com/support/docview.wss?uid=swg21420788

Page 53: IBM Tivoli Agentless Monitoring

Known Problems/APARS - Missing Operator

Workaround: Agent Builder solution

Agent that remotely monitors one machine

Agent built with the multi-instance support

Agent to monitor these metrics on Windows:

– Computer System including Model and Serial Number, Operating System

– Windows Event Log

– Disk Usage including Logical Disk, Physical Disk

– Processor

– Memory including Physical Memory, Page File Usage

– Network Interfaces

– Windows Terminal Services

Also available for AIX, HP-UX, Solaris and Linux platforms

Page 54: IBM Tivoli Agentless Monitoring

Known Problems/APARS - Missing Operator

In the Agent package you can find the following scripts:

– installIra.bat/sh (installs all components on a single machine)

– installIraAgent.bat/sh (installs the Agent)

– installIraAgentTEMS.bat/sh (installs the TEMS application support)

– installIraAgentTEPS.bat/sh (installs the TEPS application support)

Page 55: IBM Tivoli Agentless Monitoring

Known Problems/APARS - Missing Operator

Page 56: IBM Tivoli Agentless Monitoring

Agentless Monitoring Scale Information

Agentless Monitors are multi-instance agents

Support for up to 10 active instances on a single system

Each instance supports communication with 100 remote nodes

– 10 instances x 100 remote nodes = 1000 monitored systems
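The scale limits above translate into a simple sizing rule for one monitoring system. The sketch below computes how many instances a given number of remote nodes requires; the function name is hypothetical, and it assumes nodes are packed evenly at 100 per instance.

```python
import math

MAX_INSTANCES = 10        # active instances per system
NODES_PER_INSTANCE = 100  # remote nodes per instance


def instances_needed(remote_nodes):
    """Return the number of instances needed on one system."""
    if remote_nodes < 0:
        raise ValueError("remote_nodes must not be negative")
    needed = math.ceil(remote_nodes / NODES_PER_INSTANCE)
    if needed > MAX_INSTANCES:
        raise ValueError(
            f"{remote_nodes} nodes exceed the limit of "
            f"{MAX_INSTANCES * NODES_PER_INSTANCE} per system"
        )
    return needed


if __name__ == "__main__":
    print(instances_needed(250))
```

For example, 250 remote nodes need 3 instances, and anything above 1000 nodes requires a second monitoring system.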

Page 57: IBM Tivoli Agentless Monitoring

Performance Variables

Variable name (default value): description

– CDP_DP_CACHE_TTL (60): Time in seconds before a query will trigger a new data collection

– CDP_DP_THREAD_POOL_SIZE (60): The number of threads created to perform background data collections. The thread pool is shared among all attribute groups in all remote nodes in an agent.

– CDP_DP_REFRESH_INTERVAL (60): The interval in seconds at which each attribute group cache is updated in the background

– CDP_DP_IMPATIENT_COLLECTOR_TIMEOUT (2): The number of seconds to wait for a data collection to happen before timing out and returning cached data

– CDP_SNMP_RESPONSE_TIMEOUT (2): The number of seconds to wait for each request to time out. Each row in an attribute group is a separate request.

– CDP_SNMP_MAX_RETRIES (2): The number of times to retry sending the SNMP request after a response timeout

– CDP_NT_EVENT_LOG_GET_ALL_ENTRIES_FIRST_TIME (NO): Configures whether the Windows Event Log data provider should report old log entries on startup, or only new ones

– CDP_NT_EVENT_LOG_CACHE_TIMEOUT (3600): Cache lifetime in seconds of an event from the Windows Event Log

– CDP_PURE_EVENT_CACHE_SIZE (100): Number of pure events held in cache at any one time. When a query is made, all events in the cache at that time are reported. When the cache is full, the oldest events are removed to make room for new ones.
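When tuning, it helps to keep the defaults in one place and write out only the values that differ. The sketch below does that and also estimates the longest wait for one SNMP request; the helper names are hypothetical, the KEY=VALUE output format is an assumption (where such lines are placed depends on the installation), and the worst-case formula assumes every retry waits for the full response timeout.

```python
DEFAULTS = {
    "CDP_DP_CACHE_TTL": "60",
    "CDP_DP_THREAD_POOL_SIZE": "60",
    "CDP_DP_REFRESH_INTERVAL": "60",
    "CDP_DP_IMPATIENT_COLLECTOR_TIMEOUT": "2",
    "CDP_SNMP_RESPONSE_TIMEOUT": "2",
    "CDP_SNMP_MAX_RETRIES": "2",
    "CDP_NT_EVENT_LOG_GET_ALL_ENTRIES_FIRST_TIME": "NO",
    "CDP_NT_EVENT_LOG_CACHE_TIMEOUT": "3600",
    "CDP_PURE_EVENT_CACHE_SIZE": "100",
}


def render_overrides(overrides):
    """Return KEY=VALUE lines for settings that differ from the defaults."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise KeyError(f"unknown variables: {unknown}")
    return [
        f"{name}={value}"
        for name, value in overrides.items()
        if str(value) != DEFAULTS[name]
    ]


def worst_case_snmp_wait(timeout=2, retries=2):
    """Seconds spent on one request if the first try and all retries time out."""
    return timeout * (1 + retries)


if __name__ == "__main__":
    print(render_overrides({"CDP_SNMP_MAX_RETRIES": 3, "CDP_DP_CACHE_TTL": 60}))
    print(worst_case_snmp_wait())
```

Because each row in an attribute group is a separate request, an unreachable node can hold a collection thread for several multiples of this value, which is worth keeping in mind when raising the timeout or retry count.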

Page 58: IBM Tivoli Agentless Monitoring

QUESTIONS