InfiniBand diagnostics tools
HPC Advisory Council Switzerland Workshop, March 21-23, 2011
Erez Cohen - Sr. Director of Field Engineering

© 2011 MELLANOX TECHNOLOGIES - MELLANOX CONFIDENTIAL - 2

OFED Tools

IBDIAG and other OFA tools

Single Node: ibv_devinfo, ibstat, ibportstate, ibroute, smpquery, perfquery

SRC/DST Pair: ibdiagpath, ibtracert, ibv_rc_pingpong, ibv_srq_pingpong, ibv_ud_pingpong, ib_send_bw, ib_write_bw

Network: ibdiagnet, ibnetdiscover, ibhosts, ibswitches, saquery, sminfo, smpdump

HCA Device information

ibstat
• Displays basic information obtained from the local IB driver.
• Normal output includes firmware version, GUIDs, LID, SMLID, port state, link width active, and port physical state.
• Has options to list CAs and/or ports.

ibv_devinfo
• Reports similar information to ibstat.
• Also includes PSID and an extended verbose mode (-v).

/sys/class/infiniband
• File system which reports driver and other ULP information.
  - e.g. [root@ibd001 /]# cat /sys/class/infiniband/mlx4_0/board_id
    MT_04A0110002
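The sysfs lookup shown for board_id can also be scripted when several hosts need to be inventoried. A minimal Python sketch, assuming a Linux host with the IB driver loaded; the helper names `list_hcas` and `read_hca_attr` and the `sysfs_root` parameter are illustrative and not part of OFED:

```python
from pathlib import Path

SYSFS_ROOT = "/sys/class/infiniband"  # path shown on the slide


def list_hcas(sysfs_root=SYSFS_ROOT):
    """Return the device names found under the sysfs root (e.g. 'mlx4_0')."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return []  # no IB driver loaded, or not a Linux host
    return sorted(entry.name for entry in root.iterdir())


def read_hca_attr(device, attr, sysfs_root=SYSFS_ROOT):
    """Return one attribute file's contents without the trailing newline, or None."""
    try:
        return (Path(sysfs_root) / device / attr).read_text().strip()
    except OSError:
        return None  # attribute missing or unreadable


if __name__ == "__main__":
    # Print the board_id of every HCA, mirroring the 'cat' example.
    for device in list_hcas():
        print(device, read_hca_attr(device, "board_id"))
```

Passing a different `sysfs_root` makes the helpers usable against a copied directory tree, which is convenient when the data was collected from a remote node.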

Node management utilities

perfquery
• Obtains and/or clears the basic performance and error counters from the specified node.
• Can be used to check port counters of any port in the cluster using 'perfquery <lid> <port number>'.

ibportstate
• Queries or changes the state (e.g. disable) or the speed of a port.
  - ibportstate 38 1 query

ibroute
• Dumps routes within a switch.

smpquery
• Dumps SMP query parameters, including:
  - nodeinfo, nodedesc, switchinfo, pkeys, sl2vl, vlarb, guids

Cluster utilities

ibswitches
• Lists all switches in cluster

ibhosts
• Lists all HCAs in cluster

ibtracert
• Shows the path between two LIDs (or between two port GUIDs when -G is given, as here)
  - [root@ibd001 mft-2.5.0]# ibtracert -G 0x0002c90300001481 0x0002c90300001489
    From ca {0x0002c90300001480} portnum 1 lid 12-12 "ibd017 HCA-1"
    [1] -> switch port {0x000b8cffff002772}[5] lid 39-39 "MT47396 Infiniscale-III Mellanox Technologies"
    [6] -> ca port {0x0002c90300001489}[1] lid 15-15 "ibd012 HCA-1"
    To ca {0x0002c90300001488} portnum 1 lid 15-15 "ibd012 HCA-1"
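When a traced path needs post-processing (for example, to list every switch between two hosts), the hop lines of the ibtracert output can be parsed. A minimal Python sketch, assuming the output keeps the layout of the example on this slide; the `parse_hops` name and the regular expression are illustrative and not part of the OFED tools:

```python
import re

# One hop line looks like:
# [1] -> switch port {0x000b8cffff002772}[5] lid 39-39 "MT47396 ..."
HOP_RE = re.compile(
    r'^\[(?P<out_port>\d+)\] -> (?P<kind>\S+) port '
    r'\{(?P<guid>0x[0-9a-fA-F]+)\}\[(?P<in_port>\d+)\] '
    r'lid (?P<lid>\d+)-\d+ "(?P<name>[^"]*)"'
)


def parse_hops(text):
    """Return one dict per hop line; 'From' and 'To' lines are skipped."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line.strip())
        if match:
            hops.append({
                "out_port": int(match["out_port"]),  # port the packet leaves from
                "kind": match["kind"],               # 'switch' or 'ca'
                "guid": match["guid"],
                "in_port": int(match["in_port"]),    # port the packet arrives on
                "lid": int(match["lid"]),
                "name": match["name"],
            })
    return hops


# Output copied from the slide.
SAMPLE = '''From ca {0x0002c90300001480} portnum 1 lid 12-12 "ibd017 HCA-1"
[1] -> switch port {0x000b8cffff002772}[5] lid 39-39 "MT47396 Infiniscale-III Mellanox Technologies"
[6] -> ca port {0x0002c90300001489}[1] lid 15-15 "ibd012 HCA-1"
To ca {0x0002c90300001488} portnum 1 lid 15-15 "ibd012 HCA-1"'''

if __name__ == "__main__":
    for hop in parse_hops(SAMPLE):
        print(hop["kind"], hop["lid"], hop["name"])
```

Other versions of the tool may format hop lines differently, so the pattern should be checked against real output before relying on it.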

Cluster utilities - ibdiagnet / ibdiagpath

Integrated diagnostic tools
• Queries cluster topology and indicates any port errors, link width, or link speed mismatch.
• Automates calls to many “low level” operations

Easy to use
• Similar flags, logs and reports for both tools
• Report using meaningful names when topology file is provided

ibdiagnet - Optional flags

-i <dev-index> -p <port-num>
• Device index (0..N) and port number connected to the network

-o <out-dir>
• Directory to output the reports to

-lw <1x|4x|12x> -ls <2.5|5|10>
• Link width and link speed checked on every port on the network

-pm -pc
• Perform an extensive error counter check or clear the counters, respectively

-r
• Extensive additional checks performed.

-P
• Sets threshold for error levels. Also checks for errors of counters based on the absolute value of the error counter. When not using the -P flag, error thresholds are only triggered based on how many errors were incremented DURING the ibdiagnet run.

-c
• Packets to be sent on each link for error level checking

-h -v -V
• Help, Verbosity and Revision flags respectively
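The -P description distinguishes two ways of judging a counter: by how much it grew during the run, or by its absolute value. A minimal Python sketch of that difference, assuming "exceeds" means strictly greater than the threshold; the `exceeded` function and the counter values are hypothetical and do not reproduce ibdiagnet's internal logic:

```python
def exceeded(before, after, threshold, absolute=False):
    """Return the names of counters that cross the threshold.

    absolute=False: compare the increase during the run (after - before).
    absolute=True:  compare the final value itself, as described for -P.
    """
    flagged = []
    for name, final in after.items():
        value = final if absolute else final - before.get(name, 0)
        if value > threshold:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical counter snapshots taken at the start and end of a run.
    before = {"SymbolErrors": 40, "LinkDowned": 0}
    after = {"SymbolErrors": 41, "LinkDowned": 0}
    print(exceeded(before, after, threshold=1))                 # delta view
    print(exceeded(before, after, threshold=1, absolute=True))  # absolute view
```

In this example a port that accumulated errors long ago but is quiet now is only reported in the absolute view, which is why clearing counters before a test run matters.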

ibdiagnet usage

ibdiagnet is particularly useful in finding misconfigured links (speed/width), topology mismatches, and marginal link/cable issues.

Typical usage:
• Clear all port counters using 'ibdiagnet -pc'
• Stress the cluster
• Check cluster using 'ibdiagnet -lw 4x -ls 5 -P all=1'
  - Checks for link speed, link width, and port error counters greater than 1
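The clear, stress, check sequence can be written down as a small command plan. A minimal Python sketch that only builds and prints the command lines, using the flags named on the slides; the function names and the `./my_stress_test` placeholder are hypothetical, and nothing is executed:

```python
import shlex


def clear_counters_cmd():
    """Command that clears all port counters (-pc)."""
    return ["ibdiagnet", "-pc"]


def check_cmd(width="4x", speed="5", threshold=1):
    """Command that checks link width, link speed and error counters."""
    return ["ibdiagnet", "-lw", width, "-ls", speed, "-P", f"all={threshold}"]


def plan(stress_cmd):
    """Return the three steps in order: clear, stress, check."""
    return [clear_counters_cmd(), list(stress_cmd), check_cmd()]


if __name__ == "__main__":
    # './my_stress_test' stands in for whatever workload stresses the cluster.
    for step in plan(["./my_stress_test"]):
        print(shlex.join(step))
```

Keeping each command as a list of arguments avoids quoting problems if the plan is later handed to `subprocess.run`.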

Cluster utilities - ibnetdiscover

Reports a complete topology of the cluster

Shows all interconnect connections, reporting:
• Port LIDs
• Port GUIDs
• Host names
• Link Speed

A GUID-to-name file can be used for a more readable topology with regard to switch devices

Simple usage is: ibnetdiscover --node-name-map <guid to name file>
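A GUID-to-name file can be generated from an inventory rather than typed by hand. A minimal Python sketch, assuming the map file uses one `<guid> "<name>"` entry per line with `#` comments, which is how the infiniband-diags documentation describes it as far as I know; the `format_node_name_map` helper and the name "leaf-switch-01" are hypothetical, and the GUID is reused from the ibtracert example purely for illustration:

```python
def format_node_name_map(names):
    """Render {guid_as_int: name} as node-name-map text, sorted by GUID."""
    lines = ["# GUID-to-name map generated from inventory"]
    for guid, name in sorted(names.items()):
        # 0x prefix plus 16 zero-padded hex digits, e.g. 0x000b8cffff002772
        lines.append(f'{guid:#018x} "{name}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical switch name; a real map should use the node GUIDs
    # that ibnetdiscover reports for the switches.
    print(format_node_name_map({0x000B8CFFFF002772: "leaf-switch-01"}), end="")
```

The resulting text would be saved to a file and passed as the `<guid to name file>` argument.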

Error counter review

SymbolErrors
• Total number of minor link errors. Usually an 8b/10b error due to a bit error.

LinkRecovers
• Total number of times the Port Training state machine has successfully completed the link error recovery process.

LinkDowned
• Total number of times the Port Training state machine has failed the link error recovery process and downed the link.

RcvErrors
• Total number of packets containing an error that were received on the port. Usually due to a CRC error caused by a bit error within the packet.

RcvSwRelayErrors
• Total number of packets received on the port that were discarded because they could not be forwarded by the switch relay. This counter should typically be ignored since Anafa-II has a bug that counts these when it gets a multicast packet on a port where that port also belongs to the multicast group of the packet.

XmtDiscards
• Total number of outbound packets discarded by the port because the port is down or congested. Usually due to the output port HOQ (head-of-queue) lifetime being exceeded.

VL15Dropped
• Number of incoming VL15 packets dropped due to resource limitations (e.g., lack of buffers) in the port.

XmtData, RcvData
• Total number of 32-bit data words transmitted and received on the port.

XmtPkts, RcvPkts
• Total number of data packets transmitted and received on the port.
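Because XmtData and RcvData count 32-bit words rather than bytes, a raw counter value has to be multiplied by 4 before it is compared with bandwidth figures. A minimal Python sketch of that conversion; the function names and the sample values are hypothetical:

```python
WORD_BYTES = 4  # XmtData/RcvData are counted in 32-bit words


def words_to_bytes(words):
    """Convert a data counter value from 32-bit words to bytes."""
    return words * WORD_BYTES


def throughput_mb_per_s(words_start, words_end, seconds):
    """Average throughput in MB/s (10^6 bytes) between two counter samples."""
    return words_to_bytes(words_end - words_start) / seconds / 1e6


if __name__ == "__main__":
    # Hypothetical samples: 250 million words transferred in 2 seconds.
    print(throughput_mb_per_s(0, 250_000_000, 2.0))
```

The calculation assumes the counter did not reach its maximum value or get cleared between the two samples.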

Performance tests

Run performance tests
• /usr/bin/ib_write_bw
• /usr/bin/ib_write_lat
• /usr/bin/ib_read_bw
• /usr/bin/ib_read_lat
• /usr/bin/ib_send_bw
• /usr/bin/ib_send_lat

Usage
• Server: <test name> <options>
• Client: <test name> <options> <server IP address>

Note: Same options must be passed to both server and client. Use -h for all options.
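Since the server and client must be started with identical options, it helps to derive both command lines from a single option list. A minimal Python sketch that only builds the two commands; `perftest_commands` is a hypothetical helper, the address 192.0.2.10 is a documentation placeholder, and `-d mlx4_0` is used on the assumption that `-d` selects the IB device (check the tool's -h output):

```python
def perftest_commands(test_name, options, server_ip):
    """Return (server_cmd, client_cmd) sharing exactly the same options."""
    opts = list(options)
    server_cmd = [test_name, *opts]             # Server: <test name> <options>
    client_cmd = [test_name, *opts, server_ip]  # Client adds the server address
    return server_cmd, client_cmd


if __name__ == "__main__":
    server, client = perftest_commands("ib_write_bw", ["-d", "mlx4_0"], "192.0.2.10")
    print("server:", " ".join(server))
    print("client:", " ".join(client))
```

Building both commands from one list removes the most common mistake with these tests, a mismatched option on one side.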

UFM - Unified Fabric Management

Today's HPC Fabric Challenges

Operational
• Size & complexity
• Separate systems
• Manual, error-prone change management

Performance
• Unnoticed performance degradation
• Application based class of service
• Multitenancy - affecting each other

Troubleshooting
• Undetected issues, unutilized fabric
• Troubleshooting takes long
• Separate systems

UFM Essence

Provides Deep Visibility
• Real-time and historical monitoring of fabric health and performance
• Central fabric dashboard
• Unique fabric-wide congestion map

Optimizes performance
• Quality of Service
• Traffic Aware Routing Algorithm (TARA)
• Multicast routing optimization

Eliminates Complexity
• One pane of glass to monitor and configure fabrics of thousands of nodes
• Enable advanced features like segmentation and QoS by automating provisioning
• Abstract the physical layer into user friendly entities such as jobs and resource groups

Maximizes Fabric Utilization
• Threshold based alerts to quickly identify issues
• Performance optimization for maximum link utilization
• Master-standby HA architecture synchronized in real-time

Open system

Extensible architecture based on Web-services
• Open API for users or 3rd party extensions
• Expose entire fabric and datacenter object model
• API Documentation and example tools

Provides enhanced functionality in various areas
• Group/batch device management tasks
• Enhanced functionality (e.g. e-mail event notifications)
• Export information to external portals to view system information

Integrated with Job Schedulers
• Adaptive Computing: Moab
• Platform Computing: LSF
• Altair: PBS Pro

Features Detailed Overview

Dashboard Tab

View Tab

View Tab - Internal Structure & Properties

Internal structure

Properties

Common Tasks

Manage Devices Tab

Lists all the physical hardware components for the selected site:
• Server or switch

The information is displayed in tabular form and includes the following device information types:
• State, ID, name, IP address, vendor, CPU type, RAM
• Rack it belongs to, FW version, temperature
• Agent, logical server it belongs to

Design Window

Advanced Monitoring and Analysis

Monitor & analyze fabric performance
• B/W utilization
• Unique congestion monitoring
• Dashboard for aggregated fabric view

Real-time fabric-wide health monitoring
• Monitor events and errors throughout the fabric
• Threshold based alarms
• Granular monitoring of host and switch parameters

Innovative congestion mapping
• One view for fabric-wide congestion and traffic patterns
• Enables root cause analysis for routing, job placement or resource allocation inefficiencies

Unique Monitoring Engine

Sessions per logical groups - no need to know physical nodes

Multiple sessions

On demand

Correlate switch and host information

Various graphs (linear, bar, histogram, pie…)

Keep Historical Data

From 1 Min to 1 Month

Formulas (AVG, Max, Min, Sum)

UFM’s Unique Traffic & Congestion Map

Traffic pattern and overall fabric condition

Identify multi-to-one scenarios

…or non-optimized routing

…or slow receivers

…or non-optimized links…

Saves many hours of troubleshooting

Innovative b/w and congestion representation that provides fabric health at a glance in a most effective way

Granular Fabric Control

Active tables – sortable, searchable and ‘filterable’

Real-time monitoring of port health and performance counters

Alarms per device view

Automate device management tasks

QDR CCM Aggregation from switches

All connected to the logical model

Event Management

Dozens of traffic and health events

Easy central drill-down to counters, alerts and events to the port level

Configurable thresholds and criticality levels

Alerts correlated to the application level

SNMP Traps to 3rd party systems

Script based action

Performance Optimization Toolbox

Quality of Service

Application isolation

Collective offload RDMA messaging bus

Congestion Control

Traffic Aware Routing Algorithm

isolation

Quality of Service Optimization

UFM Enables Isolation and QoS Optimizations

Traffic Aware Routing Algorithm (TARA)

A unique new routing algorithm on top of OpenSM
• TARA optimizes the routing according to topology, jobs and traffic direction

TARA provides the following benefits
• Reduces competition between fabric resources, thus decreasing congestion
• Increases available bandwidth, resulting in improved fabric utilization
• Delivers lower latency and shorter application runtime

Customer case
• TARA improved performance by up to 300% (up to 4 times more b/w available for the application).
• The average improvement achieved was 100% (available bandwidth doubled on average).
• Improvement magnitude is a factor of traffic patterns and available links

TARA increased b/w available for the application 4 times
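The percentages and multipliers quoted for the customer case describe the same change in two ways: an improvement of p percent corresponds to a multiplier of 1 + p/100. A minimal Python sketch of that arithmetic; the function name is illustrative:

```python
def improvement_to_multiplier(percent):
    """Convert an improvement in percent to a bandwidth multiplier."""
    return 1 + percent / 100


if __name__ == "__main__":
    print(improvement_to_multiplier(300))  # the "up to 300%" case
    print(improvement_to_multiplier(100))  # the average case
```

So a 300% improvement means four times the original bandwidth, and a 100% improvement means double.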

UFM TARA Improves Fabric Utilization

Figure panels: NO UFM TARA / UFM TARA is ON

Integration with Job Schedulers

Automatic fabric provisioning per job

QoS and TARA performance optimization

Job oriented monitoring and events

Supported Schedulers:
• Moab (Adaptive Computing)
• LSF (Platform Computing)
• PBS Pro (Altair)

The first integrated solution that correlates fabric management and workload management for dynamic data centers

UFM in HPC Cluster

Workload Submitted in Workload Manager

Matching workloads Automatically Created in UFM

Application Level Monitoring & Optimization Measurements

Fabric-wide Policy Pushed to Match Application Requirements

Scaling Out

Large clusters pose management challenges
• Topology map is overloaded with devices and becomes inefficient for fabric analysis
• Slow discovery and updates

Optimizations made:
• Load physical map only on demand
• User experience: the user sees the correlation between their actions and GUI response time
• Display only switch connectivity when “switch” tree is selected
• Shorter update time
• “Cleaner” map view, no unnecessary clutter

GUI map is populated only when pressing “play” button

“Only switches” view

Screenshot from a 4K node cluster in the US

Summary: UFM Benefits

Fabrics w/o UFM

Little Fabric Visibility
• Unnoticed performance degradation
• Difficult to assess impact

Low Performing Unutilized Fabrics
• Arbitrary routing algorithms, QoS seldom implemented
• Congested fabrics, latency affected

Complex and Manual Processes
• Needs admin skills
• Many options left unused

Ineffective Troubleshooting
• Long troubleshooting time
• Performance issues take days to analyze

UFM Customers

In-Depth Visibility and Control
• Clear health and performance visualization
• Business oriented impact and root cause analysis

Increased Performance
• Reduce congestion, lower latency
• Quicker application runtime

Simple and Automated
• Lowers administration task time from days to minutes

Quick Issue Resolution
• Dashboard, Alarms, Congestion Map
• Reduces downtime, high fabric utilization

Hands On

InfiniBand diagnostics tools – Hands On

Set up

• 2 servers with ConnectX HCA running SLES 11

• 8 port QDR IB switch based on InfiniScale 4 switch silicon

Steps

• Check HCA state

• Review /sys/class/infiniband filesystem

• Inventory: ibswitches, ibhosts

• ibnetdiscover

• perfquery, ibportstate, smpquery

• ibdiagnet

• Performance test
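Before starting the steps, it can save time to confirm that the tools they rely on are installed on both servers. A minimal Python sketch that checks the PATH for the tools named in this deck; the `missing_tools` helper and the exact tool selection are illustrative:

```python
import shutil

# Tools used in the hands-on steps, as named on the earlier slides.
HANDS_ON_TOOLS = [
    "ibstat", "ibv_devinfo",                # check HCA state
    "ibswitches", "ibhosts",                # inventory
    "ibnetdiscover",
    "perfquery", "ibportstate", "smpquery",
    "ibdiagnet",
    "ib_write_bw",                          # one of the performance tests
]


def missing_tools(tools=HANDS_ON_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All hands-on tools are available.")
```

The check only looks for executables; it does not verify that the HCA is up or that a subnet manager is running.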

Thank You www.mellanox.com