OpenVMS SDA Extensions and Use Cases
BCS

Rohit Prasad & Nagendra K V / August 29, 2012
OpenVMS Engineering

© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Webinar 2012 08 OpenVMS SDA Extension

    Agenda: What does this talk cover?

    Introduction to SDA Extensions
      What are they?
      How to start?
      How does it work?
    How to write an SDA Extension?
      Writing an SDA Extension
      Debugging an SDA Extension
    Use Cases
    References
    Q & A

    Introduction: OpenVMS SDA Extensions

    What is SDA?

    OpenVMS System Analysis Tool

    SDA stands for System Dump Analyzer.

    To activate SDA:
    $ ANALYZE/CRASH_DUMP    (analyze a system or process dump)
    $ ANALYZE/SYSTEM        (analyze the running system)

    SDA provides many commands to analyze a dump or a running system, for example to display and format system internal data structures.

    SDA Extension: What are they? 1(2)

    SDA Extension: What are they? 2(2)

    SDA Extension: How to start? 1(2)

    Issue the SDA extension command at the SDA> prompt.

    SDA searches for a logical name formed from the command name plus $SDA (for example, PCS$SDA for the PCS command). Extension images are searched for in SYS$LIBRARY whenever you issue a command that is not a native SDA command.

    For example:
    SDA> PCS    (PC sampling)
    SDA> FLT    (alignment fault tracing)
    SDA> IO     (I/O tracing)
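
    The naming rule above (command name plus $SDA) can be sketched in portable C. This is only an illustration of the string rule; the helper name sda_extension_logical is hypothetical and is not an SDA routine, and the upper-casing is an assumption made for readability.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the logical name that corresponds to an extension command:
 * the upper-cased command verb followed by "$SDA" (e.g. "pcs" -> "PCS$SDA").
 * Returns 0 on success, -1 if the command is empty or the buffer is too small.
 * Hypothetical helper for illustration; not part of SDA.
 */
int sda_extension_logical(const char *command, char *out, size_t out_size)
{
    size_t n = 0;

    if (command == NULL || out == NULL)
        return -1;

    /* Copy the verb up to the first blank, folding to upper case. */
    while (command[n] != '\0' && !isspace((unsigned char)command[n])) {
        if (n + 1 >= out_size)
            return -1;
        out[n] = (char)toupper((unsigned char)command[n]);
        n++;
    }
    if (n == 0)
        return -1;

    /* Append the "$SDA" suffix; fail if it (plus the NUL) does not fit. */
    if (snprintf(out + n, out_size - n, "$SDA") >= (int)(out_size - n))
        return -1;

    return 0;
}
```

    For instance, passing "flt show trace" yields FLT$SDA, since only the first word of the command selects the extension in this sketch.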

    SDA Extension: How to start? 2(2)

    Define the SDA extension logical name:
    $ DEFINE SDA_EXTENSION$SDA LOCATION:SDA_EXTENSION$SDA

    Activate SDA:
    $ ANALYZE /SYSTEM

    Give SDA extension commands:
    SDA> SDA_EXTENSION COMMAND1
    SDA> SDA_EXTENSION COMMAND2

    Generic commands:
    SDA> xxx LOAD
    SDA> xxx START TRACE
    SDA> xxx STOP TRACE
    SDA> xxx SHOW TRACE
    SDA> xxx UNLOAD
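
    An extension has to turn the text after its name into an action. The following portable C sketch maps the generic commands listed above to command codes. The enum, the function names, and the blank-collapsing behavior are assumptions made for illustration; this is not how any shipped extension parses its command line.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical command codes for the generic commands. */
typedef enum {
    CMD_UNKNOWN = 0,
    CMD_LOAD,
    CMD_START_TRACE,
    CMD_STOP_TRACE,
    CMD_SHOW_TRACE,
    CMD_UNLOAD
} ext_command;

/* Case-insensitive comparison of two NUL-terminated strings. */
static int equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Map the text that follows the extension name (for example "START TRACE")
 * to a command code. Leading/trailing blanks are dropped and runs of
 * blanks are collapsed to one before matching.
 */
ext_command parse_ext_command(const char *args)
{
    static const struct { const char *text; ext_command code; } table[] = {
        { "LOAD",        CMD_LOAD },
        { "START TRACE", CMD_START_TRACE },
        { "STOP TRACE",  CMD_STOP_TRACE },
        { "SHOW TRACE",  CMD_SHOW_TRACE },
        { "UNLOAD",      CMD_UNLOAD }
    };
    char normalized[64];
    size_t n = 0;
    size_t i;
    int pending_space = 0;

    if (args == NULL)
        return CMD_UNKNOWN;

    for (; *args != '\0'; args++) {
        if (isspace((unsigned char)*args)) {
            pending_space = (n > 0);   /* ignore leading blanks */
            continue;
        }
        if (pending_space) {
            if (n + 1 >= sizeof normalized)
                return CMD_UNKNOWN;
            normalized[n++] = ' ';
            pending_space = 0;
        }
        if (n + 1 >= sizeof normalized)
            return CMD_UNKNOWN;
        normalized[n++] = *args;
    }
    normalized[n] = '\0';

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (equal_nocase(normalized, table[i].text))
            return table[i].code;

    return CMD_UNKNOWN;
}
```

    A table-driven lookup keeps the list of commands in one place, so adding a command means adding one row rather than another branch.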

    SDA Extension: How does it work?

    SDA Extension: There are many

    The extensions supplied by OpenVMS can be identified by issuing the following command:

    $ DIRECTORY SYS$LIBRARY:*SDA.EXE

    Network

    LAN

    NET

    TCPIP

    Cluster

    LCK

    CNX

    ICC

    SHAD

    PE Driver

    File System

    XFC

    RMS

    CLUE XQP

    SHOW RMS

    DKLOG

    OIO

    IO

    PKM

    FC

    How to write an SDA Extension?

    SDA Extension: Write new SDA Extension 1(3)

    1) #include statements for DESCRIP.H and SDA_ROUTINES.H.
    2) The global variable SDA$EXTEND_VERSION, initialized as follows:
       int sda$extend_version = SDA_FLAGS$K_VERSION;
    3) The routine SDA$EXTEND, which is the entry point.
    4) The declaration of SDA$EXTEND:
       void sda$extend (
           int *transfer_table,
           struct dsc$descriptor_s *cmd_line,
           SDA_FLAGS sda_flags)
    5) The first executable statement of the routine must copy TRANSFER_TABLE to SDA$VECTOR_TABLE:
       sda$vector_table = transfer_table;
    6) The next statement should establish a condition handler:
       lib$establish (sda$cond_handler);

    SDA Extension: Write new SDA Extension 2(3)

    A minimal extension would be:

    #include <descrip.h>
    #include <sda_routines.h>

    /* Extension interface version, checked by SDA when the image is activated */
    int sda$extend_version = SDA_FLAGS$K_VERSION;

    void sda$extend (int *transfer_table,
                     struct dsc$descriptor_s *cmd_line,
                     SDA_FLAGS sda_flags)
    {
        sda$vector_table = transfer_table;   /* must be the first executable statement */
        lib$establish (sda$cond_handler);    /* establish the SDA condition handler */
        sda$print ("hello, world");
        return;
    }

    For more information on SDA extension callable routines, refer to the HP OpenVMS System Analysis Tools Manual, Chapter 10.

    SDA Extension: Write new SDA Extension 3(3)

    Compiling an SDA Extension

    $ cc sda_extension$sda + sys$library:sys$lib_c /library

    Linking an SDA Extension

    $ link /share -
        sda_extension$sda.obj, -
        sys$library:vms$volatile_private_interfaces /library, -
        sys$input /option
    symbol_vector = (sda$extend=procedure)
    symbol_vector = (sda$extend_version=data)

    SDA Extension: Debugging an Extension 1(3)

    An SDA extension can be debugged by using the SDA debug image (SDA_DEBUG.EXE), selected through a logical name, as follows:

    Compile the extension with /DEBUG /NOOPT and link it with /DEBUG:

    $ cc /debug /noopt sda_extension$sda + sys$library:sys$lib_c /library
    $ link /debug /share sda_extension$sda.obj, -
        sys$library:vms$volatile_private_interfaces /library, -
        sys$input /option
    symbol_vector = (sda$extend=procedure)
    symbol_vector = (sda$extend_version=data)

    Define logical names for SDA and the extension, and invoke SDA:

    $ define sda sda_debug
    $ define sda_extension$sda sda_extension$sda
    $ analyze /system

    SDA Extension: Debugging an Extension 2(3)

    Type SET BREAK START_EXTENSION at the DBG> prompt, and then type GO:

    DBG> set break start_extension
    DBG> go

    Invoke the extension at the SDA> prompt:

    SDA> sda_extension
    break at routine START\START_EXTENSION

    When Debug prompts again, use Debug commands to set breakpoints, and so on, in the extension, and then type GO:

    DBG> set image sda_extension$sda
    DBG> set language c
    DBG> set break /exception
    DBG> go

    SDA Extension: Debugging an Extension 3(3)

    Invoke the extension, providing the necessary arguments:

    SDA> sda_extension command_1
    ...
    SDA> sda_extension command_2
    ...
    %DEBUG-I-DYNMODSET, setting module sda_extension$SDA
    %SYSTEM-E-INVARG, invalid argument
    ...
    DBG>

    Use Cases: When and where to use an SDA Extension?

    Use Case 1: DKLOG (Disk Log) Tracing

    Disk log SDA Extension

    What is DKLOG Tracing?

    Displays logical I/Os to a SCSI disk. Can be used to:
      Monitor I/Os done by any component that issues logical I/O.
      Get the LBN of the file on which I/Os are made.

    DKLOG output:

    SDA> DKLOG SHOW DKA0
    DKDRIVER I/O Logging
    --------------------

    Device Name  UCB Addr  Path name  Entries  Oldest Entry
    ---------------------------------------------------------------------------
    $64$DKA0     8821EF80  PKA0.0     128      11:44:23.72

    Time         Log Point    I/O Function  VMS Status  SCSI Status   < Log Point Specific Data
    ----------------------------------------------------------------------------------------------------
    11:44:23.88  COMPLETE_IO  WRITEPBLK     NORMAL                    IRP 890A9A00 PID 8E3D4770 CBUSY 00
    11:44:23.88  CMD_ENDED                              GOOD          READ_10 LBA 0446f98f LTH
    11:44:23.88  CMD_BEGUN                              Initialized   READ_10 LBA 0446f98f LTH
    11:44:23.88  CMD_ENDED                              GOOD          WRITE_10 LBA 0446f98f LTH

    Use Case 1: Loss of cluster quorum

    Problem statement: A cluster node lost its connection to the quorum disk, with loss of cluster quorum.
      I/O to the quorum file failed.
      Quorum was lost for a brief amount of time.
      No crash dump.
      DKLOG could help to analyze the problem.

    Use Case 1: Loss of cluster quorum

    Get the LBN of QUORUM.DAT to map entries in the DKLOG traces.

    $ DUMP QUORUM.DAT /FILE_HEADER

    Identification area
        File name:        QUORUM.DAT;1
        Revision number:  1
        Creation date:    22-JAN-2006 03:58:45.57
        Revision date:    22-JAN-2006 03:58:45.58
        Expiration date:
        Backup date:      19-FEB-2008 19:23:24.76

    Map area
        Retrieval pointers
            Count: 512    LBN: 2101248

    SDA> eval ^d2101248
    Hex = 00000000.00201000   Decimal = 2101248
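
    Mapping a DKLOG entry back to a file amounts to checking whether the logged LBA falls inside one of the file's retrieval pointers. A minimal sketch in portable C follows, assuming the LBA shown in the trace can be compared directly with the LBN from the file header; the type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* One retrieval pointer from the file header: a run of contiguous blocks. */
typedef struct {
    uint64_t start_lbn;   /* first logical block number of the extent */
    uint64_t count;       /* number of blocks in the extent */
} extent;

/* Parse the hexadecimal LBA text shown in a DKLOG entry (e.g. "0446f98f"). */
uint64_t parse_hex_lba(const char *text)
{
    return (uint64_t)strtoull(text, NULL, 16);
}

/* Return 1 if the LBA falls inside the extent, 0 otherwise. */
int lba_in_extent(uint64_t lba, const extent *e)
{
    return lba >= e->start_lbn && lba - e->start_lbn < e->count;
}
```

    With the QUORUM.DAT values above (LBN 2101248, count 512), hex 00201000 is inside the extent, while the LBA 0446f98f from the earlier sample output is not.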

    Use Case 1: Loss of cluster quorum

    Snippet of DKLOG traces

    Device Name  UCB Addr  Path name                 Entries  Oldest Entry
    ---------------------------------------------------------------------------
    $1$DGA4401   81F07A40  PGA0.5005-0768-0130-3B3E  128      10:29:50.56

    Time         Log Point    I/O Function  VMS Status  SCSI Status   < Log Point Specific Data
    ----------------------------------------------------------------------------------------------------
    10:29:50.56  COMPLETE_IO  WRITEPBLK     NORMAL                    IRP 81C96580 PID 8198
    10:29:50.56  CMD_ENDED                              GOOD          WRITE_10 LBA
    10:29:50.56  CMD_BEGUN                              Initialized   WRITE_10 LBA
    10:29:50.56  KP_STARTIO   WRITEPBLK                               IRP 81C96580 PID 819
    10:30:05.56  COMPLETE_IO  READPBLK      MEDOFL                    IRP 81C96580 PID 8198
    10:30:05.56  SENSE_DATA                                           Sense Key UNIT ATTEN
    10:30:05.56  CMD_ENDED                              * CHK COND *  READ_10 LBA
    10:30:05.56  CMD_BEGUN                              Initialized   READ_10 LBA
    10:30:05.56  KP_STARTIO   READPBLK                                IRP 81C96580 PID 819

    10:30:05.56  KP_STARTIO   PACKACK                                 IRP 8219DD00 PID 819
    10:30:05.56  STARTIO      PACKACK                                 IRP 8219DD00 PID 819
    10:30:05.56  COMPLETE_IO  PATH_VERIFY   DEVCON                    IRP 8219DD00 PID 8198
    10:30:05.56  CMD_ENDED                              GOOD          MAINT IN 0a0
    10:30:05.56  CMD_BEGUN                              Initialized   MAINT IN 0a0
    10:30:05.56  CMD_ENDED                              GOOD          RD_DEVID 000
    10:30:14.56  COMPLETE_IO  READPBLK      NORMAL                    IRP 81C96580 PID 819
    10:30:14.56  CMD_ENDED                              GOOD          READ_10 LBA
    10:30:14.56  CMD_BEGUN                              Initialized   READ_10 LBA
    10:30:14.56  KP_STARTIO   READPBLK                                IRP 81C96580 PID 819

    Use Case 1: Loss of cluster quorum

    Case Resolution:

    The system manager was directed to their storage vendor to further investigate the unit attention of 3F/0E.

    Use Case 2: SPL (Spinlock) Tracing

    Spinlock Tracing SDA Extension

    What is SPL Tracing?

    Spinlock tracing allows you to trace spinlock acquisition, release, and time.

    Can be used to:
      Determine spinlock contention.
      Determine the reason for high MPSync.
      Trace CPUSPINWAIT bugchecks.

    The SYS$EXAMPLES:SPL.COM procedure can be used to gather useful spinlock information.

    Spinlock Tracing SDA Extension

    Spinlock Tracing Utility output:

    SDA> SPL SHOW TRACE

    Spinlock Trace Information:
    ---------------------------
    Timestamp               CPU  Spin/Forklock/IPL  Caller's/Fork PC                        EPID      Operation  T
    ----------------------  ---  -----------------  --------------------------------------  --------  ---------  -
    14-MAY 15:26:54.339033  07   A81B1600 MMG       805E6710 MMG$PAGEFAULT_C+00B90          31604028  Release    F
    14-MAY 15:26:54.339024  07   A81B1600 MMG       805E5D10 MMG$PAGEFAULT_C+00190          31604028  Acquire    F
    14-MAY 15:26:54.339004  00   A81B0D00 HWCLK     8012AA60 SYSTEM_PRIMITIVES_MIN+0011AA6  00000000  Release    F
    14-MAY 15:26:54.339004  00   A81B0D00 HWCLK     8012A700 SYSTEM_PRIMITIVES_MIN+0011A70  00000000  Acqnoipl   F

    SDA> SPL SHOW TRACE /SUMMARY

    Spinlock Trace Information: (at 14-MAY-2012 15:27:47.69, trace time 00:00:49.535768)
    ------------------------------------------------------------------------------------
                    Events    Acquires    Releases     Acq Own  Acq NoSpin   Spinwaits           %
    Spinlock          /sec        /sec        /sec        /sec        /sec        /sec    Spinwait
    ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
    HWCLK           2000.0      1000.0      1000.0         0.0         0.0         0.0         0.0
    SCHED             87.0        43.0        43.0         0.0         0.0         1.1         2.5
    MMG               18.3         4.0         8.9         0.0         5.0         0.4         4.8
    ----------  ----------  ----------  ----------  ----------  ----------  ----------  ----------
                    4844.6      2372.9      2409.9        30.1         7.6        24.1         1.0

    Use Case 2: System crashed with CPUSPINWAIT

    Problem statement: A 32-core system crashed with a CPUSPINWAIT bugcheck.
      The CPUSPINWAIT bugcheck was caused by the SCHED spinlock.
      The CPU was going through the scheduler idle loop at the time of the crash.
      The idle loop acquires the SCHED spinlock for a very brief amount of time.
      The scheduler might be a victim.
      SPL traces should help in finding the root cause.

    SDA> CLUE CRASH
    CPUSPINWAIT Bugcheck:
    Cause:                  timeout acquiring spinlock
    Spinlock name:          SCHED
    Spinlock address:       D81B1100
    Spinlock owner CPU Id:  00
    Crash CPU Id:           1D

    Use Case 2: System crashed with CPUSPINWAIT

    The SPL trace lists the CPUs that acquire and release the spinlock.

    SDA> SPL SHOW TRACE /NOFORK /SPINLOCK=SCHED

    Spinlock Trace Information:
    ---------------------------
    Timestamp              CPU  Spin/Forklock/IPL  Caller's/Fork PC                       EPID      Operation
    ---------------------  ---  -----------------  -------------------------------------  --------  ---------
    1-JUL 00:16:16.857577  1F   D81B1100 SCHED     80509AB0 SCH$QEND_C+00520              FFFFFFFF  Acquire
    1-JUL 00:16:15.596578  1E   D81B1100 SCHED     80509AB0 SCH$QEND_C+00520              FFFFFFFF  Acquire
    1-JUL 00:16:14.715576  1D   D81B1100 SCHED     80509AB0 SCH$QEND_C+00520              FFFFFFFF  Acquire
    1-JUL 00:16:14.559678  00   D81B1100 SCHED     80553190 SYS$HIBER_C+00A00             25800818  Acqnoipl
    1-JUL 00:16:14.559665  00   D81B1100 SCHED     804B7870 SCH$RESCHED_FOR_WAIT_C+005C0  25800818  Release
    1-JUL 00:16:14.559645  00   D81B1100 SCHED     80553190 SYS$HIBER_C+00A00             2580080C  Acqnoipl
    1-JUL 00:16:14.559615  00   D81B1100 SCHED     804B7870 SCH$RESCHED_FOR_WAIT_C+005C0  2580080C  Release
    1-JUL 00:16:14.559591  00   D81B1100 SCHED     804B7420 SCH$RESCHED_FOR_WAIT_C+00170  00000000  Acquire
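
    One way to read such a trace is to replay it oldest-first and track which CPU last acquired the spinlock without releasing it. The portable C sketch below does that for a single spinlock. It assumes that an Acquire entry logged while another CPU holds the lock represents a waiting CPU; that interpretation, and all type and function names, are assumptions for illustration.

```c
#include <stddef.h>

typedef enum { OP_ACQUIRE, OP_RELEASE } spl_op;

typedef struct {
    int    cpu;   /* CPU id from the trace line */
    spl_op op;    /* Acquire/Acqnoipl -> OP_ACQUIRE, Release -> OP_RELEASE */
} spl_event;

typedef struct {
    int owner;     /* CPU that appears to hold the lock, or -1 if none */
    int waiters;   /* acquire entries seen while another CPU held the lock */
} spl_state;

/*
 * Replay trace events for ONE spinlock. The trace lists the newest
 * entry first, so the array is walked backwards (oldest first).
 */
spl_state replay_spinlock(const spl_event *newest_first, size_t n)
{
    spl_state s = { -1, 0 };
    size_t i;

    for (i = n; i-- > 0; ) {
        const spl_event *e = &newest_first[i];

        if (e->op == OP_ACQUIRE) {
            if (s.owner == -1)
                s.owner = e->cpu;          /* lock was free: this CPU owns it */
            else if (s.owner != e->cpu)
                s.waiters++;               /* lock busy: count a waiter */
        } else if (s.owner == e->cpu) {
            s.owner = -1;                  /* owner released the lock */
            s.waiters = 0;
        }
    }
    return s;
}
```

    Replaying the eight SCHED entries above under these assumptions leaves CPU 00 as the apparent owner with three CPUs (1D, 1E, 1F) waiting, which agrees with the owner CPU reported by CLUE CRASH.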

    Use Case 2: System crashed with CPUSPINWAIT

    Case Resolution:
      The scheduler idle loop is the culprit.
      Added more traces in the scheduler idle loop to find the root cause.
      The scheduler was cycling between the two processes on the priority
      The bug in the scheduler code that resulted in the loop was fixed.

    Use Case 3: Fault Tracing (FLT)

    Alignment Fault Tracing SDA Extension

    What is FLT Tracing?

    The FLT SDA extension helps collect alignment fault traces. When you detect a high alignment fault rate, you might be interested in identifying:
      The process causing the fault.
      The mode of the fault.
      The routine causing the fault.
      The unaligned address.

    Alignment Fault Tracing SDA Extension

    SDA FLT extension output:

    SDA> FLT SHOW TRACE

    Unaligned Data Fault Trace Information:
    ---------------------------------------
    Timestamp               CPU  Exception PC                              Unaligned VA       Access  Trace Buffer
    ----------------------  ---  ----------------------------------------  -----------------  ------  -----------------
    11-MAY 12:01:01.206064  1B   FFFFFFFF.8049D3B1 IOC$TRANDEVNAM_C+006D1  00000000.7FF43EC4  Kern    FFFFFFFF.702EC2F8
                                 IOSUBPAGD + 000013B1 / IOC$TRANDEVNAM + 000006D1
    11-MAY 12:01:01.206062  1B   FFFFFFFF.8049D351 IOC$TRANDEVNAM_C+00671  00000000.7FF43EBC  Kern    FFFFFFFF.702EC290

    SDA> FLT SHOW TRACE /SUMMARY

    Fault Trace Information: (at 11-MAY-2012 12:01:22.46, trace time 00:00:02.702048)
    ---------------------------------------------------------------------------------
    Exception PC       Rate  Exception PC                Module       Offset
    -----------------  ----  --------------------------  -----------  --------
    FFFFFFFF.80422D70  0.37  IOC_STD$CREATE_UCB_C+00A20  IO_ROUTINES  00055970
                             SYSASSIGN + 00000930 / EXE$ASSIGN_LOCAL + 000006E0
    FFFFFFFF.80422D81  0.37  IOC_STD$CREATE_UCB_C+00A31  IO_ROUTINES  00055981
                             SYSASSIGN + 00000941 / EXE$ASSIGN_LOCAL + 000006F1

    Use Case 3: High MP-Sync on an SMP system

    Problem statement: High MPSync in an SMP system resulting in performance degradation.
      Lock thrashing (busy wait) results in high MPSync.
      High contention for a spinlock.
      Can also be because of alignment faults.
      A combination of SPL and FLT traces will help.

    Use Case 3: High MP-Sync on an SMP system

    Why performance degradation?

    $ MONITOR MODES

                            OpenVMS Monitor Utility
     +-----+                TIME IN PROCESSOR MODES
     | CUR |                   on node NODE_XX
     +-----+               11-MAY-2012 15:22:37.34

    Combined for 32 CPUs      0        800       1600      2400      3200
                              + - - - - + - - - - + - - - - + - - - - +
    Interrupt State        2  |
    MP Synchronization   720  |########
    Kernel Mode          118  |#
    Executive Mode            |
    Supervisor Mode           |
    User Mode                 |
    NOT AVAILABLE             |
    Idle Time           2360  |############################
                              + - - - - + - - - - + - - - - + - - - - +

    Use Case 3: High MP-Sync on an SMP system

    Who is causing high MPSync?

    SDA> SPL SHOW TRACE /SUMMARY

    Spinlock Trace Information: (at 11-MAY-2012 15:23:43.00, trace time 00:00:01.504909)
    ------------------------------------------------------------------------------------
                    Events    Acquires    Releases     Acq Own  Acq NoSpin   Spinwaits
    Spinlock          /sec        /sec        /sec        /sec        /sec        /sec  Spinw
    ----------  ----------  ----------  ----------  ----------  ----------  ----------  --------
    MEGA               2.7         1.3         1.3         0.0         0.0         0.0
    HWCLK           1997.5       998.7       998.7         0.0         0.0         0.0
    SCHED           1441.9       717.7       720.3         0.0         2.7         1.3
    MMG            66438.0     22532.3     22512.3         0.0         0.0     22393.4  2
    TIMER           1271.8       635.9       635.9         0.0         0.0         0.0
    IOLOCK8          215.3        99.0       107.6         8.6         0.0         0.0
    LCKMGR             5.3         2.7         2.7         0.0         0.0         0.0
    FILSYS             2.7         1.3         1.3         0.0         0.0         0.0
    QUEUEAST          10.6         5.3         5.3         0.0         0.0         0.0
    Dynamic         2127.7       975.5      1063.2        45.8        43.2         0.0
    ----------  ----------  ----------  ----------  ----------  ----------  ----------  --------
                   73513.5     25969.7     26048.6        54.4        45.9     22394.7  2
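
    To answer "who is causing high MPSync?" from a summary like this, one can look for the spinlock with the highest spinwait rate and its share of the total. A minimal portable C sketch follows; the struct and function names are hypothetical, and the rows would be filled in from the Spinwaits /sec column.

```c
#include <stddef.h>

/* One row of the spinlock summary: lock name and its spinwait rate. */
typedef struct {
    const char *name;
    double      spinwaits_per_sec;
} spl_summary_row;

/* Return the index of the row with the highest spinwait rate, or -1 if empty. */
int busiest_spinlock(const spl_summary_row *rows, size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++)
        if (best < 0 || rows[i].spinwaits_per_sec > rows[best].spinwaits_per_sec)
            best = (int)i;
    return best;
}

/* Fraction (0.0 to 1.0) of all spinwaits attributed to rows[index]. */
double spinwait_share(const spl_summary_row *rows, size_t n, size_t index)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        total += rows[i].spinwaits_per_sec;
    if (index >= n || total <= 0.0)
        return 0.0;
    return rows[index].spinwaits_per_sec / total;
}
```

    With the rates above, MMG accounts for 22393.4 of the 22394.7 spinwaits per second, that is, nearly all of them.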

    Use Case 3: High MP-Sync on an SMP system

    Is this because of alignment faults?

    $ MONITOR ALIGNMENT_FAULTS

                        OpenVMS Monitor Utility
                      ALIGNMENT FAULT STATISTICS
                           on node NODE_XX
                       11-MAY-2012 15:25:01.79

                               CUR        AVE        MIN        MAX
    Kernel Fault Rate         1.33       1.33       1.33       1.33
    Exec Fault Rate           0.00       0.00       0.00       0.00
    Super Fault Rate          0.00       0.00       0.00       0.00
    User Fault Rate      117689.66  117689.66  117689.66  117689.66
    Total Fault Rate     117690.66  117690.66  117690.66  117690.66

    Use Case 3: High MP-Sync on an SMP system

    SDA> FLT SHOW TRACE /SUMMARY

    Fault Trace Information: (at 11-MAY-2012 15:27:02.61, trace time 00:00:02.828362)
    ---------------------------------------------------------------------------------
    Exception PC       Rate      Exception PC     Module  Offset
    -----------------  --------  ---------------  ------  ------
    FFFFFFFF.85E4C441  31133.77  LIBRTL+0018A441  LIBRTL
    FFFFFFFF.85E4C441  30133.77  LIBRTL+0018A401  LIBRTL
    FFFFFFFF.841A3341  28332.90  LIBRTL+00187341  LIBRTL
    FFFFFFFF.85E4C0D0   3534.39  LIBRTL+0018A0D0  LIBRTL
    FFFFFFFF.841A3050   3219.12  LIBRTL+00187050  LIBRTL
    FFFFFFFF.85E4C0D0   2331.39  LIBRTL+0018A0D0  LIBRTL
    FFFFFFFF.841A3050   1265.12  LIBRTL+00187050  LIBRTL
    FFFFFFFF.85E4C0D0    600.39  LIBRTL+0018A0D0  LIBRTL
    FFFFFFFF.841A3050    600.12  LIBRTL+00187050  LIBRTL

    Use Case 3: High MP-Sync on an SMP system

    Case Resolution:

    There are multiple options to resolve this issue:
      Align the data.
      Hint to the compiler that the data about to be accessed is unaligned:
        o __unaligned (C)
        o .set_registers unaligned= (Macro)
        o align(x) (Bliss32/Bliss64)
        o aligned(x) (Pascal)
      Copy the data to an aligned buffer.
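
    The "copy the data to an aligned buffer" option can be sketched in standard C with memcpy. The function names below are hypothetical; the technique is the usual portable way to read or write a value at an address that may not be naturally aligned, without dereferencing a misaligned pointer.

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from an address that may not be 4-byte aligned.
 * The bytes are copied into an aligned local variable, so no misaligned
 * 32-bit load is performed through a pointer dereference.
 */
uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;

    memcpy(&value, src, sizeof value);
    return value;
}

/* Write a 32-bit value to an address that may not be 4-byte aligned. */
void store_u32_unaligned(void *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}
```

    For example, a field at offset 1 of a byte buffer can be written with store_u32_unaligned(buf + 1, v) and read back with load_u32_unaligned(buf + 1), instead of casting buf + 1 to a uint32_t pointer.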

    Use Case 4: MUTEX (MTX) Tracing

    Mutex Tracing SDA Extension

    MTX Tracing

    Mutex tracing allows you to track mutex locks and unlocks. Can be used to:
      Determine heavily used mutexes.
      Determine mutex contention.
      Trace MTXCNTNONZ bugchecks.

    Mutex Tracing Utility (MTX) output:

    SDA> MTX SHOW TRACE

    Mutex Trace Information:
    ------------------------
    Timestamp               CPU  Mutex          Callers PC                              EPID      Operation
    ----------------------  ---  -------------  --------------------------------------  --------  ---------
    14-MAY 15:17:21.982068  00   A801E800 IODB  FFFFFFFF.81430240 TCPIP$TNDRIVER+23D40  FFFFFFFF  Unlock
    14-MAY 15:17:21.982067  00   A801E800 IODB  FFFFFFFF.814300F0 TCPIP$TNDRIVER+23BF0  FFFFFFFF  Lock Re
    14-MAY 15:17:21.635036  0A   A801E800 IODB  FFFFFFFF.804235B0 IO_ROUTINES+561B0     3160400D  Unlock

    Use Case 4: System Hang

    Problem statement: System hung / forced crash due to processes in a MUTEX state.
      This forced crash was the third in a series of intermittent forced crashes.
      Many processes were in MUTEX state, waiting on the logical name database mutex.
      Analyzing the dump does not give any clue.
      Mutex tracing should help in analyzing the crash.

    Use Case 4: System Hang

    SDA> show summary

    Current process summary
    -----------------------
    Extended  Indx Process name    Username     State   Pri PCB/KTB  PHD      Wkset
    -- PID -- ---- --------------- ------------ ------- --- -------- -------- ------
    24400411  0011 AUDIT_SERVER    AUDIT$SERVER HIB      10 8340ABC0 8702C000    173
    24400412  0012 JOB_CONTROL     SYSTEM       MUTEX    10 8340D440 87036000    120
    24400414  0014 SECURITY_SERVER SYSTEM       MUTEX    10 8340FD40 87040000    481
    24400421  0021 MONITOR_SERVER  SYSTEM       HIB      15 8349A700 8705E000    126
    24400422  0022 NETACP          DECNET       MUTEX    10 83470080 86FF0000    121
    24400423  0023 EVL             DECNET       HIB       6 83498F40 87068000    161
    24400424  0024 REMACP          SYSTEM       MUTEX    10 83499F00 87072000     34
    24400425  0025 MULTINET_SERVER SYSTEM       MUTEX     6 834A7CC0 8707C000    352
    24400426  0026 SYMBIONT_76     SYSTEM       HIB       6 834F22C0 87086000     28
    24400427  0027 NAMED_SERVER    SYSTEM       MUTEX     6 834FF780 87090000   3382
    24400428  0028 NTP_SERVER      SYSTEM       MUTEX    31 8350BFC0 8709A000    343
    24400429  0029 SSHD Master     SYSTEM       MUTEX     6 8350C580 870A4000    385
    2440042B  002B NFS_SERVER      SYSTEM       MUTEX     6 8351CD00 870B8000    267
    2440042C  002C SYMBIONT_77     SYSTEM       HIB       6 83528E00 870C2000     28
    2440042D  002D NFS_SERVERIO_1  SYSTEM       LEF       4 835343C0 870CC000    112
    2440362E  022E FFTB_DBS        SLCSHR       MUTEX    16 835CE640 873EC000   1366
    2440042F  002F SYMBIONT_79     SYSTEM       HIB       6 8353A4C0 870E0000     28
    24400430  0030 DFG$MCC         SYSTEM       LEF       6 8353CC40 870EA000    319

    Use Case 4: System Hang

    SDA> show process /index=14
    Process index: 0014   Name: SECURITY_SERVER   Extended PID: 24400414
    --------------------------------------------------------------------
    .
    State                   MUTEX     Flags                 00000000
    Base priority           8         Current priority      10
    Waiting EF cluster      4         Event flag wait mask  8280E0C0
    CPU since last quantum  000000C7  Mutex count           0
    ASTs active             NONE

    SDA> exam 8280E0C0
    LNM$AQ_MUTEX:  00000000.00000001  "........"

    SDA> exam lnm$aq_mutex
    LNM$AQ_MUTEX:  FFFFFFFF.00000000  "........"

    Use Case 4: System Hang

    SDA> MTX SHOW TRACE

    Mutex Trace Information:
    ------------------------
    Timestamp               CPU  Mutex         Callers PC                       EPID      Operation
    ----------------------  ---  ------------  -------------------------------  --------  ----------------------
    28-AUG 08:09:31.326136  00   8280E0C0 LNM  8022C10C LOGICAL_NAMES+0610C     24404592  Lock Write Quad (wait)
    28-AUG 08:09:31.325264  03   8280E0C0 LNM  8022C10C LOGICAL_NAMES+0610C     24403A74  Lock Write Quad (wait)
    28-AUG 08:09:31.325027  03   8280E0C0 LNM  8022C10C LOGICAL_NAMES+0610C     24403E70  Lock Write Quad (wait)
    ...
    28-AUG 07:45:13.660793  03   8280E0C0 LNM  8022C10C LOGICAL_NAMES+0610C     24403E8F  Lock Write Quad (wait)
    28-AUG 07:45:13.430261  01   8280E0C0 LNM  8022EE2C LOGICAL_NAMES+08E2C     24403D30  Lock Read Quad (wait)
    28-AUG 07:45:13.309330  00   8280E0C0 LNM  80227174 LNM$SEARCH_ONE_C+00074  2440362E  Lock Read Quad (wait)
    28-AUG 07:45:13.309044  03   8280E0C0 LNM  8022C10C LOGICAL_NAMES+0610C     244012CE  Lock Write Quad (wait)
    ...
    28-AUG 07:43:29.626096  02   8280E0C0 LNM  8022F4B4 LOGICAL_NAMES+094B4     24401222  Unlock Quad
    28-AUG 07:43:29.626094  02   8280E0C0 LNM  8022EE2C LOGICAL_NAMES+08E2C     24401222  Lock Read Quad
    28-AUG 07:43:29.626020  02   8280E0C0 LNM  8022F4B4 LOGICAL_NAMES+094B4     24401222  Unlock Quad

    Use Case 4: System Hang

    Case Resolution:
      The logical name database mutex is not owned.
      No evidence of a software bug.
      In one of the crashes, CPU 3 didn't respond to the bugcheck request.
      The customer was asked to replace CPU 03, which resolved the problem.

    References

    Tracing Tools on OpenVMS
    http://h71000.www7.hp.com/openvms/products/t4/openvms_tracing_tools.pdf

    HP OpenVMS System Analysis Tools Manual
    http://h71000.www7.hp.com/doc/82final/6549/6549pro.html

    Questions/Comments

    Contact Office of Customer [email protected]

    Q & A

    Thank you