SunFire 15-25K Domains

================================================================================

SunFire 15/25K domains

--------------------------------------------------------------------------------

Set the ssh escape character when connecting to the SC, so that the escape sequence does not drop you back to your original login window

%ssh -e ^v^o sun-sc-server-name #sets control-o as escape

#(a control-v is typed first so that the

#following control-o is entered literally)

entries required in /etc/inetd.conf on each domain for DR activities (after editing, make inetd reread the file):

pkill -1 inetd

----

#entries for Sun DR

sun-dr stream tcp wait root /usr/lib/dcs dcs

sun-dr stream tcp6 wait root /usr/lib/dcs dcs

----

console logs on SC: /var/opt/SUNWSMS/adm/$DOMAIN-LETTER

/var/opt/SUNWSMS/adm/platform

failover SC

/opt/SUNWSMS/SMS1.5/bin/failover force

restart sms

/opt/SUNWSMS/SMS1.5/bin/failover stop

/etc/init.d/sms stop

/etc/init.d/sms start

/opt/SUNWSMS/SMS1.5/bin/failover start

reconfig with smsconfig:

setfailover off

/etc/init.d/sms stop

smsconfig -m

/etc/init.d/sms start

setfailover on

setfailover force

#must reboot each SC when this is completed

#may have to change /etc/hostname.dman0 on each domain & reconfig that NIC

#repeat on other SC

delete a board:

cfgadm -av (determine system names and attachment names)

cfgadm -c disconnect Ap_Id

cfgadm -x unassign Ap_Id (powers the board off, amber light will come on board)

add a board:

cfgadm -av (determine system names and attachment names)

cfgadm -x assign Ap_Id

cfgadm -c configure Ap_Id

see if a board cannot be moved (it holds kernel/permanent memory): cfgadm -alv | grep permanent

You can check the system board status via:

cfgadm -av Ap_Id

Can check number of cpus and memory via "psrinfo" and "prtconf | head -5" commands from each OS instance. Can see all cpu boards on a 6800 via "showboards -p cpu"; can see all cpus in a domain via "showboards -d <domain> -p cpu":

somehost:SC> showboards -d a -p cpu

Component Description

--------- -----------

/N0/SB0/P0 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB0/P1 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB0/P2 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB0/P3 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB2/P0 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB2/P1 UltraSPARC-III, 750MHz, 8M ECache

Page 1 of 24

2/15/2012http://www.4bcad.com/doc/misc/solaris_domains

/N0/SB2/P2 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB2/P3 UltraSPARC-III, 750MHz, 8M ECache

somehost:SC> showboards -d c -p cpu

Component Description

--------- -----------

/N0/SB1/P0 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB1/P1 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB1/P2 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB1/P3 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB3/P0 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB3/P1 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB3/P2 UltraSPARC-III, 750MHz, 8M ECache

/N0/SB3/P3 UltraSPARC-III, 750MHz, 8M ECache

somehost:SC> showboards -d a -p memory

Component Size Reason

--------- ---- ------

/N0/SB0 4096 MB

/N0/SB2 4096 MB

somehost:SC> showboards -d c -p memory

Component Size Reason

--------- ---- ------

/N0/SB1 4096 MB

/N0/SB3 4096 MB

showplatform

showboards -v -d A

showboards -d A

moveboard -d (todomain) -r 2 -t 300 SB#

deleteboard -r 2 -t 300 SB#

addboard -d (todomain) -r 2 -t 300 SB#

--------------------------------------------------------------------------------

poweron EX0 (May already be on)

poweron SB0

flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash SB0

You can run multiple flash updates at the same time from multiple windows to reduce

the total time (avg. 7 min/board), but they all load the system controller.

The SC is the choke point, so do not run too many at once.
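One way to keep the number of simultaneous flash updates under control is to plan them in small batches first. The sketch below only prints the commands, grouped into batches, and executes nothing. The function name `flash_batches` and the batch size of 2 are assumptions made for illustration; the image path is the one used in the notes above.

```shell
# Print flashupdate commands for the given boards, grouped into batches.
# Usage: flash_batches <batch_size> <board>...
# Dry run only: the commands are printed, not executed.
flash_batches() {
    size=$1; shift
    image=/opt/SUNWSMS/hostobjs/sgcpu.flash
    n=0
    for sb in "$@"; do
        # Start a new batch header every <size> boards.
        [ $((n % size)) -eq 0 ] && echo "# batch $((n / size + 1))"
        echo "flashupdate -f $image $sb"
        n=$((n + 1))
    done
}

flash_batches 2 SB0 SB1 SB2
```

Run one batch, wait for it to finish, then start the next; the right batch size depends on how heavily the SC is loaded.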

As system boards become available the domains need to be reconfigured as

follows

server1: SB0 SB1 SB2 SB3

server2: SB4 SB5 SB6 SB7 SB8 SB9 SB10

server3: SB11 SB12 SB13 SB14 SB15 SB16 SB17

Reconfigure server:

As the boards are replaced reconfigure server to only have 4 SB and bring

that domain up first.

On the SC run.

deleteboard SB4 SB5 SB6 SB7 SB8 SB9 SB10 SB11 SB12 SB13 SB14 SB15 SB16 SB17

This will leave SB0 SB1 SB2 and SB3 in the server domain.

Then:

cd ~/extended Note: This directory has a .postrc file set to level 32.

setkeyswitch -d server on

This will run a level 32 POST on the boards and boot the domain.

After the system comes up it can be turned over to the DBAs.

Reconfigure server:

On the SC run the following command after the SB have been Flashed.

addboard -d server -y -c configure SB4 SB5 SB6 SB7 SB8 SB9 SB10

cd ~/extended

setkeyswitch -d server on

Reconfigure server:

On the SC run the following command after the SB have been Flashed.

addboard -d server -y -c configure SB11 SB12 SB13 SB14 SB15 SB16 SB17

cd ~/extended

setkeyswitch -d server on

Other info needed:

login as root and then:

su - sms-svc

To get to a console of a domain.

console -d server

To shutdown a domain:

from the domain:

init 0

Then from the SC:

setkeyswitch -d server off

To power a single board on

poweron SB0

Note: you may have to power on the expander board before the SB (poweron EX0).

================================================================================

----------------------------------------------------------------------

MAN Configuration Files

----------------------------------------------------------------------

/etc/opt/SUNWSMS/config/MAN.cf

-This is the MAN configuration file.

-Located on both SCs.

-Read by mand when entering MAIN SC mode.

-This file contains all the IP addresses for the SC, domain, and external

community interfaces.

-This file is created by smsconfig -m during the MAN Network

configuration stage at install.

-This file is unique to each SC and should not be copied from one SC to

another.

DO NOT edit this file by hand. Use smsconfig to change the

configuration. See the smsconfig man page for details.

Example (from an SC on a CP1500 board, called "sc0", on a platform called "15k"):

# /etc/opt/SUNWSMS/config/MAN.cf

#

15k

C SC-FLOATER-C1 sc 192.68.66.172

C SC-TEST-hme0-C1 sc0-hme0

C SC-TEST-eri1-C1 sc0-eri1

I1 NM-I1 netmask-i1 255.255.255.224

I1 SC-I1 15k-sc-i1 13.1.1.1

I1 DA-I1 15k-a 13.1.1.2

I1 DB-I1 15k-b 13.1.1.3

I1 DC-I1 15k-c 13.1.1.4

I1 DD-I1 15k-d 13.1.1.5

I1 DE-I1 15k-e 13.1.1.6

I1 DF-I1 15k-f 13.1.1.7

I1 DG-I1 15k-g 13.1.1.8

I1 DH-I1 15k-h 13.1.1.9

I1 DI-I1 15k-i 13.1.1.10

I1 DJ-I1 15k-j 13.1.1.11

I1 DK-I1 15k-k 13.1.1.12

I1 DL-I1 15k-l 13.1.1.13

I1 DM-I1 15k-m 13.1.1.14

I1 DN-I1 15k-n 13.1.1.15

I1 DO-I1 15k-o 13.1.1.16

I1 DP-I1 15k-p 13.1.1.17

I1 DQ-I1 15k-q 13.1.1.18

I1 DR-I1 15k-r 13.1.1.19

I2 NM-I2 netmask-i2 255.255.255.252

I2 SC0-I2 sc0-i2 10.2.1.1

I2 SC1-I2 sc1-i2 10.2.1.2
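Because the I1 entries follow a fixed four-column layout (network, tag, hostname, address), the per-domain addresses can be pulled out with a read-only filter, for example to compare them against /etc/hosts. This is a minimal sketch that assumes the layout shown in the example above; the function name `man_cf_domain_ips` is made up for illustration, and the file is only read, never edited.

```shell
# Print "hostname address" for every domain entry (tags DA-I1 .. DR-I1)
# on the I1 network of a MAN.cf-style file. Hypothetical read-only helper.
man_cf_domain_ips() {
    awk '$1 == "I1" && $2 ~ /^D[A-R]-I1$/ { print $3, $4 }' "$1"
}

# Self-contained demo using a few lines copied from the example above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
15k
C SC-FLOATER-C1 sc 192.68.66.172
I1 NM-I1 netmask-i1 255.255.255.224
I1 SC-I1 15k-sc-i1 13.1.1.1
I1 DA-I1 15k-a 13.1.1.2
I1 DB-I1 15k-b 13.1.1.3
I2 SC0-I2 sc0-i2 10.2.1.1
EOF
man_cf_domain_ips "$sample"
rm -f "$sample"
```

On an SC the argument would be /etc/opt/SUNWSMS/config/MAN.cf; the netmask and SC lines are skipped because their tags do not start with D.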

/var/opt/SUNWSMS/data/<domain>/idprom.image

-This file contains the ethernet address of the domain.

-Located on the SCs.

-This file is referenced by mand when creating the IOSRAM handoff

structure (end of HPOST).

-To view this file, you must run sysid -d <DomainID> on the SC.

-Back these files up. If lost, Sun Support will have to recreate the

files for you, or walk you through the process.

Example:

# sysid -d A

IDPROM in /var/opt/SUNWSMS/data/A/idprom.image for domain A

Format = 0x01

Machine Type = 0x82

Ethernet Address = 0:0:ae:a6:4a:42

Manufacturing Date = Wed Jun 27 14:58:00 MDT 2001

Serial number (Host ID) = 0xa64a42 (11029056)

Checksum = 0xac

/var/opt/SUNWSMS/doors/mand

-This is the door for mand.

-Located on the SCs.

-This allows other processes to invoke mand methods. For example, fomd

instructing mand to enter MAIN mode.

Note that this is not a regular file; it is a door, a file system

object that allows interprocess communication.

Example:

# file /var/opt/SUNWSMS/doors/mand

/var/opt/SUNWSMS/doors/mand: door to mand[18356]

/etc/hosts

-This is the standard Solaris[TM] hosts listing.

-Located on SCs and domains in the platform.

-It contains certain hosts specific to the MAN networks.

Example (from same SC as shown in the MAN.cf example above):

# cat /etc/hosts

#

# Internet host table

#

127.0.0.1 localhost

192.68.66.148 sc0 loghost

13.1.1.2 15k-a #smsconfig-entry#

13.1.1.3 15k-b #smsconfig-entry#

13.1.1.4 15k-c #smsconfig-entry#

13.1.1.5 15k-d #smsconfig-entry#

13.1.1.6 15k-e #smsconfig-entry#

13.1.1.7 15k-f #smsconfig-entry#

13.1.1.8 15k-g #smsconfig-entry#

13.1.1.9 15k-h #smsconfig-entry#

13.1.1.10 15k-i #smsconfig-entry#

13.1.1.11 15k-j #smsconfig-entry#

13.1.1.12 15k-k #smsconfig-entry#

13.1.1.13 15k-l #smsconfig-entry#

13.1.1.14 15k-m #smsconfig-entry#

13.1.1.15 15k-n #smsconfig-entry#

13.1.1.16 15k-o #smsconfig-entry#

13.1.1.17 15k-p #smsconfig-entry#

13.1.1.18 15k-q #smsconfig-entry#

13.1.1.19 15k-r #smsconfig-entry#

192.68.66.172 sc #smsconfig-entry#

13.1.1.1 sc-i1 #smsconfig-entry#

192.68.66.250 sc0-C1-failover #smsconfig-entry#

192.68.66.170 sc0-eri1 #smsconfig-entry#

192.68.66.171 sc0-hme0 #smsconfig-entry#

10.2.1.1 sc0-i2 #smsconfig-entry#

10.2.1.2 sc1-i2 #smsconfig-entry#

/etc/hostname.*

-These files exist in dman0, scman0, and scman1 varieties.

-Located on the SCs (dman0 being on domains only).

-They are referenced by drivers when the interfaces are plumbed during

the boot process.

Example from an SC:

# cat /etc/hostname.hme0

sc0

# cat /etc/hostname.scman0

13.1.1.1 netmask + private up

# cat /etc/hostname.scman1

10.2.1.1 netmask + private up

Example from a Domain:

# cat /etc/hostname.dman0

13.1.1.2 netmask + broadcast + private up

/etc/ipnodes

-This is the standard Solaris hosts listing which contains relevant IPv6

hosts (if used).

-Located on SCs and domains where appropriate.

/etc/netmasks

-This is the standard Solaris netmasks settings file.

-Located on system controllers and domains.

-It contains netmasks relevant to the MAN Networks.

Example:

# cat netmasks
#
# The netmasks file associates Internet Protocol (IP) address
# masks with IP network numbers.
#
#       network-number  netmask
#
# The term network-number refers to a number obtained from the Internet Network
# Information Center. Currently this number is restricted to being a class
# A, B, or C network number. In the future we should be able to support
# arbitrary network numbers per the Classless Internet Domain Routing
# guidelines.
#
# Both the network-number and the netmasks are specified in
# "decimal dot" notation, e.g.:
#
#       128.32.0.0 255.255.255.0
#
192.168.66.0 255.255.255.0
192.168.103.0 255.255.255.224
192.168.103.32 255.255.255.252
13.1.1.0 255.255.255.224
10.2.1.0 255.255.255.252
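To check that an address really falls inside one of the MAN networks listed in the netmasks file, AND the address with the netmask and compare the result to the network number. The sketch below does this in plain POSIX shell arithmetic; the function names `ip_to_int` and `in_network` are made up for illustration, and the addresses in the demo are the I1 values from the examples in this section.

```shell
# Convert a dotted-quad IPv4 address to a single integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 lies in network $2 under netmask $3.
in_network() {
    [ $(( $(ip_to_int "$1") & $(ip_to_int "$3") )) -eq "$(ip_to_int "$2")" ]
}

for ip in 13.1.1.2 13.1.1.19 13.1.1.40; do
    if in_network "$ip" 13.1.1.0 255.255.255.224; then
        echo "$ip is inside the I1 network"
    else
        echo "$ip is outside the I1 network"
    fi
done
```

With a 255.255.255.224 mask the I1 network covers 13.1.1.0 through 13.1.1.31, which is why all eighteen domain addresses (13.1.1.2 through 13.1.1.19) fit while 13.1.1.40 does not.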

Also worth noting: if no default router is configured, IPMP takes

longer to start up properly. This can have a negative effect, because

the longer IPMP negotiation may cause SMS to experience difficulties

during startup.

References:

- Infodoc 82102 "Sun Fire[TM] 12K/15K/E20K/E25K: Supported Ethernet settings

for System Controllers"

- Infodoc 73002 "Sun Fire[TM] 12K/15K: MAN Interface Mapping"

- SRDB 48123 "Sun Fire[TM] 12K/15K: Troubleshooting the I1 MAN Network"

- Troubleshooting Article 72578 "Sun Fire[TM] 12K/15K: Troubleshooting MAN I2

Network."

- SRDB 48144 "Sun Fire[TM] 15K: IDPROM layout for OpenBoot[TM] PROM failed"

This shows how to recreate lost/corrupted idprom.image files, as does the

site,

http://has.central.sun.com/starcat/15kinfo/faq/recreate_idprom.image.html.

================================================================================

15/25k domain commands

frame1-sc0:sms-svc:3> help

usage:

addboard -d domain_indicator [-c function] [-r retry_count [-t timeout]]

[-q] [-f] [-y|-n] location...

addboard -h

usage:

addcodlicense <license-signature>

addcodlicense -h

usage:

addtag -d domain_indicator [-q] [-y|-n] new_tag

addtag -h

usage:

cancelcmdsync cmdsync_descriptor

cancelcmdsync -h

usage:

/opt/SUNWSMS/SMS1.5/bin/console -d domain_indicator [[-f] | [-l] | [-g] | [-r]] [-e escapeChar]

/opt/SUNWSMS/SMS1.5/bin/console -h

usage:

deleteboard [-c function] [-r retry_count [-t timeout]] [-q] [-f] [-y|-n] location...

deleteboard -h

usage:

deletecodlicense [-f] <license-signature>

deletecodlicense -h

usage:

deletetag -d domain_indicator [-q] [-y|-n]

deletetag -h

usage:

disablecomponent [-d domain_indicator] [-i "reason"]

location...

disablecomponent [-h]

usage:

enablecodboard [-y|-n] [location]

enablecodboard -h

usage:

enablecomponent [-a | -d domain_indicator] location...

enablecomponent [-h]

usage:

flashupdate -d domain_indicator -f path [-q|-v] [-y|-n]

flashupdate -f path [-q|-v] [-y|-n] location...

flashupdate -h

path : Image file name.

location: slotID[/fp[0|1]]

usage:

help [command_name]

help -h

usage:

initcmdsync script_name [parameters]

initcmdsync -h

usage:

moveboard -d domain_indicator [-c function] [-r retry_count [-t timeout]]

[-q] [-f] [-y|-n] location

moveboard -h

usage:

poweroff [-q] [-y|-n] [location]

poweroff -h

usage:

poweron [-q] [-y|-n] [location]

poweron -h

Usage:

rcfgadm -d domain_indicator [-f] [-y|-n] [-v] [-o hardware_options]

-c function [-r retry_count [-T timeout] ] ap_id...

rcfgadm -d domain_indicator [-f] [-y|-n] [-v] [-o hardware_options]

-x hardware_function ap_id...

rcfgadm -d domain_indicator [-v] [-a] [-s listing_options]

[-o hardware_options] [-l [ap_id|ap_type...]]

rcfgadm -d domain_indicator [-v] [-o hardware_options] -t ap_id...

rcfgadm -d domain_indicator [-v] [-o hardware_options] -h

[ap_id|ap_type...]

usage:

reset -d domain_indicator[,domain_indicator]...

[-d domain_indicator[,domain_indicator]...]... [-q] [-y | -n] [-x]

reset -h

usage:

resetsc [-q] [-y|-n]

resetsc -h

usage:

runcmdsync script_name [parameters]

runcmdsync -h

usage:

savecmdsync -M identifier cmdsync_descriptor

savecmdsync -h

usage:

setbus [-q] [-y|-n] -c cs0|cs1|cs0,cs1 [-b buses] [location...]

setbus -h

usage:

setdatasync [-i interval] schedule filename

setdatasync push filename

setdatasync cancel filename

setdatasync backup

setdatasync -h

usage:

setdate [-d domain_indicator] [-u] [-q] [mmdd]HHMM|mmddHHMM[cc]yy[.SS]

setdate -h

usage:

setdefaults [-d domain_indicator [-p]] [-y|n]

setdefaults -h

usage:

setfailover on|off|force

setfailover -h

usage:

setkeyswitch -d domain_indicator [-q] [-y|-n]

position

setkeyswitch -h

usage:

setobpparams -d domain_indicator param=value...

setobpparams -h

usage:

setupplatform

setupplatform -p available [-d domain_indicator [-a|-r] location...]

setupplatform -p cod [headroom | -d domain_indicator domainRTU]

setupplatform [-d domain_indicator -]

setupplatform -h

usage:

showboards [-d domain_indicator] [-v]

showboards [-d domain_indicator] -c

showboards -h

usage:

showbus [-v]

showbus -h

usage:

showcmdsync [-v]

showcmdsync -h

usage:

showcodlicense [-r] [-v]

showcodlicense -h

usage:

showcodusage [-v] [-p <resource|domains>]

showcodusage -h

usage:

showcomponent [-a | -d domain_indicator] [-v] [location...]

showcomponent [-h]

usage:

showdatasync [-l|-Q] [-v]

showdatasync -h

usage:

showdate [-d domain_indicator] [-u] [-v]

showdate -h

usage:

showdevices [-v] [-p bydevice|byboard|query|force] location...

showdevices [-v] [-p bydevice|byboard]

-d domain_indicator

showdevices -h

usage:

showenvironment [-d domain_indicator[,domain_indicator]...]... [-p

temps|volts|currents|fans|powers[,temps|volts|currents|fans|powers]...]...

[-v]

showenvironment [-d domain_indicator[,domain_indicator]...]... [-p faults]

[-v]

showenvironment -h

usage:

showfailover [-r] [-v]

showfailover -h

usage:

showkeyswitch -d domain_indicator [-v]

showkeyswitch -h

usage:

showlogs [ -F ] [ -f filename ] [ -d domain_indicator ] [ -p m|c|s ] [ -v ]

showlogs [ -F ] [ -f filename ] [ -d domain_indicator ] [ -E ]

[ -p e [ <event_class> | list | ereport | ena0x<ena_value>

| uuid<uuid_value> | <event_code> ] [ <number> ] ]

showlogs -h

usage:

showobpparams -d domain_indicator [-v]

showobpparams -h

Usage:

showplatform [-d domain_indicator] [-p report] [-v]

showplatform -h

usage:

showxirstate -d domain_indicator [-v]

showxirstate -f filename [-v]

showxirstate -h

Back up the SMS environment

Usage:

smsbackup directory_name

smsbackup -h

Configure the SMS environment

Usage:

smsconfig -m

smsconfig -m I1 [ <domain_id> | sc | netmask ]

smsconfig -m I2 [ sc0 | sc1 | netmask ]

smsconfig -m L

smsconfig -g

smsconfig -a -u <username> -G <platform_role> platform

smsconfig -r -u <username> -G <platform_role> platform

smsconfig -a -u <username> -G <domain_role> <domain_id>

smsconfig -r -u <username> -G <domain_role> <domain_id>

smsconfig -s <security_option> *DEPRECATED*

smsconfig -l <domain_id>

smsconfig -l platform

smsconfig -v

smsconfig -h

usage:

smsconnectsc [-y|-n]

smsconnectsc -h

Restore up the SMS environment

Usage:

smsrestore filename

smsrestore -h

Change the active SMS version

Usage:

smsversion new_version

smsversion -t

smsversion -h

================================================================================

F6800 Configuration Recommendations

--------------------------------------------------------------------------------

Purpose

The purpose of this document is to examine Sun Fire Midframe servers and present

recommendations for exploiting their capabilities and avoiding issues that cause down time.

Assumptions

We'll begin by discussing a number of alternatives while making recommendations.

The first of these is to evaluate alternatives and make decisions that improve the Reliability,

Availability, and Serviceability of your platforms.

The primary goal of system administration is to provide stable, available platforms and services

so users can accomplish work in a predictable manner. It is necessary to choose options that

improve availability, thereby avoiding or reducing periods of non-availability due to planned or

unplanned events.

The second alternative is to optimize performance. Only after achieving a high degree of RAS

(Reliability, Availability, Serviceability) should considerations turn to improving system and

application performance.

Dynamic Reconfiguration

A core function of the Solaris Operating Environment, Dynamic Reconfiguration (DR) enables

you to safely add and remove CPU/Memory boards and I/O assemblies while the system is

operating. DR controls the software aspects of dynamically changing the hardware used by a

domain with minimal disruption to user processes running in the domain.

You can use DR to do the following:

• Shorten the interruption to system applications while installing or removing a board.

• Disable a failing device by removing it from the logical configuration before its failure can cause an operating system outage.

• Display the operational status of boards within the system.

• Initiate testing of a board within a domain while the domain continues to run.

• Add or remove resources in a domain while the domain continues to run.

• Invoke hardware specific functions of a board or related attachment.

In mission or business critical environments where availability is one of the most important

criteria, the real value of DR is the ability to perform numerous maintenance activities without

the need for downtime.

Memory Utilization and Configuration

On the Sun Fire Midframe server, memory controllers are implemented on the UltraSPARC III

CPU and the memory is directly associated to the CPU.

When populating a Uniboard or server with memory, the following rules and recommendations

should be noted:

• A bank needs to be completely filled with the same size DIMM.

• Memory can only be placed into a bank associated with a CPU.

• Do not fill the second bank of a CPU until all of the other CPUs on that board have had their first banks filled.

• Within a domain, do not fill the second bank of a CPU until all of the other system boards have had all of their first banks filled.

• Wherever possible, use only the same size DIMM to fill all of the banks on a system board.

• Wherever possible, populate all system boards within a domain with the same amount of memory.

• Ensure that the interleave-scope is set to within-board and interleave-mode is set to optimal using the setupdomain command.

• When constructing domains using system boards of unequal memory size, always define the domain from smallest installed memory to largest memory.

For example, if SB0 contains 8G of memory and SB2 and SB4 each contain 4G, issue the

following command to define the domain (addboard -d A SB2 SB4 SB0). This will

ensure that SB2 is assigned as slice 0 and will contain the Solaris Operating

Environment when the domain is booted until the slice is reassigned to another board as

a result of a DR operation.
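The smallest-to-largest ordering rule can be applied mechanically. The sketch below reads "board size" pairs and prints the corresponding addboard command with the boards sorted by memory size, smallest first; ties are broken by board number, which is an assumption of this sketch rather than a rule from the text. The function name `addboard_by_memory` is made up, and the command is only printed, not run.

```shell
# Read "SBn size_in_GB" pairs on stdin and print an addboard command with
# the boards ordered smallest memory first (ties broken by board number).
# Usage: addboard_by_memory <domain>
addboard_by_memory() {
    domain=$1
    # Rewrite "SB0 8" as "8 0", sort numerically by size then board number,
    # then rebuild the board names in sorted order.
    boards=$(awk '{ sub(/^SB/, "", $1); print $2, $1 }' |
             sort -k1,1n -k2,2n |
             awk '{ printf " SB%s", $2 }')
    echo "addboard -d $domain$boards"
}

printf 'SB0 8\nSB2 4\nSB4 4\n' | addboard_by_memory A
# prints: addboard -d A SB2 SB4 SB0
```

This reproduces the example in the text: the two 4G boards come first, so SB2 becomes slice 0.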

Memory within a system board is either permanent or non-permanent. Each is handled in a

different manner when performing DR operations.

Permanent memory is memory which is non-pageable, such as the kernel or OpenBoot PROM.

All other memory is referred to as non-permanent memory.

During DR operations involving memory, all memory must be unconfigured from the affected

system board. In the case of non-permanent memory, the memory can simply be flushed back to

disk, moved to another memory location, or swapped out. Permanent memory, on the other

hand, must be moved to another board of the same total memory size or larger using a

copy-rename technique. The reason for the different mechanism is that the permanent memory cannot

be swapped out since it contains critical kernel structures that control the operation of the Solaris

Operating Environment.

Frame Configuration

The Sun Fire 6800 Midframe server is capable of being configured into multiple Segments

(partitions) and Domains. A domain is the logical subset of system resources running an

independent copy of the Solaris Operating Environment. Additionally, the power distribution

within the platform lends itself to another form of partitioning.

A segment or partition is established by the system controller for the purpose of logically

isolating connections between segments such that the failure of a domain in one segment should

not normally affect a domain running in the other segment.

While it is possible for each segment to contain two domains, the isolation of errors is the

primary reason that segmentation is recommended for multiple domain configurations. An

exception to this rule is the use of a temporary domain to facilitate DR operations.

Segmentation is not, however, capable of preventing all failures in one segment from affecting

domains in another segment. For example, the failure of a centerplane may not be contained to a

single segment.

When segmenting a 6800, it should also be noted that each segment is assigned a pair of

Fireplane Repeater Boards. A minimum of two Fireplane Repeater Boards are required for the

operation of a domain. Failure of a repeater board will cause failure of any domain within that

segment until it is replaced.

If DR operations are to be supported on a 6800 with one domain, our recommendation is the

system be kept in single-segment mode and that the Solaris Operating Environment be loaded on

domain A. This leaves domain B available for performing DR and POST testing as required for

the DR of I/O assemblies. If two domains are required, configure the box into a dual-segment

mode and load Solaris onto Domains A and C. This will leave domains B and D available for

use in DR operations.

Sun Fire 6800 Single-Segment Domains

RP0 RP1 RP2 RP3

Segment 0

Domain A

Domain B

Sun Fire 6800 Dual-Segment Domains

RP0 RP1 RP2 RP3

Segment 0 Segment 1

Domain A Domain C

Domain B Domain D

The Sun Fire 6800 Midframe server differs from all other Sun Fire models in that it has two

separate internal power grids, each supplied from a different RTU.

To ensure the loss of a power feed does not cause the loss of a domain, it is essential that

domains be constructed with all of the devices powered from the same grid (RTU). The boards

within the 6800 are separated as follows:

Power Grid 0    Power Grid 1
SB0             SB1
SB2             SB3
SB4             SB5
IB6             IB7
IB8             IB9
RP0             RP2
RP1             RP3
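The grid table can be turned into a quick sanity check for a planned domain: look up each board's grid and confirm they all match. This is a minimal sketch based only on the table above; the function names `power_grid` and `same_grid` are made up for illustration.

```shell
# Map a Sun Fire 6800 board name to its power grid (0 or 1), per the table.
power_grid() {
    case $1 in
        SB0|SB2|SB4|IB6|IB8|RP0|RP1) echo 0 ;;
        SB1|SB3|SB5|IB7|IB9|RP2|RP3) echo 1 ;;
        *) echo "unknown board: $1" >&2; return 1 ;;
    esac
}

# Succeed only if every board given is fed from the same grid.
same_grid() {
    first=$(power_grid "$1") || return 1
    for b in "$@"; do
        [ "$(power_grid "$b")" = "$first" ] || return 1
    done
}

same_grid SB0 SB2 IB6 && echo "SB0 SB2 IB6: one grid"
same_grid SB0 SB1 IB6 || echo "SB0 SB1 IB6: spans both grids"
```

A domain built from SB0, SB2 and IB6 survives the loss of the grid 1 feed, while one that mixes SB0 and SB1 depends on both feeds.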

================================================================================

Document Audience: SPECTRUM

Document ID: 51772

Title: Sun Fire[TM] 12K/15K/E20K/E25K: Remote Dynamic Reconfiguration (DR)

generates "DCA/DCS Communication Error" and showdevices is Unable to get

device information from domain.

Copyright Notice: Copyright © 2006 Sun Microsystems, Inc. All Rights

Reserved

Update Date: Tue Sep 19 00:00:00 MDT 2006

Products: Sun Fire 15K Server, Sun Fire 12K Server, Sun Fire E25K

Server, Sun Fire E20K Server

Technical Areas: System Management

Keyword(s):starcat, 12k, 15k, dca, dcs, dr, DCA/DCS communication error,

E20K, E25K, Unable to get device information from domain, rcfgadm,

showdevices, console

Problem Statement

The rcfgadm or showdevices commands generate errors from the system

controller (SC). The error message might be "DCA/DCS Communication Error"

when executing these commands. The command showdevices might generate the

following error (where x is the domain ID):

# showdevices -v -d x

Unable to get device information from domain x

This showdevices error could also be seen in Explorer data from the Main

System Controller (SC). The file, showdevices_-v_-d_x.out, which is in the

/explorer/sf15k/<Domain_ID> directory of Explorer will show the same "Unable

to get device information from domain x" error message.

Also the following messages might be logged in the platform log file on the

SC ( $SMSVAR/adm/platform/messages )

Jul 12 14:07:31 2005 xcat-sc0 showdevices[7496]: [0 2197706244444996 ERR

ri_init.cc 85] rcfgaRequestProxy->ri_init failed. Status= 4315

Jul 12 14:07:31 2005 xcat-sc0 showdevices[7496]: [4509 2197706254650079 ERR

RcfgaCallback.cc 521] server accept failed. RcfgaCallback::serverAccept:

failed in ioctl domain id = I

Resolution

To resolve the problem:

* ensure that the network interfaces are properly configured and running,

* verify that relevant parameters are not commented out of key files,

* verify that the appropriate daemons are running.

scman0 and dman0

The dca <> dcs handshaking takes place over the I1 network. This means that

scman0 on the SC and dman0 on the domain must be configured and running

properly. This is often overlooked, so be sure to verify this information

with the following command:

# ifconfig -a

On SC:

scman0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 3

inet 10.10.1.1 netmask ffffffe0 broadcast 10.10.1.31

On domain:

dman0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 3

inet 10.10.1.3 netmask ffffffe0 broadcast 10.10.1.31

ether 0:0:be:a8:17:57

Note that the IP addresses and netmasks on the dman0 and scman0 interfaces

should match the information stored in the /etc/opt/SUNWSMS/config/MAN.cf file

on the SC.

This should be further confirmed by running the following command on the

domain:

# ndd /dev/dman man_get_hostinfo

manc_magic = 0x4d414e43

manc_version = 01

manc_csum = 0x0

manc_ip_type = AF_INET

manc_dom_ipaddr = 10.10.1.3

manc_dom_ip_netmask = 255.255.255.224

manc_dom_ip_netnum = 10.10.1.0

manc_sc_ipaddr = 10.10.1.1

manc_dom_eaddr = 0:0:be:a8:17:57

manc_sc_eaddr = 8:0:20:fa:5f:1a

manc_iob_bitmap = 0xa0 io boards = 5.1, 7.1,

manc_golden_iob = 5
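Note that ifconfig prints the netmask in hexadecimal (ffffffe0) while ndd and MAN.cf show it in dotted decimal (255.255.255.224), so the two have to be converted before they can be compared. A minimal sketch of that conversion follows; the function name `hexmask_to_dotted` is made up for illustration.

```shell
# Convert an 8-digit hex netmask, as printed by ifconfig, to dotted decimal.
hexmask_to_dotted() {
    h=$1
    # Split the string into four 2-digit bytes and let printf decode the hex.
    printf '%d.%d.%d.%d\n' \
        "0x$(echo "$h" | cut -c1-2)" "0x$(echo "$h" | cut -c3-4)" \
        "0x$(echo "$h" | cut -c5-6)" "0x$(echo "$h" | cut -c7-8)"
}

hexmask_to_dotted ffffffe0
# prints: 255.255.255.224
```

The result matches manc_dom_ip_netmask in the ndd output above, which is what a healthy I1 configuration should show.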

Domain Configuration Agent (DCA)

The Domain Configuration Agent (DCA) daemon runs on the SC, one per domain.

Similar to a netcon session on a Sun Enterprise[TM] 10000 server, the DCA

provides communication between the SC and the Domain Configuration

Server (DCS) on the specified domain. If DCA is not running, the showdevices

and the rcfgadm commands fail.

To verify that DCA is running, issue the following command on the SC:

# ps -ef | grep dca

sms-dca 1614 361 0 Feb 26 ? 0:00 dca -d A

Domain Configuration Server (DCS)

DCS is a domain daemon process that supports remote dynamic reconfiguration.

DCS must also be running on the domain in order for the showdevices or

rcfgadm commands to work on the domain.

If either command fails, check the domain for the following lines in the

/etc/inetd.conf file:

sun-dr stream tcp wait root /usr/lib/dcs dcs

sun-dr stream tcp6 wait root /usr/lib/dcs dcs

These lines must be in the /etc/inetd.conf file for the rcfgadm and

showdevices commands to work properly. If the lines are not in the file, and

showdevices fails from the SC, add the indicated lines above and restart the

inetd process as follows:

# ps -ef | grep inetd

root 151 1 0 Mar 11 ? 0:00 /usr/sbin/inetd -s

# kill -HUP 151

For additional information, refer to the dcs(1M) man page.
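The inetd.conf check described above can be sketched as a small shell function. The name sun_dr_transports is a hypothetical helper, and the field positions assume the standard inetd.conf layout shown above. Commented-out entries are ignored because their first field is not exactly "sun-dr".

```shell
#!/bin/sh
# Hypothetical helper: list the transports (tcp, tcp6) for which a sun-dr
# entry exists in an inetd.conf-style file (default: /etc/inetd.conf).
sun_dr_transports() {
    awk '$1 == "sun-dr" && $2 == "stream" { print $3 }' "${1:-/etc/inetd.conf}" | sort
}
```

If the function prints both tcp and tcp6, both entries are present. If either is missing, add the corresponding line and send inetd a HUP signal as shown above.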

Note for domains running Solaris[TM] 10 (without patch 120253-02 ):

The /etc/inetd.conf file is no longer used directly to configure inetd; inetd
is now configured through the Service Management Facility (SMF). You can list
the SMF services managed by inetd with the inetadm command:

# inetadm

ENABLED STATE FMRI

enabled online svc:/application/font/stfsloader:default

[output omitted]

disabled disabled svc:/network/talk:default

enabled online svc:/platform/sun4u/dcs:default

[output omitted]

The /platform/sun4u/dcs service must be enabled/online.

You can now get more information from the svc:/platform/sun4u/dcs service and

list its properties via the svccfg command :

# /usr/sbin/svccfg -s svc:/platform/sun4u/dcs:default listprop

general framework

general/enabled boolean true

restarter framework NONPERSISTENT

restarter/auxiliary_state astring none

restarter/next_state astring none

restarter/state astring online

restarter/state_timestamp time 1117463395.870876000

restarter/contract count 94

inetd_state framework NONPERSISTENT

inetd_state/cur_state integer 1

inetd_state/next_state integer 13

inetd_state/start_pids integer

svc:/platform/sun4u/dcs:default> quit

If any dcs processes are running, pids will be reported in

inetd_state/start_pids.

Note that, on domains running Solaris 10 w/o 120253-02, the dcs process will

not be running if the SC has not recently communicated with the domain. It's

forked by inetd upon request (Remote DR request started from the SC). Hence,

the PPID for dcs is the inetd PID.

Ex :

# ptree 304

159 /usr/sbin/inetd -s

304 dcs

Note for domains running Solaris[TM] 10 Update 2 (with patch 120253-02 ):

Due to the fixes for :

4792021 per-socket level IPsec policy for dynamic reconfiguration

6380945 Changes required for PSARC 2006/038

introduced in patch 120253-02, dcs is no longer managed by inetd.
Because inetd does not support per-socket IPsec, dcs now runs standalone.
Both dcs and cvcd are controlled by SMF and use SMF properties to define
their command-line options.

Hence, running inetadm | grep dcs no longer returns information about dcs.

Use the following command to get the status from the dcs service :

# svcs dcs

STATE STIME FMRI

online 13:53:40 svc:/platform/sun4u/dcs:default

Note that, on domains running Solaris 10 U2 or with patch 120253-02, the dcs
process starts at boot time. With the new implementation, dcs accepts the
command-line arguments "-a", "-e", and "-u", which let the administrator
configure the IPsec authentication and encryption options. The corresponding
SMF properties are:

- "ah_auth" corresponds to the "-a" option.
- "esp_encr" corresponds to the "-e" option.
- "esp_auth" corresponds to the "-u" option.

See the dcs(1M) man page for more details.

Ex :

# ptree 220

220 /usr/lib/dcs -a md5

Note that the dcs process might not be running if the SC has not recently

communicated with the domain.

To check whether any process is actually listening on the sun-dr port (port
665), run:

e25ka-dom-c# netstat -an | grep 665

*.665 *.* 0 0 49152 0 LISTEN

*.665 *.* 0 0 49152 0 LISTEN

This verifies that there is indeed some process listening on the sun-dr port,

665. If there is nothing listening on port 665, then the showdevices and

addboard / deleteboard commands on the SC can never work properly.
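A plain "grep 665" also matches ports such as 1665 or 6650. The sketch below tightens the match. The function name has_listener is hypothetical, and the pattern assumes the Solaris netstat layout shown above, where the local address is printed as address.port.

```shell
#!/bin/sh
# Hypothetical helper: succeed if `netstat -an` output on stdin contains a
# LISTEN entry whose local port is exactly the given port (default 665).
has_listener() {
    port="${1:-665}"
    grep "\.${port}[[:space:]].*LISTEN" >/dev/null
}
```

For example, `netstat -an | has_listener 665 && echo "sun-dr port is open"` prints the message only when a listener exists.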

The /etc/services File

The /etc/services file must also have the following entry on the domain for

remote Dynamic Reconfiguration (DR):

sun-dr 665/tcp # Remote Dynamic Reconfiguration

If you are using NIS+, make sure that the above entry is also present in the
/etc/services file of the NIS+ server. You can check this using the following
command:

$ niscat services.org_dir | grep sun-dr

sun-dr sun-dr tcp 665 Remote Dynamic Reconfiguration

The /etc/inet/ipsecinit.conf File on the Domain

The /etc/inet/ipsecinit.conf file should contain the following entries:

{ dport sun-dr ulp tcp } permit { auth_algs md5 }

{ sport sun-dr ulp tcp } apply { auth_algs md5 sa unique }

{ dport cvc_hostd ulp tcp } permit { auth_algs md5 }

{ sport cvc_hostd ulp tcp } apply { auth_algs md5 sa unique }

If the entries do not exist, add them and then issue:

# ipsecconf -a /etc/inet/ipsecinit.conf

Use the following command to check that the system is now running with these

settings:

# ipsecconf
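The presence of the four policy entries can be checked with a short script. This is a sketch under stated assumptions: missing_ipsec_entries is a hypothetical helper name, a POSIX-conformant grep (with -F and -x) is assumed, and the comparison is exact, so an entry with different spacing is reported as missing.

```shell
#!/bin/sh
# Hypothetical helper: print each expected IPsec policy entry that is not
# found verbatim in the given file (default: /etc/inet/ipsecinit.conf).
missing_ipsec_entries() {
    conf="${1:-/etc/inet/ipsecinit.conf}"
    while IFS= read -r entry; do
        # -F: fixed string, -x: whole-line match
        grep -F -x -- "$entry" "$conf" >/dev/null 2>&1 || printf '%s\n' "$entry"
    done <<'EOF'
{ dport sun-dr ulp tcp } permit { auth_algs md5 }
{ sport sun-dr ulp tcp } apply { auth_algs md5 sa unique }
{ dport cvc_hostd ulp tcp } permit { auth_algs md5 }
{ sport cvc_hostd ulp tcp } apply { auth_algs md5 sa unique }
EOF
}
```

No output means all four entries are present. Any printed line is a candidate to add before running ipsecconf -a as described above.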

Domain X Server (DXS)

The console command uses DXS. It is similar to the netcon_server on the Sun

Enterprise[TM] 10000 server. DXS runs on the SC, one per domain.

To verify that DXS is running, issue the following command on the SC:

# ps -ef | grep dxs

sms-dxs 1609 361 0 Feb 26 ? 0:57 dxs -d A

sms-dxs 1609 361 0 Feb 26 ? 0:57 dxs -d B

sms-dxs 1609 361 0 Feb 26 ? 0:57 dxs -d C

Console commands take place over the console bus but can be toggled between

the console bus and I1 network using the ~= command.

When the domain is rebooting, a message similar to "dxs disconnecting"
appears on the SC. The reboot of a domain causes an hpost -Q, which is a
quick POST from the SC.

Sun Fire[TM] 12K/15K/E20K/E25K key management daemon (sckmd)

The sckmd server process resides on a Sun Fire[TM] 12K/15K/E20K/E25K domain.

The sckmd daemon maintains the Internet Protocol Security (IPsec) Security

Associations (SAs) needed to secure the communication between the SC and the

cvcd and dcs daemons running on the domains.

The sckmd daemon must be running on the domain in order for the "showdevices"

or "rcfgadm" commands to work on the domain.

To verify that the sckmd daemon is running, issue the following command on

the domain:

# ps -ef | grep sckmd

root 24156 1 0 Apr 02 ? 0:00 /usr/platform/SUNW,Sun-Fire-15000/lib/sckmd

Failure after a Solaris[TM] 10+ OS initial installation

Upon the initial installation of a Solaris 10+ domain, showdevices/rcfgadm

will not work successfully. Running the commands will generate domain-side

console messages such as:

Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=ADD, errno=22: Invalid
argument, diagnostic code=40: Unsupported authentication algorithm
Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=DELETE, errno=3: No such
process, diagnostic code=0: No diagnostic
Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=ADD, errno=22: Invalid
argument, diagnostic code=40: Unsupported authentication algorithm
Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=DELETE, errno=3: No such
process, diagnostic code=0: No diagnostic
Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=ADD, errno=22: Invalid
argument, diagnostic code=40: Unsupported authentication algorithm
Apr 27 13:53:25 xc18-a sckmd: PF_KEY error: type=DELETE, errno=3: No such
process, diagnostic code=0: No diagnostic

To fix this, on the domain, issue the command:

# ipsecalgs -s

For a more detailed explanation on this issue, please see CR #6233334

Failure after a Solaris[TM] 10 Update 2 Installation or after installing

120253-02 on Solaris[TM] 10.

After this installation or patch update, dcs may fail to come online at boot
time and stay in maintenance mode:

Jul 27 13:50:30 inetd[284]: Unspecified inetd_start method for instance

svc:/platform/sun4u/dcs:default

Jul 27 13:50:30 inetd[284]: Invalid configuration for instance

svc:/platform/sun4u/dcs:default, placing in maintenance

Jul 27 13:50:30 inetd[284]: Invalid configuration for instance

svc:/platform/sun4u/dcs:default, placing in maintenance

# svcs dcs

STATE STIME FMRI

maintenance 13:52:23 svc:/platform/sun4u/dcs:default

To find out why dcs did not come online, run svcs -xv and check the log file
it reports, /etc/svc/volatile/platform-sun4u-dcs:default.log:

# svcs -xv

svc:/platform/sun4u/dcs:default (domain configuration server)

State: maintenance since Thu 20 Jul 2006 13:50:30 AM MEST

Reason: Start method failed repeatedly, last exited with status 1.

See: http://sun.com/msg/SMF-8000-KS

See: man -M /usr/share/man -s 1M dcs

See: /etc/svc/volatile/platform-sun4u-dcs:default.log

Impact: This service is not running.

To fix this, on the domain, restart the service:

# svcadm disable dcs

# svcadm enable dcs

# svcs dcs

STATE STIME FMRI

online 13:53:40 svc:/platform/sun4u/dcs:default

While dcs is unavailable, rcfgadm and showdevices do not work.

For a workaround and a more detailed explanation of this issue, please see CR
#6453706.
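A monitoring or recovery script can key off the STATE column of the svcs output shown in this section. The following is a minimal sketch. The helper name svc_state is hypothetical, and it assumes the three-column STATE/STIME/FMRI layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `svcs <service>` output on stdin and print the
# state (online, maintenance, disabled, ...) of the first listed instance.
svc_state() {
    awk '$1 != "STATE" && NF >= 3 { print $1; exit }'
}
```

For example, `[ "$(svcs dcs | svc_state)" = "maintenance" ]` could guard the disable/enable sequence described above.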

Failure after an upgrade to Solaris[TM] 10 Update 2.

After an upgrade to Solaris[TM] 10 Update 2, the dcs service may fail to come
online and stay in maintenance mode, with no dcs process running:

Sep 19 10:57:55 inetd[250]: Property 'name' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 10:57:55 inetd[250]: Property 'endpoint_type' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 10:57:55 inetd[250]: Property 'isrpc' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 10:57:55 inetd[250]: Property 'wait' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 10:57:55 inetd[250]: Unspecified inetd_start method for instance

svc:/platform/sun4u/dcs:default

Sep 19 10:57:55 inetd[250]: Invalid configuration for instance

svc:/platform/sun4u/dcs:default, placing in maintenance

# svcs -xv

svc:/platform/sun4u/dcs:default (domain configuration server)

State: maintenance since Tue Sep 19 10:57:55 2006

Reason: Restarter svc:/network/inetd:default gave no explanation.

See: http://sun.com/msg/SMF-8000-9C

See: man -M /usr/share/man -s 1M dcs

Impact: This service is not running.

While dcs is unavailable, rcfgadm and showdevices do not work.

The new manifest /var/svc/manifest/platform/sun4u/dcs.xml provided by
120253-02 (bundled in S10U2) has not been applied properly, so inetd is still
trying to start dcs. The general/restarter property for the dcs service should
no longer point to inetd; the service should now be managed by startd.

# svcprop dcs

general/enabled boolean true

general/entity_stability astring Unstable

general/restarter fmri svc:/network/inetd:default

dcs/ah_auth astring md5

[...]

See CR# 6472374 for more details.
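Whether a domain is affected can be read from the general/restarter property in the svcprop output above. The sketch below uses a hypothetical helper name, dcs_restarter, and assumes the three-column "property type value" layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `svcprop dcs` output on stdin and print the
# restarter FMRI, or "default" when no general/restarter property is set.
dcs_restarter() {
    awk '$1 == "general/restarter" { r = $3 }
         END { print (r == "" ? "default" : r) }'
}
```

If `svcprop dcs | dcs_restarter` prints svc:/network/inetd:default on a system with patch 120253-02, the old service definition is still in place and the manifest has not been applied.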

To fix this problem, the new manifest must be imported using the following

procedure :

# svcs dcs

STATE STIME FMRI

maintenance 10:57:55 svc:/platform/sun4u/dcs:default

# svcadm disable dcs

Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'name' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'endpoint_type' of

instance svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'isrpc' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Property 'wait' of instance

svc:/platform/sun4u/dcs:default is missing, inconsistent or invalid

Sep 19 11:02:13 v4u-15ka-c-epar02 inetd[250]: Unspecified inetd_start method

for instance svc:/platform/sun4u/dcs:default

# svcs dcs

STATE STIME FMRI

disabled 11:02:13 svc:/platform/sun4u/dcs:default

# svccfg -v delete dcs

# svcs dcs

svcs: Pattern 'dcs' doesn't match any instances

STATE STIME FMRI

# svccfg -v import /var/svc/manifest/platform/sun4u/dcs.xml

svccfg: Taking "initial" snapshot for svc:/platform/sun4u/dcs:default.

svccfg: Taking "last-import" snapshot for svc:/platform/sun4u/dcs:default.

svccfg: Refreshed svc:/platform/sun4u/dcs:default.

svccfg: Successful import.

# svcs dcs

STATE STIME FMRI

disabled 11:03:04 svc:/platform/sun4u/dcs:default

# svcadm enable dcs

# svcs dcs

STATE STIME FMRI

online 11:03:20 svc:/platform/sun4u/dcs:default

# svcs -p dcs

STATE STIME FMRI

online 11:03:20 svc:/platform/sun4u/dcs:default

11:03:20 717 dcs

# svcprop dcs

general/enabled boolean false

general/entity_stability astring Unstable

dcs/ah_auth astring md5

[...]

Note that when no general/restarter property is set, the default restarter
(svc.startd) is used.

--------------------------------------------------------------------------------

Document Audience: SPECTRUM

Document ID: 76047

Title: Sun Fire[TM] 12K/15K/E20K/E25K Servers: How to Replace a System Board

Using Dynamic Reconfiguration

Update Date: Fri May 13 00:00:00 MDT 2005

Products: Sun Fire 15K Server, Sun Fire 12K Server, Sun Fire E25K

Server, Sun Fire E20K Server

Technical Areas: System Controller, System Board, Firmware, Hardware

Troubleshooting

Keyword(s): DR, 12K, 15K, flashupdate, addboard, deleteboard, cfgadm, rcfgadm,
Sun Fire, 20K, 25K

Description

This document provides methods for replacing system boards on a running

domain:

- Method 1 uses the deleteboard and addboard commands on the System Controller.

- Method 2 uses the cfgadm command within the domain.

- Method 3 uses the rcfgadm (remote cfgadm) command on the System Controller.

Note: On a security-hardened system, Method 2 must be used.

Document Body

Method 1

This method uses the System Controller to replace the system board.

For this method, the System Controller issues all the commands.

1) Logically remove the board (unconfigures, disconnects, and powers off the

board).

/opt/SUNWSMS/bin/deleteboard -c unassign SB#

2) Physically remove the board.

It is safe to remove the board when the amber LED is on.

3) Physically install a new board.

4) Power on the new board:

/opt/SUNWSMS/bin/poweron SB#

5) Match firmware to existing boards:

/opt/SUNWSMS/bin/flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash -v -y SB#

6) Power off the new board:

/opt/SUNWSMS/bin/poweroff SB#

7) Logically bring the board back into the system:

/opt/SUNWSMS/bin/addboard -d <domain> -c configure SB#

The board is now back in the system, and running the Solaris Operating

System.

From the domain, verify with prtdiag.
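Because the board name appears in every step, it can help to generate the Method 1 command sequence for review before typing it on the SC. The sketch below only prints the commands and runs nothing. The function name method1_plan and the example arguments are hypothetical; the commands themselves are those listed in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the Method 1 command sequence.
# Usage: method1_plan <board> <domain>, e.g. method1_plan SB3 a
method1_plan() {
    board="$1"
    domain="$2"
    bin=/opt/SUNWSMS/bin
    cat <<EOF
$bin/deleteboard -c unassign $board
# physically replace the board once the amber LED is on
$bin/poweron $board
$bin/flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash -v -y $board
$bin/poweroff $board
$bin/addboard -d $domain -c configure $board
EOF
}
```

The printed list can be checked against the board and domain actually being serviced, then run one line at a time.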

__________________________________________

Method 2

Method 2 replaces the system board from within the domain. For this method,
the domain issues most of the commands. However, the System Controller issues
three of the commands (steps 4 through 6).

1) Logically remove the board (unconfigures, disconnects, and powers off

the board):

/usr/sbin/cfgadm -v -c disconnect SB#

2) Physically remove the board.

It is safe to remove the board when the amber LED is on.

3) Physically install the new board.

4) Issue the command on the System Controller to power on the new

board:

/opt/SUNWSMS/bin/poweron SB#

5) Issue the command on the System Controller to match the firmware to

existing boards:

/opt/SUNWSMS/bin/flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash -v -y SB#

6) Issue the command on the System Controller to power off the new board:

/opt/SUNWSMS/bin/poweroff SB#

7) Logically bring the board back into the system:

/usr/sbin/cfgadm -v -c configure SB#

The board is now back in the system, and running the Solaris Operating

System.

From the domain, verify with prtdiag.

__________________________________________

Method 3

Method 3 uses the rcfgadm (remote cfgadm) command on the System Controller to
replace the system board. This method is similar to Method 2, except that the
domain must be specified with the -d option.

1) Logically remove the board (unconfigures, disconnects, and powers off

the board):

/opt/SUNWSMS/bin/rcfgadm -d <domain> -v -c disconnect SB#

2) Physically remove the board.

It is safe to remove the board when the amber light is on.

3) Physically install the new board.

4) Power on the new board:

/opt/SUNWSMS/bin/poweron SB#

5) Match firmware to existing boards:

/opt/SUNWSMS/bin/flashupdate -f /opt/SUNWSMS/hostobjs/sgcpu.flash -v -y SB#

6) Power off the new board:

/opt/SUNWSMS/bin/poweroff SB#

7) Logically bring the board back into the system:

/opt/SUNWSMS/bin/rcfgadm -d <domain> -v -c configure SB#

The board is now back in the system, and running the Solaris Operating System.
From the domain, verify with prtdiag.

NOTE: For all three methods above, be aware of the kernel cage changes
affecting DR that were introduced with Solaris 9 KU patch 118558-05.

Reference: Kernel Cage Splitting Overview Infodoc 80991

--------------------------------------------------------------------------------
