Tuesday, October 28, 2014

Replacing a hot-pluggable disk in an IBM Virtual I/O Server

An IBM p720 secondary machine encountered a disk failure. The failed disk hosts half of the Virtual I/O Server's rootvg mirror. Since these are mirrored, hot-pluggable SAS disks, replacement is easy.

NOTE: I've switched to root for this (oem_setup_env from the padmin shell).



# unmirrorvg  rootvg hdisk0
0516-1246 rmlvcopy: If hd5 is the boot logical volume, please run 'chpv -c <diskname>'
        as root user to clear the boot record and avoid a potential boot
        off an old boot image that may reside on the disk from which this
        logical volume is moved/removed.
0516-622 rmlvcopy: Warning, cannot write lv control block data.
0516-1798 lchangevg: Cannot change quorum without losing quorum.
0516-732 chvg: Unable to change volume group rootvg.
0516-1144 unmirrorvg: rootvg successfully unmirrored, user should perform
        bosboot of system to reinitialize boot records.  Then, user must modify
        bootlist to just include:  hdisk3.

# lsvg -p rootvg
rootvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk0            missing           558         556         112..110..111..111..112
hdisk3            active            558         405         107..12..63..111..112

As you can see, hdisk0 is in the missing state.
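
If you want to pick out the bad disk without reading the table by eye, a small filter over the `lsvg -p` listing works. This is only a sketch: the function name `non_active_pvs` is made up for illustration, and it assumes the column layout shown above (PV name in the first column, state in the second).

```shell
# List physical volumes whose state is anything other than "active".
# Reads `lsvg -p <vg>` output on stdin.
# Assumes the layout shown above: a "<vg>:" line, a header line starting
# with PV_NAME, then one line per physical volume.
non_active_pvs() {
    awk '$1 != "PV_NAME" && $1 !~ /:$/ && NF >= 2 && $2 != "active" { print $1 }'
}

# On the VIOS, as root:
#   lsvg -p rootvg | non_active_pvs
```

Against the listing above this would print only hdisk0.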

# reducevg rootvg hdisk0
0516-016 ldeletepv: Cannot delete physical volume with allocated
        partitions. Use either migratepv to move the partitions or
        reducevg with the -d option to delete the partitions.
0516-884 reducevg: Unable to remove physical volume hdisk0.

Why did it fail? Because of the dump device: it still has partitions allocated on hdisk0 (note the 2 used PPs in the lsvg -p output above).

# lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd6                 paging     1       1       1    open/syncd    N/A
paging00            paging     2       2       1    open/syncd    N/A
hd8                 jfs2log    1       1       1    open/syncd    N/A
hd4                 jfs2       1       1       1    open/syncd    /
hd2                 jfs2       8       8       1    open/syncd    /usr
hd9var              jfs2       2       2       1    open/syncd    /var
hd3                 jfs2       10      10      1    open/syncd    /tmp
hd1                 jfs2       20      20      1    open/syncd    /home
hd10opt             jfs2       3       3       1    open/syncd    /opt
hd11admin           jfs2       1       1       1    open/syncd    /admin
livedump            jfs2       1       1       1    open/syncd    /var/adm/ras/livedump
lg_dumplv           sysdump    2       2       1    closed/syncd  N/A
LPAR01sys00         jfs        80      80      1    open/syncd    N/A
VMLibrary           jfs2       18      18      1    open/syncd    /var/vio/VMLibrary
lvsysdump           sysdump    4       4       1    open/syncd    N/A
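
To spot the dump logical volumes in a long listing like this, you can filter on the TYPE column. A minimal sketch, assuming the `lsvg -l` layout shown above; the function name `sysdump_lvs` is hypothetical. (`lspv -l <disk>` is the usual way to see which logical volumes sit on one particular disk.)

```shell
# Print the names of logical volumes whose TYPE column is "sysdump".
# Reads `lsvg -l <vg>` output on stdin; the "<vg>:" line and the header
# never have "sysdump" in the second field, so they are skipped naturally.
sysdump_lvs() {
    awk '$2 == "sysdump" { print $1 }'
}

# On the VIOS, as root:
#   lsvg -l rootvg | sysdump_lvs
```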

# bosboot -ad hdisk3
bosboot: Boot image is 46965 512 byte blocks.

At this point, only the system dump LV (lg_dumplv) remains on the failed disk.
So before you delete it, create a new one.

LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       1       1    closed/syncd  N/A
hd6                 paging     1       1       1    open/syncd    N/A
paging00            paging     2       2       1    open/syncd    N/A
hd8                 jfs2log    1       1       1    open/syncd    N/A
hd4                 jfs2       1       1       1    open/syncd    /
hd2                 jfs2       8       8       1    open/syncd    /usr
hd9var              jfs2       2       2       1    open/syncd    /var
hd3                 jfs2       10      10      1    open/syncd    /tmp
hd1                 jfs2       20      20      1    open/syncd    /home
hd10opt             jfs2       3       3       1    open/syncd    /opt
hd11admin           jfs2       1       1       1    open/syncd    /admin
livedump            jfs2       1       1       1    open/syncd    /var/adm/ras/livedump
lg_dumplv           sysdump    2       2       1    closed/syncd  N/A
LPAR01sys00         jfs        80      80      1    open/syncd    N/A
VMLibrary           jfs2       18      18      1    open/syncd    /var/vio/VMLibrary

(I didn't capture that step, but you can use smitty for it.)
So the new dump device has been created:

# sysdumpdev -l
primary              /dev/lvsysdump
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    FALSE
dump compression     ON
type of dump         traditional
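
Since the smitty step wasn't captured, here is roughly what the command-line equivalent could look like. Treat it as a sketch and not as what was actually run: the LV name lvsysdump and the size of 4 LPs are taken from the lsvg listing above, hdisk3 is assumed as the target because it is the surviving disk, and the `run` and `make_dump_lv` helpers are made up for this example. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Print each command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Create a sysdump-type LV on the given disk and make it the primary
# dump device permanently (sysdumpdev -P).
make_dump_lv() {
    lv_name=$1     # new dump LV name, e.g. lvsysdump
    num_lps=$2     # size in logical partitions
    target_pv=$3   # disk to place it on (the healthy one)
    run mklv -y "$lv_name" -t sysdump rootvg "$num_lps" "$target_pv"
    run sysdumpdev -P -p "/dev/$lv_name"
}

make_dump_lv lvsysdump 4 hdisk3
```

Review the printed commands first, then rerun with DRY_RUN=0 and confirm the result with sysdumpdev -l.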

# rmlv lg_dumplv
Warning, all data contained on logical volume lg_dumplv will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
rmlv: Logical volume lg_dumplv is removed.

# reducevg rootvg hdisk0


# rmdev -dRl hdisk0
hdisk0 deleted

# diag
REMOVE OR REPLACE DEVICE ATTACHED TO A SCSI HOT SWAP ENCLOSURE DEVICE                                                                                   802485

The following is a list of configured, unconfigured and populated
SCSI Hot Swap Enclosure device slots. Select a slot to remove or
replace the device attached to that slot.
ENSURE THAT NO OTHER HOST IS USING THE DEVICE BEFORE REMOVING IT.

Make selection, use Enter to continue.

                U78AA.001.WZSGJ80-
  ses0            P2-Y1
     slot  2                           [populated]
     slot  3      P2-D5                hdisk1
     slot  4      P2-D6                hdisk2
     slot  5      P2-D9                cd0

                U78AA.001.WZSGJ80-
  ses1            P2-Y1
     slot  1      P2-D1                hdisk3
     slot  2      P2-D2                hdisk4
     slot  3      P2-D3                hdisk5

                                                   +------------------------------------------------------+
                                                   |                                                      |
                                                   |                                                      |
                                                   | The LED should be in the Remove state for the        |
                                                   | selected device.                                     |
                                                   |                                                      |
                                                   | You may now remove or replace the device.            |
                                                   | Use 'Enter' to indicate you are finished.            |
                                                   |                                                      |
                                                   |                                                      |
                                                   |                                                      |
                                                   |                                                      |
                                                   |                                                      |
                                                   |                                                      |
                                                   | F3=Cancel        F10=Exit         Enter              |
F1=Help                                F10=Exit    +------------------------------------------------------+




Then pull out the hot-swappable drive and insert the new one.
Run cfgmgr (cfgdev if you are logged in as padmin).
The new disk took the same device name as the old one, hdisk0, so add it back to rootvg:

# extendvg rootvg hdisk0




Then mirror rootvg:

# mirrorvg rootvg hdisk0



Run bosboot:

# bosboot -ad hdisk0




And update the bootlist:

# bootlist -m normal hdisk0 hdisk3
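
Before calling it done, it is worth confirming that the mirror has finished syncing. A minimal sketch, assuming the `lsvg -l` layout shown earlier (LV STATE in the sixth column); the function name `no_stale_lvs` is hypothetical.

```shell
# Exit 0 if no logical volume in `lsvg -l <vg>` output has a stale state,
# exit 1 otherwise. Reads the listing on stdin.
# Assumes LV STATE is the sixth whitespace-separated field, as in the
# listings above (e.g. "open/syncd" vs "open/stale").
no_stale_lvs() {
    awk '$6 ~ /stale/ { bad = 1 } END { exit bad ? 1 : 0 }'
}

# On the VIOS, as root:
#   lsvg -l rootvg | no_stale_lvs && echo "rootvg fully synced"
#   bootlist -m normal -o    # display the current boot list
```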