Saturday, July 26, 2014

Recovering from the loss of a majority of voting disks (CRS-1705)

Error messages from /u01/app/11.2.0/grid/log/rac1/alertrac1.log

2014-07-27 00:39:04.870:
[cssd(2446)]CRS-1637:Unable to locate configured voting file with ID a24090f0-28f84f0b-bff82d38-9fbfc140; details at (:CSSNM00020:) in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-07-27 00:39:04.870:
[cssd(2446)]CRS-1637:Unable to locate configured voting file with ID a1ef7dd3-57064f85-bf6a11cb-f508ebb6; details at (:CSSNM00020:) in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-07-27 00:39:04.870:
[cssd(2446)]CRS-1705:Found 1 configured voting files but 2 voting files are required, terminating to ensure data integrity; details at (:CSSNM00021:) in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-07-27 00:39:04.870:
[cssd(2446)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log
2014-07-27 00:39:04.910:
[cssd(2446)]CRS-1603:CSSD on node rac1 shutdown by user.

 

Error messages from /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log


2014-07-27 00:39:12.734: [    CSSD][3380029184]clssnmvDiskVerify: discovered a potential voting file
2014-07-27 00:39:12.734: [   SKGFD][3380029184]Handle 0x7f5ab40967a0 from lib :UFS:: for disk :/dev/sde1:

2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmvDiskVerify: Successful discovery for disk /dev/sde1, UID 24fd0190-4c504f7c-bf4d8478-a37d15b9, Pending CIN 0:1406391959:0, Committed CIN 0:1406391959:0
2014-07-27 00:39:12.738: [   SKGFD][3380029184]Lib :UFS:: closing handle 0x7f5ab40967a0 for disk :/dev/sde1:

2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmvDiskVerify: Successful discovery of 1 disks
2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmCompleteVFDiscovery: Completing voting file discovery
2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmvDiskStateChange: state from discovered to pending disk /dev/sde1
2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmvDiskStateChange: state from pending to configured disk /dev/sde1
2014-07-27 00:39:12.738: [    CSSD][3380029184]clssnmvVerifyCommittedConfigVFs: Insufficient voting files found, found 1 of 3 configured, needed 2 voting files
2014-07-27 00:39:12.738: [    CSSD][3380029184](:CSSNM00020:)clssnmvVerifyCommittedConfigVFs: voting file 0, id a24090f0-28f84f0b-bff82d38-9fbfc140 not found
2014-07-27 00:39:12.738: [    CSSD][3380029184](:CSSNM00020:)clssnmvVerifyCommittedConfigVFs: voting file 2, id a1ef7dd3-57064f85-bf6a11cb-f508ebb6 not found
2014-07-27 00:39:12.738: [    CSSD][3380029184]ASSERT clssnm1.c 3336
2014-07-27 00:39:12.738: [    CSSD][3380029184](:CSSNM00021:)clssnmCompleteVFDiscovery: Found 1 voting files, but 2 are required.  Terminating due to insufficient configured voting files
2014-07-27 00:39:12.738: [    CSSD][3380029184]###################################
2014-07-27 00:39:12.738: [    CSSD][3380029184]clssscExit: CSSD aborting from thread clssnmvDDiscThread
2014-07-27 00:39:12.738: [    CSSD][3380029184]###################################
2014-07-27 00:39:12.738: [    CSSD][3380029184](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
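
The IDs of the missing voting files can be pulled out of ocssd.log instead of read by eye. This is a minimal sketch that assumes the "voting file N, id ... not found" wording shown in the excerpt above; the function name missing_voting_ids is made up for this example.

```shell
# Print the unique IDs of voting files that CSSD reported as "not found".
# Reads ocssd.log text on stdin and relies on the message wording in the
# excerpt above; other Grid Infrastructure versions may word it differently.
missing_voting_ids() {
    sed -n 's/.*clssnmvVerifyCommittedConfigVFs: voting file [0-9]*, id \([0-9a-f-]*\) not found.*/\1/p' \
        | sort -u
}

# Usage (log path as in this post):
#   missing_voting_ids < /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log
```

Sorting with -u collapses repeats, since CSSD logs the same IDs on every restart attempt.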

 

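How to fix it: on one node, force-stop the clusterware stack, start it in exclusive mode, replace the voting files in a usable disk group (+OCRVOTE here), then restart the stack normally.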


[root@rac1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.


[root@rac1 ~]# crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded

[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. OFFLINE  a24090f028f84f0bbff82d389fbfc140 () []
2. ONLINE   24fd01904c504f7cbf4d8478a37d15b9 (/dev/sde1) [OCR_VOTE]
3. OFFLINE  a1ef7dd357064f85bf6a11cbf508ebb6 () []
Located 3 voting disk(s).
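
The output above explains the CRS-1705 message: only 1 of 3 configured voting files is ONLINE, and the log says 2 are needed. A rough sketch of that check follows, assuming CSSD requires a strict majority (more than half) of the configured files and that the output keeps the column layout shown above; the function name votedisk_quorum is made up for this example.

```shell
# Summarise "crsctl query css votedisk" output read from stdin.
# Assumes the layout shown above: "N. STATE <id> (<file>) [<group>]".
# Exit status is 0 when a strict majority of voting files is ONLINE.
votedisk_quorum() {
    awk '
        $1 ~ /^[0-9]+\.$/ {            # rows such as "1. OFFLINE ..."
            total++
            if ($2 == "ONLINE") online++
        }
        END {
            needed = int(total / 2) + 1   # strict majority (assumption)
            printf "online=%d total=%d needed=%d\n", online, total, needed
            exit (online >= needed ? 0 : 1)
        }'
}

# Usage:
#   crsctl query css votedisk | votedisk_quorum
```

For the listing above this reports online=1 total=3 needed=2 and returns a non-zero status.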

[root@rac1 ~]#  kfod op=groups
--------------------------------------------------------------------------------
Group          Size          Free Redundancy Name
================================================================================
   1:       1023 Mb        659 Mb     EXTERN OCRVOTE
   2:      30719 Mb      28516 Mb     EXTERN DATA
  

[root@rac1 ~]# crsctl replace votedisk +OCRVOTE
Successful addition of voting disk 6d1dc365ec824f94bfa7fc18f8e3f6b0.
Successful deletion of voting disk a24090f028f84f0bbff82d389fbfc140.
Successful deletion of voting disk 24fd01904c504f7cbf4d8478a37d15b9.
Successful deletion of voting disk a1ef7dd357064f85bf6a11cbf508ebb6.
Successfully replaced voting disk group with +OCRVOTE.
CRS-4266: Voting file(s) successfully replaced
 

[root@rac1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.


[root@rac1 ~]# crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

[root@rac1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac1
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1
ora.OCRVOTE.dg
               ONLINE  ONLINE       rac1
ora.OCR_VOTE.dg
               ONLINE  OFFLINE      rac1
ora.asm
               ONLINE  ONLINE       rac1                     Started
ora.gsd
               OFFLINE OFFLINE      rac1
ora.net1.network
               ONLINE  ONLINE       rac1
ora.ons
               ONLINE  ONLINE       rac1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1
ora.cvu
      1        ONLINE  ONLINE       rac1
ora.oc4j
      1        ONLINE  ONLINE       rac1
ora.orcl.db
      1        ONLINE  ONLINE       rac1                     Open
      2        ONLINE  OFFLINE
ora.rac1.vip
      1        ONLINE  ONLINE       rac1
ora.rac2.vip
      1        ONLINE  INTERMEDIATE rac1                     FAILED OVER
ora.scan1.vip
      1        ONLINE  ONLINE       rac1
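
For reference, the command sequence used in this post can be collected into one script. This is only a dry-run sketch, not a tested recovery tool: the variable names DISKGROUP and DRY_RUN and the helper functions are made up for this example, and by default it prints the commands without running them. It assumes it is run as root on a single node, with the target disk group already mounted as in the kfod output above.

```shell
#!/bin/sh
# Dry-run sketch of the recovery steps from this post.
# DISKGROUP and DRY_RUN are names invented for this example.
DISKGROUP="${DISKGROUP:-+OCRVOTE}"   # disk group that will hold the new voting file
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print commands, 0 = execute them

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Same order as the transcript above; stops at the first failing step.
recover_votedisk() {
    run crsctl stop crs -f &&
    run crsctl start crs -excl &&
    run crsctl query css votedisk &&
    run crsctl replace votedisk "$DISKGROUP" &&
    run crsctl stop crs -f &&
    run crsctl start crs
}

# Usage: call recover_votedisk, review the printed plan,
# then set DRY_RUN=0 to actually run it.
```

Chaining the steps with && keeps a failed stop or exclusive start from being followed by the replace step.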