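Environment: RHEL 5.7 + Oracle 10.2.0.5 RAC
This is a test environment built many years ago. Today I found that the cluster would not start. Starting CRS by hand did not help, and nothing at all was written to the cluster logs. So I went on to check the cluster configuration: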
[oracle@rac1-server rac1-server]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 96144
Used space (kbytes) : 3852
Available space (kbytes) : 92292
ID : 1953645605
Device/File Name : /dev/raw/raw14
Device/File integrity check succeeded
Device/File Name : /dev/raw/raw15
Device/File integrity check succeeded
Cluster registry integrity check succeeded
[oracle@rac1-server rac1-server]$ crsctl query css votedisk
0. 0 jy2
located 1 votedisk(s).
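The listing above contains an entry, jy2, that is not a device path. As a minimal sketch, the check below flags such entries automatically. It assumes the entry lines always have the shape "index, a second field, path" seen in this transcript. The function name invalid_votedisks is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Hypothetical helper: print every voting-disk entry whose path is not
# an absolute path. Reads `crsctl query css votedisk` output on stdin.
invalid_votedisks() {
    # Entry lines look like "0. 0 /dev/raw/raw11"; the trailing
    # "located N votedisk(s)." line does not start with "<digits>."
    # and is therefore skipped.
    awk '$1 ~ /^[0-9]+\.$/ && NF >= 3 && $3 !~ /^\// { print $3 }'
}

# Usage on a cluster node:
#   crsctl query css votedisk | invalid_votedisks
```

An empty result only means that every entry looks like a path. It does not prove that the devices behind those paths are usable.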
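This confirms that the voting disk is the problem. I have no idea where this jy2 entry came from, but either way there is no valid voting disk. Based on how this environment is laid out, I added valid voting disks, after which the cluster returned to normal: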
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw11
Cluster is not in a ready state for online disk addition
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw11 -f
unrecognized parameter -f.
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw11 -force
Now formatting voting disk: /dev/raw/raw11
successful addition of votedisk /dev/raw/raw11.
[root@rac1-server ~]#
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw12 -force
Now formatting voting disk: /dev/raw/raw12
successful addition of votedisk /dev/raw/raw12.
[root@rac1-server ~]#
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw13 -force
Now formatting voting disk: /dev/raw/raw13
Write failed: Broken pipe
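I reach this test environment through an SSH jump host, and the session dropped at this point. After logging in again, I queried the voting disks: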
[oracle@rac1-server ~]$ crsctl query css votedisk
0. 0 /dev/raw/raw13
1. 0 /dev/raw/raw11
2. 0 /dev/raw/raw12
3. 0 /dev/raw/raw13
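As a small sketch for spotting repeated paths in a listing like this one, the function below counts how often each path occurs. It assumes the same "index, second field, path" line shape as the transcript. The name duplicate_votedisks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each voting-disk path that is listed more
# than once. Reads `crsctl query css votedisk` output on stdin.
duplicate_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && NF >= 3 { seen[$3]++ }
         END { for (p in seen) if (seen[p] > 1) print p }'
}

# Usage on a cluster node:
#   crsctl query css votedisk | duplicate_votedisks
```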
/dev/raw/raw13 is listed twice, so I tried deleting it:
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl delete css votedisk /dev/raw/raw13 -force
successful deletion of votedisk /dev/raw/raw13.
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl query css votedisk
0. 0 /dev/raw/raw11
1. 0 /dev/raw/raw12
2. 0 /dev/raw/raw13
located 3 votedisk(s).
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl delete css votedisk /dev/raw/raw13 -force
successful deletion of votedisk /dev/raw/raw13.
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl query css votedisk
0. 0 /dev/raw/raw11
1. 0 /dev/raw/raw12
located 2 votedisk(s).
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw13 -force
Now formatting voting disk: /dev/raw/raw13
Write failed: Broken pipe
[root@rac1-server ~]# /s01/oracle/product/10.2.0/crs_1/bin/crsctl query css votedisk
0. 0 /dev/raw/raw13
1. 0 /dev/raw/raw11
2. 0 /dev/raw/raw12
I am not sure whether this "Write failed: Broken pipe" has any hidden impact. In practice, both querying and using the voting disks work normally.
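One way to keep a dropped session from cutting into a command like this is to run it detached from the terminal and send its output to a file. The sketch below assumes the "Broken pipe" message came from the SSH connection and not from crsctl itself. The helper name run_detached and the log path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command under nohup in the background so a
# dropped SSH session does not interrupt it. All output goes to the
# given log file. Sets DETACHED_PID to the background process id.
run_detached() {
    log_file=$1
    shift
    nohup "$@" > "$log_file" 2>&1 &
    DETACHED_PID=$!
}

# Hypothetical usage as root (the log path is an assumption):
#   run_detached /tmp/add_votedisk_raw13.log \
#       /s01/oracle/product/10.2.0/crs_1/bin/crsctl add css votedisk /dev/raw/raw13 -force
#   wait "$DETACHED_PID"
#   cat /tmp/add_votedisk_raw13.log
```

This does not tell us whether the interrupted add above finished cleanly. It only makes the complete command output available the next time.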
Starting CRS again now succeeds.
The cluster logs show that the voting disks we added are being used normally:
--Node 1 cluster alert log:
2019-12-12 13:27:37.806
[cssd(7734)]CRS-1603:CSSD on node rac1-server shutdown by user.
2019-12-12 13:28:15.035
[cssd(13146)]CRS-1605:CSSD voting file is online: /dev/raw/raw13. Details in /s01/oracle/product/10.2.0/crs_1/log/rac1-server/cssd/ocssd.log.
2019-12-12 13:28:15.048
[cssd(13146)]CRS-1605:CSSD voting file is online: /dev/raw/raw11. Details in /s01/oracle/product/10.2.0/crs_1/log/rac1-server/cssd/ocssd.log.
2019-12-12 13:28:15.058
[cssd(13146)]CRS-1605:CSSD voting file is online: /dev/raw/raw12. Details in /s01/oracle/product/10.2.0/crs_1/log/rac1-server/cssd/ocssd.log.
2019-12-12 13:28:22.162
[cssd(13146)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1-server .
2019-12-12 13:28:22.610
[evmd(12526)]CRS-1401:EVMD started on node rac1-server.
2019-12-12 13:28:22.678
[crsd(12662)]CRS-1005:The OCR upgrade was completed. Version has changed from 169870592 to 169870592. Details in /s01/oracle/product/10.2.0/crs_1/log/rac1-server/crsd/crsd.log.
2019-12-12 13:28:22.679
[crsd(12662)]CRS-1012:The OCR service started on node rac1-server.
2019-12-12 13:28:23.757
[crsd(12662)]CRS-1201:CRSD started on node rac1-server.
2019-12-12 13:28:24.172
[crsd(12662)]CRS-1205:Auto-start failed for the CRS resource ora.rac2-server.ASM2.asm. Details in /s01/oracle/product/10.2.0/crs_1/log/rac1-server/crsd/crsd.log.
2019-12-12 13:28:24.199
[crsd(12662)]CRS-1205:Auto-start failed for the CRS resource ora.jy.jy2.inst. Details in /s01/oracle/product/10.2.0/crs_1/log/rac1-server/crsd/crsd.log.
2019-12-12 13:28:36.180
[cssd(13146)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1-server rac2-server .
--Node 2 cluster alert log:
2019-12-12 13:30:23.828
[cssd(6736)]CRS-1605:CSSD voting file is online: /dev/raw/raw13. Details in /s01/oracle/product/10.2.0/crs_1/log/rac2-server/cssd/ocssd.log.
2019-12-12 13:30:23.845
[cssd(6736)]CRS-1605:CSSD voting file is online: /dev/raw/raw11. Details in /s01/oracle/product/10.2.0/crs_1/log/rac2-server/cssd/ocssd.log.
2019-12-12 13:30:23.870
[cssd(6736)]CRS-1605:CSSD voting file is online: /dev/raw/raw12. Details in /s01/oracle/product/10.2.0/crs_1/log/rac2-server/cssd/ocssd.log.
2019-12-12 13:30:24.768
[cssd(6736)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1-server rac2-server .
2019-12-12 13:30:25.463
[crsd(6199)]CRS-1012:The OCR service started on node rac2-server.
2019-12-12 13:30:25.478
[evmd(6116)]CRS-1401:EVMD started on node rac2-server.
2019-12-12 13:30:27.101
[crsd(6199)]CRS-1201:CRSD started on node rac2-server.
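To avoid reading the alert logs by eye, the CRS-1605 lines can be reduced to the list of voting files that came online. This is a sketch that assumes the message wording is exactly as shown in the logs above. The function name online_voting_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct voting-file paths reported
# online by CRS-1605 messages. Reads a cluster alert log on stdin.
online_voting_files() {
    sed -n 's/.*CRS-1605:CSSD voting file is online: \(.*\)\. Details in .*/\1/p' |
        sort -u
}

# Usage: feed the cluster alert log of a node on stdin, e.g.
#   online_voting_files < "$ALERT_LOG"
# where ALERT_LOG is a hypothetical variable pointing at that file.
```

The result can then be compared with the output of crsctl query css votedisk on each node.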
Finally, check the cluster status to confirm that everything is normal:
[oracle@rac1-server ~]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.jy.db application ONLINE ONLINE rac2-server
ora....y1.inst application ONLINE ONLINE rac1-server
ora....y2.inst application ONLINE ONLINE rac2-server
ora....SM1.asm application ONLINE ONLINE rac1-server
ora....ER.lsnr application ONLINE ONLINE rac1-server
ora....ver.gsd application ONLINE ONLINE rac1-server
ora....ver.ons application ONLINE ONLINE rac1-server
ora....ver.vip application ONLINE ONLINE rac1-server
ora....SM2.asm application ONLINE ONLINE rac2-server
ora....ER.lsnr application ONLINE ONLINE rac2-server
ora....ver.gsd application ONLINE ONLINE rac2-server
ora....ver.ons application ONLINE ONLINE rac2-server
ora....ver.vip application ONLINE ONLINE rac2-server
[oracle@rac1-server ~]$
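The same check can be scripted. The sketch below lists every resource whose Target or State column is not ONLINE. It assumes the five-column crs_stat -t layout shown above, with every resource name starting with "ora". The function name not_online_resources is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every resource whose Target or
# State is not ONLINE. Reads `crs_stat -t` output on stdin.
not_online_resources() {
    # Columns: Name Type Target State Host. The header and the dashed
    # separator do not start with "ora" and are skipped.
    awk '$1 ~ /^ora/ && ($3 != "ONLINE" || $4 != "ONLINE") { print $1 }'
}

# Usage on a cluster node:
#   crs_stat -t | not_online_resources
```

Because crs_stat -t abbreviates long names (for example ora....y1.inst), the printed names are abbreviated too. An empty result corresponds to the all-ONLINE listing above.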