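Initial analysis:
Something is wrong with CRS. Should CRS be reconfigured? Or should CRS be reconfigured first and the nodes after that?
Start with the OCFS2 file system: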
rac1-> more /etc/ocfs2/cluster.conf
node:
    ip_port = 7777
    ip_address = 192.168.0.3
    number = 0
    name = rac1
    cluster = ocfs2
node:
    ip_port = 7777
    ip_address = 192.168.0.4
    number = 1
    name = rac2
    cluster = ocfs2
cluster:
    node_count = 2
    name = ocfs2
rac2-> more /etc/ocfs2/cluster.conf
node:
    ip_port = 7777
    ip_address = 192.168.0.3
    number = 0
    name = rac1
    cluster = ocfs2
node:
    ip_port = 7777
    ip_address = 192.168.0.4
    number = 1
    name = rac2
    cluster = ocfs2
cluster:
    node_count = 2
    name = ocfs2
The cluster configuration is identical on both nodes, so no problem there. Continuing the investigation.
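The comparison above was done by eye. As a rough sketch, the same check could be scripted in Python. The function names `parse_cluster_conf` and `conf_problems` are made up for this illustration, and the parser assumes only the simple "stanza header plus key = value lines" layout shown above, not the full file format.

```python
def parse_cluster_conf(text):
    """Parse cluster.conf text into a list of (stanza, params) pairs, in file order."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A stanza header such as "node:" or "cluster:".
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            # A "key = value" line belonging to the most recent stanza.
            key, value = line.split("=", 1)
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def conf_problems(conf_a, conf_b):
    """Return a list of problems found; an empty list means the files agree."""
    problems = []
    if conf_a != conf_b:
        problems.append("configurations differ between the two nodes")
    for label, conf in (("first", conf_a), ("second", conf_b)):
        node_total = sum(1 for name, _ in conf if name == "node")
        for name, params in conf:
            if name == "cluster" and params.get("node_count") != str(node_total):
                problems.append(f"{label} file: node_count does not match node stanzas")
    return problems
```

To use it, copy `/etc/ocfs2/cluster.conf` from each node, read both files into strings, and pass the two parsed results to `conf_problems`. An empty list corresponds to the "no problem" conclusion reached above.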
Check the heartbeat:
[root@rac1 ~]# /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold: 61
Network idle timeout: 10000
Network keepalive delay: 5000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
The heartbeat is fine as well. The second node shows the same output.
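When this check has to be repeated on several nodes, the status output can be scanned automatically. The following Python sketch is only an illustration: `o2cb_problems` is a hypothetical helper, and the expected values are taken from the healthy output shown above, not from any official list of possible states.

```python
# Expected value for each kind of status line, as seen in the healthy output above.
EXPECTED = {
    "Module": "Loaded",
    "Filesystem": "Mounted",
    "Checking O2CB cluster": "Online",
    "Checking O2CB heartbeat": "Active",
}


def o2cb_problems(status_text):
    """Return the status lines whose value differs from the healthy value."""
    problems = []
    for raw in status_text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the last colon: label on the left, reported value on the right.
        label, _, value = line.rpartition(":")
        value = value.strip()
        for prefix, wanted in EXPECTED.items():
            if label.startswith(prefix):
                if value != wanted:
                    problems.append(line)
                break
    return problems
```

Timeout lines such as `Heartbeat dead threshold: 61` are ignored on purpose, because they carry settings and not a health state. Feeding the function the text printed by `/etc/init.d/o2cb status` on each node and getting an empty list back matches the result observed here.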
Mount the OCFS2 volume on both nodes with the following command:
mount -t ocfs2 -o datavolume,nointr /dev/sdb1 /ocfs
Recheck CRS:
rac1-> /u01/oracle/product/10.2.0/crs_1/bin/cluvfy stage -post crsinst -n rac1,rac2
Performing post-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "rac1".
Checking user equivalence...
User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Checking CRS integrity...
Checking daemon liveness...
Liveness check passed for "CRS daemon".
Checking daemon liveness...
Liveness check passed for "CSS daemon".
Checking daemon liveness...
Liveness check passed for "EVM daemon".
Checking CRS health...
CRS health check passed.
CRS integrity check passed.
Checking node application existence...
Checking existence of VIP node application (required)
Check passed.
Checking existence of ONS node application (optional)
Check passed.
Checking existence of GSD node application (optional)
Check passed.
Post-check for cluster services setup was successful.
This shows that CRS is now running normally.
Recheck the resource status:
rac1-> crs_stat -t
Name            Type         Target  State    Host
------------------------------------------------------------
ora.dbvdb.db    application  ONLINE  UNKNOWN  rac1
ora....b1.inst  application  ONLINE  OFFLINE
ora....b2.inst  application  ONLINE  OFFLINE
ora....SM1.asm  application  ONLINE  UNKNOWN  rac1
ora....C1.lsnr  application  ONLINE  UNKNOWN  rac1
ora.rac1.gsd    application  ONLINE  UNKNOWN  rac1
ora.rac1.ons    application  ONLINE  UNKNOWN  rac1
ora.rac1.vip    application  ONLINE  ONLINE   rac1
ora....SM2.asm  application  ONLINE  UNKNOWN  rac2
ora....C2.lsnr  application  ONLINE  UNKNOWN  rac2
ora.rac2.gsd    application  ONLINE  UNKNOWN  rac2
ora.rac2.ons    application  ONLINE  UNKNOWN  rac2
ora.rac2.vip    application  ONLINE  ONLINE   rac2
The status looks somewhat different from what it was at the start.
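To see at a glance which resources are not where they should be, the `crs_stat -t` listing can be filtered for rows whose target is ONLINE but whose state is something else. This is a minimal Python sketch with hypothetical helper names (`parse_crs_stat`, `not_online`). It assumes whitespace-separated columns as in the listing above and treats a missing Host column as an empty string.

```python
def parse_crs_stat(text):
    """Turn `crs_stat -t` output into a list of dicts, one per resource row."""
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        # Skip the header, the dashed separator and anything that is not a resource row.
        if len(fields) < 4 or not fields[0].startswith("ora."):
            continue
        name, rtype, target, state = fields[:4]
        host = fields[4] if len(fields) > 4 else ""
        rows.append({"name": name, "type": rtype, "target": target,
                     "state": state, "host": host})
    return rows


def not_online(rows):
    """Return resources that should be ONLINE but report a different state."""
    return [r for r in rows if r["target"] == "ONLINE" and r["state"] != "ONLINE"]
```

Applied to the listing above, only the two VIP resources would pass this filter. Everything else is either UNKNOWN or OFFLINE, which is the difference from the earlier state noted here.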
Check the node applications:
rac1-> srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
The node applications still have not started. Could ASM be the problem? The next step is to work out why the GSD service will not start.
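The `srvctl status nodeapps` output above can also be reduced to a list of the components that are down. The sketch below is illustrative only: `nodeapps_not_running` is a hypothetical helper, and the regular expression assumes the exact "is running / is not running on node:" wording shown in this session.

```python
import re

# Matches lines such as "GSD is not running on node: rac1".
STATUS_LINE = re.compile(r"^(.+?) is (not )?running on node: (\S+)$")


def nodeapps_not_running(text):
    """Return (component, node) pairs for every component reported as not running."""
    down = []
    for raw in text.splitlines():
        match = STATUS_LINE.match(raw.strip())
        if match and match.group(2):
            down.append((match.group(1), match.group(3)))
    return down
```

For the output above this gives GSD, the Listener and the ONS daemon on rac1, with the VIP left out because it is running.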
rac1-> srvctl start asm -n rac1
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
[PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
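The nested message above repeats the same errors several times, which makes it hard to read. As a small aid, the distinct error codes can be pulled out in order of first appearance. This Python sketch uses a hypothetical helper, `error_codes`, and assumes codes look like a short uppercase prefix followed by a hyphen and four or five digits, as the ones in this output do.

```python
import re

# Codes such as PRKS-1009 or CRS-0223: uppercase prefix, hyphen, digits.
CODE = re.compile(r"\b[A-Z]{3,4}-\d{4,5}\b")


def error_codes(text):
    """Return the distinct error codes in the text, in order of first appearance."""
    # dict.fromkeys keeps insertion order while dropping repeats.
    return list(dict.fromkeys(CODE.findall(text)))
```

Run against the `srvctl start asm` output, this leaves three codes to look up: PRKS-1009, CRS-1028 and CRS-0223.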
ASM cannot be started manually either. So it looks like the problem lies with ASM,