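There is no hurdle that cannot be crossed and no river that cannot be forded; there is no secret manual for guaranteed victory, only all-out effort.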

A Day and a Night Spent Fixing a RAC Failure (Part 2)

2008-07-06 12:23:25 / Category: Oracle Database

Initial analysis:

 

Something is wrong with CRS. Should I reconfigure CRS and then reconfigure the nodes?

 

Start with the OCFS2 file system:

 

rac1-> more /etc/ocfs2/cluster.conf

node:

       ip_port = 7777

       ip_address = 192.168.0.3

       number = 0

       name = rac1

       cluster = ocfs2

 

node:

       ip_port = 7777

       ip_address = 192.168.0.4

       number = 1

       name = rac2

       cluster = ocfs2

 

cluster:

       node_count = 2

       name = ocfs2

 

 

rac2-> more /etc/ocfs2/cluster.conf

node:

       ip_port = 7777

       ip_address = 192.168.0.3

       number = 0

       name = rac1

       cluster = ocfs2

 

node:

       ip_port = 7777

       ip_address = 192.168.0.4

       number = 1

       name = rac2

       cluster = ocfs2

 

cluster:

       node_count = 2

       name = ocfs2

 

The configuration is identical and correct on both nodes. Continuing to investigate...
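The by-eye comparison of cluster.conf above can also be done mechanically. The following is only a minimal sketch: the helper names parse_cluster_conf and check_cluster_conf are hypothetical, the checks are the obvious consistency rules rather than anything O2CB itself enforces, and the sample text is the file content shown above.

```python
def parse_cluster_conf(text):
    """Parse cluster.conf text into a list of (stanza_type, {key: value}) tuples."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A stanza header such as "node:" or "cluster:"
            current = (line[:-1], {})
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[1][key.strip()] = value.strip()
    return stanzas


def check_cluster_conf(stanzas):
    """Return a list of problems found; an empty list means nothing looked wrong."""
    problems = []
    nodes = [attrs for kind, attrs in stanzas if kind == "node"]
    clusters = [attrs for kind, attrs in stanzas if kind == "cluster"]
    if len(clusters) != 1:
        return ["expected exactly one cluster stanza"]
    cluster = clusters[0]
    if cluster.get("node_count") != str(len(nodes)):
        problems.append("node_count does not match the number of node stanzas")
    for node in nodes:
        if node.get("cluster") != cluster.get("name"):
            problems.append(f"node {node.get('name')} points at a different cluster")
    numbers = [node.get("number") for node in nodes]
    if len(set(numbers)) != len(numbers):
        problems.append("duplicate node number")
    return problems


# Content of /etc/ocfs2/cluster.conf as shown above (identical on rac1 and rac2)
SAMPLE = """
node:
       ip_port = 7777
       ip_address = 192.168.0.3
       number = 0
       name = rac1
       cluster = ocfs2

node:
       ip_port = 7777
       ip_address = 192.168.0.4
       number = 1
       name = rac2
       cluster = ocfs2

cluster:
       node_count = 2
       name = ocfs2
"""

if __name__ == "__main__":
    print(check_cluster_conf(parse_cluster_conf(SAMPLE)))
```

Running the parser on the file from each node and comparing the two results with == would confirm that the nodes agree.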

Check the heartbeat:

[root@rac1 ~]# /etc/init.d/o2cb status

Module "configfs": Loaded

Filesystem "configfs": Mounted

Module "ocfs2_nodemanager": Loaded

Module "ocfs2_dlm": Loaded

Module "ocfs2_dlmfs": Loaded

Filesystem "ocfs2_dlmfs": Mounted

Checking O2CB cluster ocfs2: Online

 Heartbeat dead threshold: 61

 Network idle timeout: 10000

 Network keepalive delay: 5000

 Network reconnect delay: 2000

Checking O2CB heartbeat: Active

The heartbeat is fine as well. The second node reports the same.
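If the same check has to be repeated on several nodes, the status text can be reduced to a yes/no answer. This is a rough sketch that assumes the output format is exactly the one shown above; parse_o2cb_status and o2cb_healthy are hypothetical names, not part of any OCFS2 tool.

```python
import re


def parse_o2cb_status(text):
    """Collect module, filesystem, cluster and heartbeat states from o2cb status output."""
    status = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r'^(Module|Filesystem) "([^"]+)": (\w+)$', line)
        if m:
            status[f"{m.group(1)} {m.group(2)}"] = m.group(3)
            continue
        m = re.match(r"^Checking O2CB (cluster \S+|heartbeat): (.+)$", line)
        if m:
            # Key is "cluster" or "heartbeat"; the cluster name itself is dropped
            status[m.group(1).split()[0]] = m.group(2)
    return status


def o2cb_healthy(status):
    """True when every module is Loaded, every filesystem Mounted, cluster Online, heartbeat Active."""
    parts_ok = all(
        value in ("Loaded", "Mounted")
        for key, value in status.items()
        if key.startswith(("Module", "Filesystem"))
    )
    return (
        parts_ok
        and status.get("cluster") == "Online"
        and status.get("heartbeat") == "Active"
    )


# Output of /etc/init.d/o2cb status on rac1, as shown above
SAMPLE = """
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
 Heartbeat dead threshold: 61
 Network idle timeout: 10000
 Network keepalive delay: 5000
 Network reconnect delay: 2000
Checking O2CB heartbeat: Active
"""

if __name__ == "__main__":
    print(o2cb_healthy(parse_o2cb_status(SAMPLE)))
```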

 

Mount the file system on both nodes with this command:

mount -t ocfs2 -o datavolume,nointr /dev/sdb1 /ocfs

 

Re-check CRS:

 

rac1-> /u01/oracle/product/10.2.0/crs_1/bin/cluvfy stage -post crsinst -n rac1,rac2

 

Performing post-checks for cluster services setup

 

Checking node reachability...

Node reachability check passed from node "rac1".

 

 

Checking user equivalence...

User equivalence check passed for user "oracle".

 

Checking Cluster manager integrity...

 

 

Checking CSS daemon...

Daemon status check passed for "CSS daemon".

 

Cluster manager integrity check passed.

 

Checking cluster integrity...

 

 

Cluster integrity check passed

 

 

Checking OCR integrity...

 

Checking the absence of a non-clustered configuration...

All nodes free of non-clustered, local-only configurations.

 

Uniqueness check for OCR device passed.

 

Checking the version of OCR...

OCR of correct Version "2" exists.

 

Checking data integrity of OCR...

Data integrity check for OCR passed.

 

OCR integrity check passed.

 

Checking CRS integrity...

 

Checking daemon liveness...

Liveness check passed for "CRS daemon".

 

Checking daemon liveness...

Liveness check passed for "CSS daemon".

 

Checking daemon liveness...

Liveness check passed for "EVM daemon".

 

Checking CRS health...

CRS health check passed.

 

CRS integrity check passed.

 

Checking node application existence...

 

 

Checking existence of VIP node application (required)

Check passed.

 

Checking existence of ONS node application (optional)

Check passed.

 

Checking existence of GSD node application (optional)

Check passed.

 

 

Post-check for cluster services setup was successful.

 

This shows that CRS is now running normally.
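A long cluvfy report like the one above is easier to judge when boiled down to a summary. The sketch below is only illustrative: summarize_cluvfy is a hypothetical helper, the sample is an excerpt of the output above, and the words it looks for on failure ("failed", "unsuccessful") are an assumption, since this session contains no failing check to copy from.

```python
def summarize_cluvfy(text):
    """Split cluvfy output into passed and failed lines plus an overall verdict."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    passed = [line for line in lines if "passed" in line.lower()]
    # Assumption: a failing check mentions "failed" or "unsuccessful"
    failed = [
        line for line in lines
        if "failed" in line.lower() or "unsuccessful" in line.lower()
    ]
    overall = any(line.endswith("was successful.") for line in lines)
    return {"passed": passed, "failed": failed, "overall_ok": overall and not failed}


# Excerpt of the cluvfy output shown above
SAMPLE = """
Checking node reachability...
Node reachability check passed from node "rac1".
Checking user equivalence...
User equivalence check passed for user "oracle".
Checking CRS health...
CRS health check passed.
Post-check for cluster services setup was successful.
"""

if __name__ == "__main__":
    summary = summarize_cluvfy(SAMPLE)
    print(len(summary["passed"]), "checks passed, overall ok:", summary["overall_ok"])
```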

 

Re-check the resource status:

rac1-> crs_stat -t

Name          Type          Target   State    Host

------------------------------------------------------------

ora.dbvdb.db  application   ONLINE   UNKNOWN  rac1

ora....b1.inst application   ONLINE   OFFLINE

ora....b2.inst application   ONLINE   OFFLINE

ora....SM1.asm application   ONLINE   UNKNOWN  rac1

ora....C1.lsnr application   ONLINE   UNKNOWN  rac1

ora.rac1.gsd  application   ONLINE   UNKNOWN  rac1

ora.rac1.ons  application   ONLINE   UNKNOWN  rac1

ora.rac1.vip  application   ONLINE   ONLINE   rac1

ora....SM2.asm application   ONLINE   UNKNOWN  rac2

ora....C2.lsnr application   ONLINE   UNKNOWN  rac2

ora.rac2.gsd  application   ONLINE   UNKNOWN  rac2

ora.rac2.ons  application   ONLINE   UNKNOWN  rac2

ora.rac2.vip  application   ONLINE   ONLINE   rac2

The status looks somewhat different from what it was at the start.
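To see at a glance which resources are not in their target state, the crs_stat -t table can be filtered. This is a small sketch assuming the whitespace-separated layout shown above; parse_crs_stat and not_at_target are hypothetical helper names, and the abbreviated resource names ("ora....b1.inst") are kept exactly as printed.

```python
def parse_crs_stat(text):
    """Turn crs_stat -t output into a list of dicts, one per resource row."""
    rows = []
    for raw in text.splitlines():
        parts = raw.split()
        # Skip the header, the dashed separator and blank lines
        if len(parts) < 4 or not parts[0].startswith("ora."):
            continue
        name, rtype, target, state = parts[:4]
        host = parts[4] if len(parts) > 4 else None  # OFFLINE rows have no host
        rows.append(
            {"name": name, "type": rtype, "target": target, "state": state, "host": host}
        )
    return rows


def not_at_target(rows):
    """Resources whose current state differs from the target state."""
    return [row for row in rows if row["state"] != row["target"]]


# Output of crs_stat -t as shown above
SAMPLE = """
Name          Type          Target   State    Host
------------------------------------------------------------
ora.dbvdb.db  application   ONLINE   UNKNOWN  rac1
ora....b1.inst application   ONLINE   OFFLINE
ora....b2.inst application   ONLINE   OFFLINE
ora....SM1.asm application   ONLINE   UNKNOWN  rac1
ora....C1.lsnr application   ONLINE   UNKNOWN  rac1
ora.rac1.gsd  application   ONLINE   UNKNOWN  rac1
ora.rac1.ons  application   ONLINE   UNKNOWN  rac1
ora.rac1.vip  application   ONLINE   ONLINE   rac1
ora....SM2.asm application   ONLINE   UNKNOWN  rac2
ora....C2.lsnr application   ONLINE   UNKNOWN  rac2
ora.rac2.gsd  application   ONLINE   UNKNOWN  rac2
ora.rac2.ons  application   ONLINE   UNKNOWN  rac2
ora.rac2.vip  application   ONLINE   ONLINE   rac2
"""

if __name__ == "__main__":
    for row in not_at_target(parse_crs_stat(SAMPLE)):
        print(row["name"], row["state"], row["host"])
```

On this sample only the two VIP resources are at their target state; everything else is UNKNOWN or OFFLINE.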

Check the node applications:

rac1-> srvctl status nodeapps -n rac1

VIP is running on node: rac1

GSD is not running on node: rac1

Listener is not running on node: rac1

ONS daemon is not running on node: rac1

The services still have not started. Could ASM be the problem? The next step is to look at getting the GSD service to start...
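The nodeapps report can be turned into a simple running/not-running map, which is handy when checking both nodes in turn. A minimal sketch, assuming the line format shown above; parse_nodeapps is a hypothetical helper name.

```python
import re


def parse_nodeapps(text):
    """Map each node application name to True (running) or False (not running)."""
    result = {}
    for raw in text.splitlines():
        m = re.match(r"^(.+?) is (not )?running on node: (\S+)$", raw.strip())
        if m:
            result[m.group(1)] = m.group(2) is None
    return result


# Output of srvctl status nodeapps -n rac1 as shown above
SAMPLE = """
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
"""

if __name__ == "__main__":
    stopped = [name for name, running in parse_nodeapps(SAMPLE).items() if not running]
    print("not running:", ", ".join(stopped))
```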

rac1-> srvctl start asm -n rac1

PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:

CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]

 [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:

CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]

ASM cannot be started manually either. It looks like the problem lies with ASM.
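The nested srvctl error above repeats the same message several times, so it helps to pull out the distinct error codes in the order they first appear. This is just a sketch: extract_error_codes is a hypothetical helper, and the pattern (two to five capital letters, a hyphen, four or five digits) is an assumption that fits the codes in this message rather than a full description of Oracle's error numbering.

```python
import re


def extract_error_codes(message):
    """Return the distinct error codes in the message, in order of first appearance."""
    codes = []
    for code in re.findall(r"\b[A-Z]{2,5}-\d{4,5}\b", message):
        if code not in codes:
            codes.append(code)
    return codes


# First part of the srvctl start asm error shown above
SAMPLE = """PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
"""

if __name__ == "__main__":
    codes = extract_error_codes(SAMPLE)
    print("codes:", codes)
    print("innermost:", codes[-1])
```

The last code in the list is the innermost one, which is usually the most useful starting point for a search; here that is the CRS-0223 placement error.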

