References:

11gR2 Clusterware and Grid Home - What You Need to Know (Doc ID 1053147.1)
诊断 Grid Infrastructure 启动问题 (Diagnosing Grid Infrastructure Startup Issues) (Doc ID 1623340.1)

Oracle 11gR2 reclassifies the resources managed by CRSD into two groups: Local Resources and Cluster Resources. Both can be listed with crsctl:

[root@rac1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac1                                         
               OFFLINE OFFLINE      rac2                                         
ora.FRA.dg
               ONLINE  ONLINE       rac1                                         
               OFFLINE OFFLINE      rac2                                         
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.OCR_VOTE.dg
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.asm
               ONLINE  ONLINE       rac1                     Started             
               ONLINE  ONLINE       rac2                     Started             
ora.eons
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.gsd
               OFFLINE OFFLINE      rac1                                         
               OFFLINE OFFLINE      rac2                                         
ora.net1.network
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.ons
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
ora.registry.acfs
               ONLINE  ONLINE       rac1                                         
               ONLINE  ONLINE       rac2                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac1                                         
ora.oc4j
      1        OFFLINE OFFLINE                                                   
ora.rac1.vip
      1        ONLINE  ONLINE       rac1                                         
ora.rac2.vip
      1        ONLINE  ONLINE       rac2                                         
ora.scan1.vip
      1        ONLINE  ONLINE       rac1                                         
ora.test.db
      1        ONLINE  ONLINE       rac1                     Open                
      2        OFFLINE OFFLINE                               (the database instance on rac2 was shut down deliberately)

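Comparing the two groups: a Local Resource runs on every node and is managed per node (disk groups, the listener, ASM, the network, ONS), so it never moves to another server. A Cluster Resource is managed at the cluster level; resources such as the SCAN listener and the VIPs can fail over to another node.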
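To work with this listing in a script, the text can be turned into records. The following is only a minimal sketch, assuming the layout shown above; parse_stat_res and SAMPLE are names made up for this example, and a row that has STATE_DETAILS but no SERVER would be misread by this simple split.

```python
import re

# A status row is indented and optionally starts with an instance number:
#   "      1        ONLINE  ONLINE       rac1                     Open"
#   "               ONLINE  ONLINE       rac1"
ROW = re.compile(
    r"^\s+(?:(?P<inst>\d+)\s+)?(?P<target>ONLINE|OFFLINE)\s+(?P<state>\S+)"
    r"(?:\s+(?P<rest>.*\S))?\s*$"
)

# Shortened sample in the same layout as the listing above.
SAMPLE = """\
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.asm
               ONLINE  ONLINE       rac1                     Started
               ONLINE  ONLINE       rac2                     Started
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.test.db
      1        ONLINE  ONLINE       rac1                     Open
      2        OFFLINE OFFLINE
"""


def parse_stat_res(text):
    """Parse `crsctl stat res -t` style text into a list of dicts."""
    rows, section, name = [], None, None
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, separator lines and the column header.
        if not stripped or stripped.startswith("---") or stripped.startswith("NAME "):
            continue
        if stripped in ("Local Resources", "Cluster Resources"):
            section = stripped
            continue
        m = ROW.match(line)
        if m is None:
            # Anything else is taken to be a resource name line.
            name = stripped
            continue
        # Assumption: the first word after STATE is the server, the rest is details.
        parts = (m.group("rest") or "").split(None, 1)
        rows.append({
            "section": section,
            "name": name,
            "instance": int(m.group("inst")) if m.group("inst") else None,
            "target": m.group("target"),
            "state": m.group("state"),
            "server": parts[0] if parts else "",
            "details": parts[1] if len(parts) > 1 else "",
        })
    return rows


if __name__ == "__main__":
    for row in parse_stat_res(SAMPLE):
        print(row["section"], row["name"], row["server"], row["state"])
```

Grouping the records by "section" gives the Local/Cluster split directly, and filtering on target and state finds rows such as the stopped rac2 instance.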

The resources managed by OHASD can be listed with the following command:

[root@rac1 ~]# crsctl stat res -init -t             (run on node 1)
--------------------------------------------------------------------------------   
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac1                     Started             
ora.crsd
      1        ONLINE  ONLINE       rac1                                         
ora.cssd
      1        ONLINE  ONLINE       rac1                                         
ora.cssdmonitor
      1        ONLINE  ONLINE       rac1                                         
ora.ctssd
      1        ONLINE  ONLINE       rac1                     OBSERVER            
ora.diskmon
      1        ONLINE  ONLINE       rac1                                         
ora.drivers.acfs
      1        ONLINE  ONLINE       rac1                                         
ora.evmd
      1        ONLINE  ONLINE       rac1                                         
ora.gipcd
      1        ONLINE  ONLINE       rac1                                         
ora.gpnpd
      1        ONLINE  ONLINE       rac1                                         
ora.mdnsd
      1        ONLINE  ONLINE       rac1

[root@rac2 ~]# crsctl stat res -init -t             (run on node 2)
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac2                     Started             
ora.crsd
      1        ONLINE  ONLINE       rac2                                         
ora.cssd
      1        ONLINE  ONLINE       rac2                                         
ora.cssdmonitor
      1        ONLINE  ONLINE       rac2                                         
ora.ctssd
      1        ONLINE  ONLINE       rac2                     OBSERVER            
ora.diskmon
      1        ONLINE  ONLINE       rac2                                         
ora.drivers.acfs
      1        ONLINE  ONLINE       rac2                                         
ora.evmd
      1        ONLINE  ONLINE       rac2                                         
ora.gipcd
      1        ONLINE  ONLINE       rac2                                         
ora.gpnpd
      1        ONLINE  ONLINE       rac2                                         
ora.mdnsd
      1        ONLINE  ONLINE       rac2

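As the two listings show, the OHASD on each node lists only the resources on its own server (rac1 on node 1, rac2 on node 2). In other words, HAS manages only the processes on its own server. Next, let's try stopping HAS: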

[root@rac1 bin]# ./crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.sdd.db' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac2'
CRS-2676: Start of 'ora.scan1.vip' on 'rac2' succeeded
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.sdd.db' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rac1'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'rac2'
CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'rac2'
CRS-2676: Start of 'ora.cvu' on 'rac2' succeeded
CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 bin]#

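Note:

This test was run in an Oracle 11gR2 environment. Running the command on node 1 stopped only the processes on node 1, and the resources that can fail over (the VIPs, the SCAN listener, oc4j and cvu) were relocated to node 2. This confirms the point made above: the command affects only the server it is run on.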
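Which resources moved during the shutdown can also be read out of the log mechanically. The sketch below assumes only the two message formats seen in the log (CRS-2677 for a successful stop, CRS-2676 for a successful start); relocated_resources is a name invented for this example, and the patterns tolerate the missing spaces that sometimes appear when such logs are pasted into a web page.

```python
import re

# "CRS-2677: Stop of '<resource>' on '<node>' succeeded"
STOP_OK = re.compile(r"CRS-2677:\s*Stop of\s*'([^']+)'\s*on\s*'([^']+)'\s*succeeded")
# "CRS-2676: Start of '<resource>' on '<node>' succeeded"
START_OK = re.compile(r"CRS-2676:\s*Start of\s*'([^']+)'\s*on\s*'([^']+)'\s*succeeded")


def relocated_resources(log_text, local_node):
    """Map each resource stopped on local_node to the other node it was started on."""
    stopped_locally = {
        resource
        for resource, node in STOP_OK.findall(log_text)
        if node == local_node
    }
    return {
        resource: node
        for resource, node in START_OK.findall(log_text)
        if node != local_node and resource in stopped_locally
    }


if __name__ == "__main__":
    log = (
        "CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded\n"
        "CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded\n"
        "CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded\n"
    )
    print(relocated_resources(log, "rac1"))
```

Resources such as ora.asm, which are stopped on rac1 but never started on rac2, are left out of the result, which matches the idea that local resources do not fail over.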

Starting HAS

[root@rac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@rac1 bin]#

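The output only reports that HAS itself has been started. In fact, HAS then goes on to start all the resources managed by Oracle Restart (OHASD). This can be verified with the crs_stat command (deprecated in 11gR2 in favor of crsctl stat res -t). The 11g stack starts slowly, so be patient.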
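Because the stack comes up slowly, it can be handy to poll instead of re-running the status command by hand. This is a rough sketch only: it assumes the `crsctl stat res -t` layout shown earlier in this post, treats empty output as "not ready yet", and the function names, the 15-second interval and the 900-second timeout are choices made for this example.

```python
import re
import subprocess
import time

# Status rows look like "      1        ONLINE  OFFLINE ..." or "   ONLINE  ONLINE  rac1".
ROW = re.compile(r"^\s*(?:\d+\s+)?(ONLINE|OFFLINE)\s+(\S+)")


def status_rows(stat_text):
    """Return (resource, target, state) tuples found in `crsctl stat res -t` text."""
    rows, name = [], None
    for line in stat_text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((name, m.group(1), m.group(2)))
        elif line.strip().startswith("ora."):
            name = line.strip()
    return rows


def all_started(stat_text):
    """True when at least one row exists and every TARGET=ONLINE row is STATE=ONLINE."""
    rows = status_rows(stat_text)
    return bool(rows) and all(
        state == "ONLINE" for _, target, state in rows if target == "ONLINE"
    )


def wait_until_started(fetch, timeout=900.0, interval=15.0, sleep=time.sleep):
    """Call fetch() repeatedly until all_started() holds or the timeout is used up."""
    waited = 0.0
    while True:
        if all_started(fetch()):
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


def crsctl_stat_res(crsctl="crsctl"):
    """Run `crsctl stat res -t`; assumes crsctl is on PATH (or pass its full path)."""
    result = subprocess.run([crsctl, "stat", "res", "-t"],
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print("all resources online:", wait_until_started(crsctl_stat_res))
```

Resources whose target is OFFLINE (ora.gsd in the listings here) are ignored on purpose, since they are not expected to start.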

Once HAS is stopped, the following processes owned by the grid user are shut down:

[root@rac1 ~]# ps -fu grid
UID        PID  PPID  C STIME TTY          TIME CMD
grid      4899     1  0 22:28 ?        00:00:00 /u01/app/11.2.0/grid/bin/oraagent.bin
grid      4912     1  0 22:28 ?        00:00:00 /u01/app/11.2.0/grid/bin/gipcd.bin
grid      4917     1  0 22:28 ?        00:00:00 /u01/app/11.2.0/grid/bin/mdnsd.bin
grid      4932     1  0 22:28 ?        00:00:00 /u01/app/11.2.0/grid/bin/gpnpd.bin
grid      4992     1  1 22:28 ?        00:00:01 /u01/app/11.2.0/grid/bin/ocssd.bin 
grid      5008     1  0 22:28 ?        00:00:00 /u01/app/11.2.0/grid/bin/diskmon.bin -d -f

The processes above are explained below:

(3) Grid Plug and Play (GPNPD):

Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.

(4) Grid Interprocess Communication (GIPC):

A support daemon that enables Redundant Interconnect Usage.

(5) Multicast Domain Name Service (mDNS):

Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX, and a service on Windows.

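(6) Cluster Time Synchronization Service (CTSS):

Provides time management in a cluster for Oracle Clusterware. In the query results above, the CTSS state is OBSERVER.

In 11gR2, time synchronization for a RAC installation can be provided in two ways: NTP or CTSS. If the installer finds that NTP is not active, CTSS is installed in active mode and keeps the time synchronized across all nodes. If NTP is found to be configured, CTSS starts in observer mode and Oracle Clusterware performs no active time synchronization in the cluster.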
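The mode can be picked out of the `crsctl stat res -init -t` listing with a few lines of code. This is a small sketch that relies on the layout shown earlier, where the STATE_DETAILS column of ora.ctssd reads OBSERVER. The assumption (not shown in this post) is that in active mode the column reads something like ACTIVE:0, with a time offset after the colon, so the code keeps only the part before the colon. ctss_mode is a made-up name.

```python
import re

# instance number, TARGET, STATE, SERVER, STATE_DETAILS
ROW = re.compile(r"^\s*\d+\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)")


def ctss_mode(init_stat_text):
    """Return the CTSS mode from `crsctl stat res -init -t` text, or None if unknown."""
    in_ctssd = False
    for line in init_stat_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ora."):
            # Track whether the following rows belong to ora.ctssd.
            in_ctssd = stripped == "ora.ctssd"
            continue
        if in_ctssd:
            m = ROW.match(line)
            if m and m.group(2) == "ONLINE":
                # Assumption: anything after a colon is an offset, not part of the mode.
                return m.group(4).split(":")[0]
    return None


if __name__ == "__main__":
    sample = (
        "ora.crsd\n"
        "      1        ONLINE  ONLINE       rac1\n"
        "ora.ctssd\n"
        "      1        ONLINE  ONLINE       rac1                     OBSERVER\n"
    )
    print(ctss_mode(sample))
```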

(7) Automatic Storage Management Cluster File System (Oracle ACFS):

Oracle Automatic Storage Management Cluster File System (Oracle ACFS) is a multi-platform, scalable file system, and storage management technology that extends Oracle Automatic Storage Management (Oracle ASM) functionality to support customer files maintained outside of Oracle Database. Oracle ACFS supports many database and application files, including executables, database trace files, database alert logs, application reports, BFILEs, and configuration files. Other supported files are video, audio, text, images, engineering drawings, and other general-purpose application file data.

An Oracle ACFS file system is a layer on Oracle ASM and is configured with Oracle ASM storage, as shown in Figure 5-1. Oracle ACFS leverages Oracle ASM functionality that enables:

· Oracle ACFS dynamic file system resizing
· Maximized performance through direct access to Oracle ASM disk group storage
· Balanced distribution of Oracle ACFS across Oracle ASM disk group storage for increased I/O parallelism
· Data reliability through Oracle ASM mirroring protection mechanisms

[root@rac1 u01]# sh crs_stat.sh
Name                           Target     State     Host
------------------------------ ---------- --------- -------
ora.DATA.dg                    ONLINE     ONLINE    rac1
ora.FRA.dg                     ONLINE     ONLINE    rac1
ora.LISTENER.lsnr              ONLINE     ONLINE    rac1
ora.LISTENER_SCAN1.lsnr        ONLINE     ONLINE    rac2
ora.OCRVOTING.dg               ONLINE     ONLINE    rac1
ora.asm                        ONLINE     ONLINE    rac1
ora.cvu                        ONLINE     ONLINE    rac2
ora.gsd                        OFFLINE    OFFLINE
ora.net1.network               ONLINE     ONLINE    rac1
ora.oc4j                       ONLINE     ONLINE    rac2
ora.ons                        ONLINE     ONLINE    rac1
ora.rac1.ASM1.asm              ONLINE     ONLINE    rac1
ora.rac1.LISTENER_RAC1.lsnr    ONLINE     ONLINE    rac1
ora.rac1.gsd                   OFFLINE    OFFLINE
ora.rac1.ons                   ONLINE     ONLINE    rac1
ora.rac1.vip                   ONLINE     ONLINE    rac1
ora.rac2.ASM2.asm              ONLINE     ONLINE    rac2
ora.rac2.LISTENER_RAC2.lsnr    ONLINE     ONLINE    rac2
ora.rac2.gsd                   OFFLINE    OFFLINE
ora.rac2.ons                   ONLINE     ONLINE    rac2
ora.rac2.vip                   ONLINE     ONLINE    rac2
ora.scan1.vip                  ONLINE     ONLINE    rac2
ora.sdd.db                     ONLINE     ONLINE    rac2

2.2.3 Disabling HAS (Restart) autostart after a server reboot

[root@rac1 bin]# ./crsctl disable has
CRS-4621: Oracle High Availability Services autostart is disabled.
[root@rac1 bin]#

2.2.4 Checking the HAS (Restart) autostart configuration

[root@rac1 bin]# ./crsctl config has
CRS-4621: Oracle High Availability Services autostart is disabled.

2.2.5 Enabling HAS (Restart) autostart after a server reboot

[root@rac1 bin]# ./crsctl enable has
CRS-4622: Oracle High Availability Services autostart is enabled.

-- Check the HAS configuration again to verify the effect of the command:

[root@rac1 bin]# ./crsctl config has
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@rac1 bin]#
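For scripted checks (for example before a planned reboot), the two message codes seen above are enough to tell the autostart setting. A minimal sketch, assuming only what the transcripts show: CRS-4621 means disabled and CRS-4622 means enabled. The function name has_autostart_enabled is made up for this example.

```python
import re


def has_autostart_enabled(config_output):
    """Interpret the output of `crsctl config has`.

    Returns True for CRS-4622 (autostart enabled) and False for CRS-4621
    (autostart disabled); raises ValueError if neither code is present.
    """
    m = re.search(r"CRS-(4621|4622)\b", config_output)
    if m is None:
        raise ValueError("no CRS-4621/CRS-4622 message found in output")
    return m.group(1) == "4622"


if __name__ == "__main__":
    line = "CRS-4622: Oracle High Availability Services autostart is enabled."
    print(has_autostart_enabled(line))
```

Matching on the message code rather than on the message text keeps the check working even when the text is mangled or localized.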

2.2.6 Checking the current state of HAS (Restart)

[root@rac1 bin]# ./crsctl check has
CRS-4638: Oracle High Availability Services is online

2.2.7 Viewing resource status (add -init to list the resources managed by OHASD itself)

[root@rac1 bin]# ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.FRA.dg
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.OCRVOTING.dg
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.asm
               ONLINE  ONLINE       rac1                     Started
               ONLINE  ONLINE       rac2                     Started
ora.gsd
               OFFLINE OFFLINE      rac1
               OFFLINE OFFLINE      rac2
ora.net1.network
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
ora.ons
               ONLINE  ONLINE       rac1
               ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rac2
ora.cvu
      1        ONLINE  ONLINE       rac2
ora.oc4j
      1        ONLINE  ONLINE       rac2
ora.rac1.vip
      1        ONLINE  ONLINE       rac1
ora.rac2.vip
      1        ONLINE  ONLINE       rac2
ora.scan1.vip
      1        ONLINE  ONLINE       rac2
ora.sdd.db
      1        ONLINE  ONLINE       rac1                     Open
      2        ONLINE  ONLINE       rac2                     Open
[root@rac1 bin]#

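2.3 Managing Restart (OHASD) with SRVCTL

Oracle Restart can be managed manually with SRVCTL, which adds components to or removes them from the Oracle Restart configuration. Once a component has been added to Oracle Restart manually and enabled with SRVCTL, Oracle Restart starts managing it and restarts it when necessary.

The official documentation: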

SRVCTL Command Reference for Oracle Restart

http://docs.oracle.com/cd/E11882_01/server.112/e25494/restart005.htm

Configuring Oracle Restart

http://docs.oracle.com/cd/E11882_01/server.112/e10595/restart002.htm

Reposted from: http://blog.csdn.net/cymm_liu/article/details/7955340

05-17 23:30