Building the Simplest HACMP Environment on Two 43P-140s (Part 2)

王朝other · Author: Anonymous  2006-11-23

To run OPS on two 140s, the first step is a working concurrent-mode environment. Using the same hardware setup as before, HACMP was configured as follows; the HACMP software version is HACMP ES CRM 4.4.1.

1. The service network adapter

For a concurrent environment, the traditional approach uses three resource groups: two cascading RGs, each containing only a service IP address, plus one concurrent RG containing the concurrent VG. In fact, Oracle's current HACMP configuration recommendations for OPS and RAC are much simpler: the network adapter is set directly to the service address, with no standby or boot adapters configured at all. So the built-in adapters of the two 140s were set to nodea_svc and nodeb_svc, and only those two corresponding adapters were defined in the HACMP topology.
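At the TCP/IP level, assigning a service address directly to the built-in adapter can be done with AIX's standard mktcpip command (the adapter topology itself is then defined through the HACMP SMIT menus). A minimal sketch; the IP addresses, netmask, and interface name below are placeholders, not values from the original setup:

```shell
# Node A: set the built-in Ethernet adapter (en0) straight to the
# service hostname/address -- no boot or standby address involved.
mktcpip -h nodea_svc -a 10.1.1.1 -m 255.255.255.0 -i en0

# Node B, analogously:
# mktcpip -h nodeb_svc -a 10.1.1.2 -m 255.255.255.0 -i en0
```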

2. The concurrent VG

The manual says concurrent mode requires the concurrent VG to consist of SSA disks or RAID disks, but here the shared disk is just a single plain SCSI disk. Can this work? With that question in mind, testing continued. First, sharevg was created. For SSA disks a VG can be created as concurrent capable; for other (RAID) disks, concurrent capable cannot be set to yes, because for RAID disks the concurrent sharing is implemented by HACMP itself. Accordingly, sharevg was created without setting concurrent capable.
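Creating the VG that way comes down to omitting mkvg's concurrent-capable flag. A sketch; the disk name and PP size are assumptions, not taken from the original setup:

```shell
# Create sharevg on the shared SCSI disk WITHOUT the concurrent-capable
# flag; HACMP will handle the concurrent sharing itself.
mkvg -y sharevg -s 16 hdisk1

# For an SSA-backed VG, -c would mark it concurrent capable instead:
# mkvg -c -y sharevg -s 16 hdiskN
```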

3. The concurrent RG

With sharevg configured and both nodes synchronized, a concurrent-mode resource group was created in HACMP, containing only sharevg. No application server was configured at this point; the goal was to get the concurrent environment working first, and the application can be added once Oracle is installed.
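The resource group definition can be inspected afterwards with the utilities HACMP/ES installs under /usr/es/sbin/cluster/utilities (a sketch; check your installation for the exact set shipped with 4.4.1):

```shell
# List the defined resource groups, then show each group's resources
# (the concurrent RG should list sharevg as its concurrent VG).
/usr/es/sbin/cluster/utilities/cllsgrp
/usr/es/sbin/cluster/utilities/clshowres
```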

4. The key part

Now for the key part of the whole debugging process. After both sides synchronized cleanly, HA was started on both nodes. Since the adapters had been set to the service address from the start, there was no boot-to-service address change to observe. Checking with lsvg -o whether the shared VG had been varied on showed that it had not. The hacmp.out file contained the following error:

...

cl_raid_vg[97] cl_raid_vg[97] lsdev -Cc disk -l hdisk1 -F type

DEVTYPE=scsd

cl_raid_vg[103] grep -qw scsd /usr/es/sbin/cluster/diag/clconraid.dat

cl_raid_vg[106] THISTYPE=disk

cl_raid_vg[106] [[ -z ]]

cl_raid_vg[116] FIRSTTYPE=disk

cl_raid_vg[123] [[ disk = array ]]

cl_raid_vg[128] exit 1

cl_mode3[166] cl_log 485 cl_mode3: Failed concurrent varyon of sharevg\n

cl_log[50] version=1.9

cl_log[92] SYSLOG_FILE=/usr/es/adm/cluster.log

*******

Aug 1 2003 17:42:24 !!!!!!!!!! ERROR !!!!!!!!!!

*******

Aug 1 2003 17:42:24 cl_mode3: Failed concurrent varyon of sharevg because it is not made up of known RAID devices.

cl_mode3[168] STATUS=1

cl_mode3[217] exit 1

...

So the manual wasn't lying. But that is no reason to give up: HACMP is really implemented through scripts and events, so it was time to tamper with the scripts a little.

The HACMP installation keeps many of its runtime scripts under the .../utils directory; the one relevant to this problem is the cl_mode3 script. Its full text follows (posted here so everyone can have a look too):

#!/bin/ksh

# IBM_PROLOG_BEGIN_TAG

# This is an automatically generated prolog.

#

#

#

# Licensed Materials - Property of IBM

#

# (C) COPYRIGHT International Business Machines Corp. 1990,2001

# All Rights Reserved

#

# US Government Users Restricted Rights - Use, duplication or

# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

#

# IBM_PROLOG_END_TAG

# @(#)27 1.9 src/43haes/usr/sbin/cluster/events/utils/cl_mode3.sh, hacmp.events, 43haes_rmo2, rmo2s01b 5/31/01 16:36:46

###################

#

# COMPONENT_NAME: EVENTUTILS

#

# FUNCTIONS: none

#

###################

###################

#

# Name: cl_mode3

#

# Returns:

# 0 - All of the volume groups are successfully varied on/changed mode

# 1 - varyonvg/mode change of at least one volume group failed

# 2 - Zero arguments were passed

#

# This function will place the volume groups passed in as arguments in

# the designated mode .

#

# Arguments: -s Varyon volume group in mode 3 with sync

# -n Varyon volume group in mode 3 without sync

#

# Environment: VERBOSE_LOGGING, PATH

#

###################

PROGNAME=$(basename ${0})

export PATH="$($(dirname ${0})/../../utilities/cl_get_path all)"

[[ "$VERBOSE_LOGGING" = "high" ]] && set -x

[[ "$VERBOSE_LOGGING" = "high" ]] && version='1.9'

HA_DIR="$(cl_get_path)"

if (( $# < 2 )) ; then

# Caller used incorrect syntax

cl_echo 204 "usage: $PROGNAME [-n | -s] volume_groups_to_varyon" $PROGNAME

exit 2

fi

if [[ $1 = "-n" ]] ; then # sync or no sync

SYNCFLAG="-n"

else

SYNCFLAG="" # LVM default is "sync"

fi

if [[ -z ${EMULATE} ]] ; then

EMULATE="REAL"

fi

STATUS=0

set -u

# Get volume groups, past the sync|nosync flag

shift

for vg in $*

do

VGID=$(/usr/sbin/getlvodm -v $vg)

# Check to see if this volume group is already vary'd on

if lsvg -o | fgrep -s -x "$vg" ; then

# Note this and keep going. This could happen legitimately on a

# node up after a forced down.

# Find out if its vary'd on in concurrent mode

if [[ 0 = $(lqueryvg -g $VGID -C) ]] ; then

# No, its not. Now, find out if its defined as concurrent capable

if [[ 0 = $(lqueryvg -g $VGID -X) ]] ; then

# We get here in the case where the volume group is

# vary'd on, but not in concurrent mode, and is not

# concurrent capable. This would be the case for a SCSI

# RAID disk used in concurrent mode.

if ! cl_raid_vg $vg ; then

# This volume group is not made up of known RAID devices

cl_log 485 "$PROGNAME: Failed concurrent varyon of $vg\n\

because it is not made up of known RAID devices." $PROGNAME $vg

STATUS=1

fi

continue

else

# For some obscure reason, the volume group that

# we want to vary on in concurrent mode is

# already vary'd on, in non-concurrent mode.

cl_echo 200 "$PROGNAME: Volume Group "$vg" in non-concurrent mode." $PROGNAME $vg

# Try to recover by varying it off, to be vary'd on in

# concurrent mode below.

if [[ $EMULATE = 'REAL' ]] ; then

if ! varyoffvg $vg

then

# Unable to vary off the volume group - probably because

# its in use. Note error and keep going

cl_log 203 "$PROGNAME: Failed varyonvg $SYNCFLAG -c of $vg." $PROGNAME $SYNCFLAG $vg

STATUS=1

continue

fi

else

cl_echo 3020 "NOTICE The following command was not executed \n"

echo "varyoffvg $vg"

fi

# At this point, the volume group was vary'd off. The

# flow takes over below, and vary's on the volume group

# in concurrent mode.

fi

else

# Since the volume group is already vary'd on in

# concurrent mode, there is really nothing more to do

# with it. Go on to the next one.

continue

fi

fi

# Find out whether LVM thinks this volume group is concurrent

# capable. Note that since the volume group is not vary'd on at this

# point in time, we have to look directly at the VGDA on the

# hdisks in the volume group.

export MODE

for HDISK in $(/usr/sbin/getlvodm -w $VGID | cut -d' ' -f2) ; do

# Check each of the hdisks for a valid mode value. Stop at the

# first one we find.

if MODE=$(lqueryvg -p $HDISK -X) ; then

break

fi

done

if [[ -z $MODE ]] ; then

# If we couldn't pull a valid mode indicator off of any disk in

# the volume group, there is no chance whatsoever that LVM

# will be able to vary it on. Give up on this one.

cl_log 203 "$PROGNAME: Failed varyonvg $SYNCFLAG -c of $vg." $PROGNAME $SYNCFLAG $vg

STATUS=1

elif [[ $MODE = "0" ]] ; then

# LVM thinks that this is not a concurrent capable

# volume group. This is the expected result if this is

# a RAID device treated as a concurrent device

# Check to make sure that this is a known RAID device

if cl_raid_vg $vg ; then

# If this is a known RAID device, attempt to vary it on

# with no reserve, to simulate concurrent mode

if ! convaryonvg $vg ; then

# It was not possible to vary on this volume

# group. Note error and keep going.

STATUS=1

fi

else

# This volume group is not made up of known RAID devices

cl_log 485 "$PROGNAME: Failed concurrent varyon of $vg\n\

because it is not made up of known RAID devices." $PROGNAME $vg

STATUS=1

fi

elif [[ $MODE = "32" ]] ; then

# LVM thinks that this volume group is defined as concurrent

# capable, for the group services based concurrent mode

# try to varyon in concurrent with appropriate sync option

if [[ $EMULATE = "REAL" ]] ; then

if ! varyonvg $SYNCFLAG -c $vg ; then

cl_log 203 "$PROGNAME: Failed varyonvg $SYNCFLAG -c of $vg." $PROGNAME $SYNCFLAG $vg

# note error and keep going

STATUS=1

fi

else

cl_echo 3020 "NOTICE The following command was not executed \n"

echo "varyonvg $SYNCFLAG -c $vg"

fi

else

# Anything else ("1" or "16", depending on the level of LVM)

# indicates that LVM thinks this volume group is

# defined as concurrent capable, for the covert channel based

# concurrent mode.

if cl_raid_vg $vg ; then

# SCSI attached RAID devices are reported as concurrent capable.

# If that is what we have here, try the appropriate varyon

if ! convaryonvg $vg ; then

# It was not possible to vary on this volume

# group. Note error and keep going.

STATUS=1

fi

else

# Its not a concurrent capable RAID device. The only remaining

# supported choice is covert channel based concurrent mode.

if [[ $EMULATE = "REAL" ]] ; then

if ! varyonvg $SYNCFLAG -c $vg ; then

cl_log 203 "$PROGNAME: Failed varyonvg $SYNCFLAG -c of $vg." $PROGNAME $SYNCFLAG $vg

# note error and keep going

STATUS=1

fi

else

cl_echo 3020 "NOTICE The following command was not executed \n"

echo "varyonvg $SYNCFLAG -c $vg"

fi

fi

fi

done

exit $STATUS
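The heart of the script is the branch on the MODE value read from the VGDA. As a summary, the dispatch can be re-created like this (an illustration written for this article, not IBM code):

```shell
#!/bin/sh
# Illustrative summary of cl_mode3's MODE dispatch (not part of the
# original script): what each VGDA mode value leads to.
classify_mode() {
    case "$1" in
        "") echo "no mode readable from any disk: varyon will fail" ;;
        0)  echo "not concurrent capable: only a known RAID device can be varied on" ;;
        32) echo "group-services concurrent capable: varyonvg -c" ;;
        *)  echo "covert-channel concurrent capable (1 or 16): varyonvg -c" ;;
    esac
}

classify_mode 0
```

Our plain SCSI disk lands in the MODE=0 branch, where everything hinges on cl_raid_vg recognizing the disk as a known RAID device.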

Reading the script makes the problem clear: in this script, the shared disk is not accepted as a RAID disk, so the result is an exit code of 1. So let's make a simple modification at the end:

# add for 140 ha escrm

STATUS=0

exit $STATUS

The hope is that this fools HA. Note: the same script must be modified on both nodes.

5. Restarting HA

Very pleasingly, HA started successfully, and lsvg -l sharevg shows the same content on both nodes.
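The verification the author describes can be run on each node as a one-liner (a sketch using the same commands the article already relies on):

```shell
# On each node: confirm sharevg is varied on, then list its logical
# volumes -- the output should match across both nodes.
lsvg -o | fgrep -x sharevg && lsvg -l sharevg
```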

 
 
 