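I. Configuring the hardware cluster
1. Minimum hardware configuration
At least 400 MB of space in /tmp
At least 512 MB of physical memory
Swap space equal to 3 times physical memory (2 times is enough when physical memory exceeds 1 GB)
There is no need to be frugal with local disk space: the data files live on the storage array, so make the local disk partitions as large as you reasonably can.
Fibre Channel modules, a Fibre Channel switch, and fibre cables. Fibre is recommended between the hosts and the array. Over Category 6 gigabit Ethernet cabling the top speed I saw was only a little over 30 MB per second, while the array delivers reads at close to 100 MB per second, so a gigabit Ethernet link makes network transfer the bottleneck.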
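The memory, swap, and /tmp minimums above can be checked with a short script. This is only a sketch: the function names are made up for illustration, and the thresholds are simply the figures from the list converted to KB. Note that MemTotal in /proc/meminfo is usually slightly lower than the installed RAM, so a machine with exactly 512 MB may trigger the memory warning.

```shell
# Required swap in KB for a given amount of physical memory in KB:
# 3x RAM up to 1 GB, 2x RAM above 1 GB (the rule from the list above).
required_swap_kb() {
    mem_kb=$1
    if [ "$mem_kb" -gt 1048576 ]; then
        echo $((mem_kb * 2))
    else
        echo $((mem_kb * 3))
    fi
}

# Read a field such as MemTotal or SwapTotal (value in KB) from a
# meminfo-style file; defaults to /proc/meminfo.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print a warning for each minimum that the local machine does not meet.
check_min_hardware() {
    mem=$(meminfo_kb MemTotal)
    swap=$(meminfo_kb SwapTotal)
    need=$(required_swap_kb "$mem")
    tmp_free=$(df -Pk /tmp | awk 'NR == 2 { print $4 }')
    [ "$mem" -ge 524288 ]      || echo "WARN: less than 512 MB of RAM ($mem KB)"
    [ "$swap" -ge "$need" ]    || echo "WARN: swap is $swap KB, want $need KB"
    [ "$tmp_free" -ge 409600 ] || echo "WARN: /tmp has only $tmp_free KB free"
}
```

Source the file on each node and run check_min_hardware; no output means all three minimums are met.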
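2. Required software
I am using Red Hat 3.0 here. 2.1 also works, but 3.0 is recommended because its kernel is newer.
I do not know whether 9204 RAC can be installed on a 2.6 kernel; I will try that later.
Also check whether the rsh server package is installed:
rpm -q rsh-server
rsh-server-0.17-17
If it is not, install rsh; it is required for creating RAC.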
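3. Patches
Bring the operating system patches as up to date as possible, especially on version 2.1.
4. Install the storage array. Mine is a NetApp array: create a volume on it, then mount it over NFS on the Linux clients.
By the way, NetApp administration is very convenient. The array's IP address here is 10.0.29.152. You could also use an array from EMC or another vendor,
but that means building RAC on raw devices, which is outside the scope of this article.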
5. Configure the IP addresses on both nodes and edit /etc/hosts
10.0.29.150     wanghai1
192.168.0.150   wanghai1-eth1
10.0.29.152     FAS250
10.0.29.151     wanghai2
192.168.0.151   wanghai2-eth1
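Since both nodes need the same five entries, it can help to confirm that none is missing before going further. The helper below is a hypothetical sketch (missing_hosts is not a standard tool): it prints each required name that does not appear as a host name on a non-comment line of the given file.

```shell
# Print every required host name that is missing from a hosts-style file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    file=$1; shift
    for name in "$@"; do
        # Match the name as a whole field after the address field,
        # skipping lines that are commented out.
        awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$file" || echo "$name"
    done
}
```

For example, running missing_hosts /etc/hosts wanghai1 wanghai1-eth1 wanghai2 wanghai2-eth1 FAS250 on each node should print nothing.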
6. Tune the kernel network parameters
Because of the RAC Cache Fusion mechanism, the kernel network parameters must be adjusted.
Parameter                         Meaning                                                     Value
/proc/sys/net/core/rmem_default   The default setting in bytes of the socket receive buffer   262144
/proc/sys/net/core/rmem_max       The maximum socket receive buffer size in bytes             262144
/proc/sys/net/core/wmem_default   The default setting in bytes of the socket send buffer      262144
/proc/sys/net/core/wmem_max       The maximum socket send buffer size in bytes                262144
Set each parameter as root, for example:
echo 262144 > /proc/sys/net/core/rmem_default
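All four parameters take the same value, so they can be set in one loop. The function below is a sketch with a made-up name; the directory is passed as a parameter only so that it can be tried against a scratch directory before touching /proc.

```shell
# Write the same value to the four socket-buffer tunables and read each back.
# Usage: set_net_buffers [VALUE] [DIR]
set_net_buffers() {
    value=${1:-262144}
    dir=${2:-/proc/sys/net/core}
    for p in rmem_default rmem_max wmem_default wmem_max; do
        echo "$value" > "$dir/$p" || return 1
        # Confirm the value was accepted.
        [ "$(cat "$dir/$p")" = "$value" ] || return 1
    done
}
```

Run set_net_buffers as root on both nodes. Values written under /proc do not survive a reboot; a common way to make them permanent, not covered in the original steps, is to add the matching net.core.* keys to /etc/sysctl.conf.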
7. Configure /etc/fstab to mount the NFS filesystem
These are the NetApp NFS mount parameters:
10.0.29.152:/vol/vol1/fas250 /netapp nfs rw,hard,nointr,tcp,noac,vers=3,timeo=600,rsize=32768,wsize=32768
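The option list is long, and dropping one option by accident is easy when copying it to the second node. As a sketch, the hypothetical helper below reads the options field for a mount point from an fstab-style file and prints any required option that is absent.

```shell
# Report which of the required options are absent from an fstab entry.
# Usage: missing_mount_opts FSTAB MOUNTPOINT OPT...
missing_mount_opts() {
    fstab=$1; mnt=$2; shift 2
    # Field 2 is the mount point, field 4 the comma-separated options.
    opts=$(awk -v m="$mnt" '$1 !~ /^#/ && $2 == m { print $4 }' "$fstab")
    for want in "$@"; do
        case ",$opts," in
            *",$want,"*) ;;
            *) echo "$want" ;;
        esac
    done
}
```

For instance, missing_mount_opts /etc/fstab /netapp hard nointr tcp noac vers=3 prints nothing when the entry matches the line above. If the mount point is not in the file at all, every option is reported as missing.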
8. Configure the rsh, rlogin, and rcp services. Use /usr/sbin/ntsysv to select rsh, rlogin, and rcp,
then run /sbin/chkconfig --list | grep on to see whether rsh and the other services are enabled. If they are not running, start them with /sbin/service xinetd start.
Edit /home/oracle/.rhosts:
wanghai1 oracle
wanghai2 oracle
wanghai1-eth1 oracle
wanghai2-eth1 oracle
Then test rsh:
[oracle@wanghai2 oracle]$ rsh wanghai1 pwd
/home/oracle
[oracle@wanghai1 oracle]$ rsh wanghai2 pwd
/home/oracle
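The .rhosts file and the rsh test follow the same pattern for every host name, so both can be scripted. This is a sketch with hypothetical function names; it assumes the rsh client is on the PATH and only reports whether the connection succeeded.

```shell
# Emit one "host user" line per host, the format used in .rhosts.
# Usage: make_rhosts USER HOST...
make_rhosts() {
    user=$1; shift
    for host in "$@"; do
        printf '%s %s\n' "$host" "$user"
    done
}

# Run "pwd" on every host through rsh and report the hosts that fail.
# Usage: check_rsh HOST...
check_rsh() {
    for host in "$@"; do
        rsh "$host" pwd </dev/null >/dev/null 2>&1 || echo "rsh to $host failed"
    done
}
```

As the oracle user, make_rhosts oracle wanghai1 wanghai2 wanghai1-eth1 wanghai2-eth1 > /home/oracle/.rhosts writes the file, and check_rsh with the same four names should print nothing from either node. Restricting the file with chmod 600 is an addition to the original steps; rshd commonly ignores a .rhosts file that other users can write to.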
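9. Check that the nfs and nfslock services are running. Without nfslock, starting the instance fails with an error saying the control file cannot be locked.
Also, if the iptables service is running, turn it off: the firewall causes trouble for rsh. Alternatively, configure iptables to let rsh through.
Create the NFS mount point: mkdir /netapp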
10. Create the shared quorum files on NFS; they record which of the two nodes are active
touch /netapp/SharedConfigFile
touch /netapp/CmDiskFile
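11. Check whether the hangcheck_timer module is loaded. Kernels from 2.4.20 onward should include hangcheck; for a 2.4.9 kernel, a patch can be downloaded from MetaLink. Use lsmod to see whether the module is loaded, and load it with insmod if it is not listed.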
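The lsmod check can be scripted by reading /proc/modules, which holds the same list. The function below is a sketch with a made-up name. It treats hyphens and underscores as equivalent because the module may be listed as either hangcheck-timer or hangcheck_timer depending on the kernel; that equivalence is an assumption made here, not something stated in the original steps.

```shell
# Succeed if the named module appears in a /proc/modules-style listing.
# Usage: module_loaded NAME [FILE]
module_loaded() {
    # Normalise hyphens to underscores on both sides before comparing.
    want=$(echo "$1" | sed 's/-/_/g')
    awk -v w="$want" '
        { n = $1; gsub(/-/, "_", n); if (n == w) found = 1 }
        END { exit !found }
    ' "${2:-/proc/modules}"
}
```

On each node, module_loaded hangcheck-timer || /sbin/insmod hangcheck-timer (run as root) loads the module only when it is missing.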
II. Installing OCM
1. Create the oinstall group and the oracle user, create the Oracle home directory, and create the profile file
Creating Oracle User Accounts
su - root
groupadd oinstall
# group owner of Oracle files
useradd -c "Oracle software owner" -g oinstall oracle
passwd oracle
Creating Oracle Directories
In this example, make sure that the /opt filesystem is large enough, see Oracle Disk Space
for more information. If /opt is not on a separate filesystem, then make sure the root
filesystem "/" has enough space.
su - root
mkdir /opt/oracle
mkdir /opt/oracle/product
mkdir /opt/oracle/product/9.2
chown -R oracle.oinstall /opt/oracle
mkdir /var/opt/oracle
chown oracle.oinstall /var/opt/oracle
chmod 755 /var/opt/oracle
Setting Oracle Environments
Set the following Oracle environment variables before you start runInstaller.
As the oracle user execute the following commands:
# Set the LD_ASSUME_KERNEL environment variable only for Red Hat 9 and
# for Red Hat Enterprise Linux Advanced Server 3 (RHEL AS 3) !!
# Use the "Linuxthreads with floating stacks" implementation instead of NPTL:
export LD_ASSUME_KERNEL=2.4.1
# Oracle Environment
export ORACLE_BASE=/opt/oracle
export ORACLE_HOME=/opt/oracle/product/9.2
export ORACLE_SID=test1
export ORACLE_TERM=xterm
# export TNS_ADMIN= Set if sqlnet.ora, tnsnames.ora, etc. are not in $ORACLE_HOME/network/admin
export NLS_LANG=AMERICAN;
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
# Set shell search paths
export PATH=$PATH:$ORACLE_HOME/bin
I successfully installed Oracle9iR2 without setting the following CLASSPATH environment
variable:
# CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
# CLASSPATH=$CLASSPATH:$ORACLE_HOME/network/jlib
# export CLASSPATH
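Before starting runInstaller it is worth confirming that the variables above are actually set in the current shell, since a profile that was edited but not re-read is a common slip. A minimal sketch, using a hypothetical helper name:

```shell
# Print the name of each listed variable that is unset or empty;
# return non-zero if any is missing.
# Usage: require_vars NAME...
require_vars() {
    rc=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name.
        eval "val=\${$name:-}"
        if [ -z "$val" ]; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}
```

As the oracle user, require_vars ORACLE_BASE ORACLE_HOME ORACLE_SID LD_LIBRARY_PATH LD_ASSUME_KERNEL should print nothing once the profile has been sourced.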
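2. Run runInstaller and choose to install 9201. Deselect all components and install only the Java environment and Oracle Universal Installer.
Exit, run runInstaller again, and choose to install OCM. Exit once more, run runInstaller again, select the patch set, and upgrade OCM to 9204.
(Exiting and restarting runInstaller each time is meant to keep Oracle Universal Installer from failing.)
3. Edit $ORACLE_HOME/oracm/admin/cmcfg.ora and comment out every line that contains watchdog, because 9204 RAC
uses hangcheck to monitor the nodes instead. Add the line KernelModuleName=hangcheck-timer and set MissCount=210.
cmcfg.ora on node 1: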
HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=1000
MissCount=210
PrivateNodeNames=wanghai1-eth1 wanghai2-eth1
PublicNodeNames=wanghai1 wanghai2
ServicePort=9998
#WatchdogSafetyMargin=5000
#WatchdogTimerMargin=60000
CmDiskFile=/netapp/CmDiskFile
HostName=wanghai1-eth1
KernelModuleName=hangcheck-timer
cmcfg.ora on node 2:
HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=1000
MissCount=210
PrivateNodeNames=wanghai1-eth1 wanghai2-eth1
PublicNodeNames=wanghai1 wanghai2
ServicePort=9998
#WatchdogSafetyMargin=5000
#WatchdogTimerMargin=60000
CmDiskFile=/netapp/CmDiskFile
HostName=wanghai2-eth1
KernelModuleName=hangcheck-timer
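The two files differ only in the HostName line, so one way to keep them in sync is to generate both from a single template. The function below is a sketch (gen_cmcfg is a made-up name); every value in it is copied from the listings above.

```shell
# Print a cmcfg.ora for one node; only HostName changes between nodes.
# Usage: gen_cmcfg PRIVATE_HOSTNAME
gen_cmcfg() {
    cat <<EOF
HeartBeat=15000
ClusterName=Oracle Cluster Manager, version 9i
PollInterval=1000
MissCount=210
PrivateNodeNames=wanghai1-eth1 wanghai2-eth1
PublicNodeNames=wanghai1 wanghai2
ServicePort=9998
#WatchdogSafetyMargin=5000
#WatchdogTimerMargin=60000
CmDiskFile=/netapp/CmDiskFile
HostName=$1
KernelModuleName=hangcheck-timer
EOF
}
```

On node 1, gen_cmcfg wanghai1-eth1 > $ORACLE_HOME/oracm/admin/cmcfg.ora; on node 2, pass wanghai2-eth1 instead.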
Comment out the line containing watchdogd in $ORACLE_HOME/oracm/admin/ocmargs.ora:
more $ORACLE_HOME/oracm/admin/ocmargs.ora
# Sample configuration file $ORACLE_HOME/oracm/admin/ocmargs.ora
#watchdogd
oracm
norestart 1800
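Commenting out the watchdog lines in cmcfg.ora and ocmargs.ora can be done with sed instead of by hand. This is a sketch under the assumption that every active line mentioning watchdog should be disabled; the function name is hypothetical, and it writes the result to standard output so the original file stays untouched until the output has been reviewed.

```shell
# Prefix "#" to every line that mentions watchdog and is not already a comment.
# Usage: comment_out_watchdog FILE
comment_out_watchdog() {
    # [^#]* cannot cross a "#", so lines already commented are left alone.
    sed '/^[^#]*[Ww]atchdog/s/^/#/' "$1"
}
```

For example, comment_out_watchdog ocmargs.ora > ocmargs.ora.new, then compare the two files and replace the original. Running it twice gives the same result as running it once.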
Comment out the following lines in $ORACLE_HOME/oracm/bin/ocmstart.sh:
# watchdogd's default log file
# WATCHDOGD_LOG_FILE=$ORACLE_HOME/oracm/log/wdd.log
# watchdogd's default backup file
# WATCHDOGD_BAK_FILE=$ORACLE_HOME/oracm/log/wdd.log.bak
# Get arguments
# watchdogd_args=`grep '^watchdogd' $OCMARGS_FILE |\
# sed -e 's+^watchdogd *++'`
# Check watchdogd's existence
# if watchdogd status | grep 'Watchdog daemon active' >/dev/null
# then
# echo 'ocmstart.sh: Error: watchdogd is already running'
# exit 1
# fi
# Backup the old watchdogd log
# if test -r