Oracle RAC 12CR2添加新节点

修改记录

列出最近扩展节点遇到的一些问题。见文章最后。

1、扩展完节点之后,实例启动不成功,报ORA-01618: redo thread 3 is not enabled - cannot mount

2、扩展完节点之后,实例启动不成功,报ORA-30013: undo tablespace 'UNDOTBS1' is currently in use

网段声明

在之前安装的rac基础上添加新节点,启动node01、node02为原rac,node03为新添加节点

Name node01 node02 node03
Public IP 192.168.1.14 192.168.1.15 192.168.1.16
Private IP 192.168.6.2 192.168.6.3 192.168.6.4
Virtual IP 192.168.1.17 192.168.1.18 192.168.1.22
SCAN IP 192.168.1.19 192.168.1.120 192.168.1.21
NET IP 192.168.2.95 192.168.2.96 192.168.2.55

准备

参考hwcloud上的Oracle RAC 12cR2安装手册(1)—环境准备做如下准备:

  • IP分配
  • 安全组
  • 修改主机名
  • 配置host解析
  • 关闭libvirt
  • 禁用ZEROCONF
  • 安装依赖
  • 配置swap
  • 配置内核
  • 关闭NTP
  • 配置PAM
  • 修改limit文件
  • 关闭SELinux
  • 创建用户,用户组和文件夹
  • 配置grid和oracle用户的环境变量
  • 配置ssh互信
  • 挂载共享存储到node03上,安装asmlib,使用oracleasm scandisks扫描磁盘组

安装Grid

环境检查

node01上,grid用户,执行如下命令进行node03的环境检查

1
cluvfy stage -pre nodeadd -n node03 -fixup -verbose

执行结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
CVU operation performed: stage -pre nodeadd
Date: Dec 1, 2017 3:40:21 PM
CVU home: /u01/app/12.2.0/grid/
User: grid
******************************************************************************************
Following is the list of fixable prerequisites selected to fix in this session
******************************************************************************************
-------------- --------------- ----------------
Check failed. Failed on nodes Reboot required?
-------------- --------------- ----------------
Package: cvuqdisk-1.0.10-1 node03 no
Execute "/tmp/CVU_12.2.0.1.0_grid/runfixup.sh" as root user on nodes "node03" to perform the fix up operations manually
Press ENTER key to continue after execution of "/tmp/CVU_12.2.0.1.0_grid/runfixup.sh" has completed on nodes "node03"

根据提示,在node03上用root用户执行修复脚本,结果如下:

1
2
3
4
5
6
7
8
Press ENTER key to continue after execution of "/tmp/CVU_12.2.0.1.0_grid/runfixup.sh" has completed on nodes "node03"
Fix: Package: cvuqdisk-1.0.10-1
Node Name Status
------------------------------------ ------------------------
node03 successful
Result:
"Package: cvuqdisk-1.0.10-1" was successfully fixed on all the applicable nodes
[grid@node01 ~]$

临时禁用dns解析

由于grid安装时会校验dns中是否能检查到节点信息,图形化安装时可以手动忽略,这里静默添加,不能忽略,会报错退出,如下:

1
2
3
4
5
6
7
INFO: [Dec 1, 2017 4:46:51 PM] resolv.conf Integrity: resolv.conf Integrity
INFO: [Dec 1, 2017 4:46:51 PM] Severity:CRITICAL
INFO: [Dec 1, 2017 4:46:51 PM] OverallStatus:VERIFICATION_FAILED
INFO: [Dec 1, 2017 4:46:51 PM] *********************************************
INFO: [Dec 1, 2017 4:46:51 PM] (Linux) resolv.conf Integrity: This task checks consistency of file /etc/resolv.conf file across nodes
INFO: [Dec 1, 2017 4:46:51 PM] Severity:CRITICAL
INFO: [Dec 1, 2017 4:46:51 PM] OverallStatus:VERIFICATION_FAILED

所以临时禁用,如下:

node01, node02, node03节点,root用户

1
mv /etc/resolv.conf /etc/resolv.conf.bak

安装

node01上,grid用户,执行如下命令:

1
2
cd $ORACLE_HOME/addnode
./addnode.sh -silent "CLUSTER_NEW_NODES={node03}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={node03-vip.myCluster.com}" "CLUSTER_NEW_PUBLIC_HOSTNAMES={node03.myCluster.com}" "CLUSTER_NEW_NODE_ROLES={HUB}"

注:

以上参数都参考图形化安装时的配置

结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
As a root user, execute the following script(s):
1. /u01/app/oraInventory/orainstRoot.sh
2. /u01/app/12.2.0/grid/root.sh
Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
[node03]
Execute /u01/app/12.2.0/grid/root.sh on the following nodes:
[node03]
The scripts can be executed in parallel on all the nodes.
.................................................. 100% Done.
Successfully Setup Software.

根据提示,用root用户在node03执行/u01/app/oraInventory/orainstRoot.sh/u01/app/12.2.0/grid/root.sh即可

安装oracle软件

node01上,oracle用户,执行如下命令:

1
2
cd $ORACLE_HOME/addnode
./addnode.sh -silent "CLUSTER_NEW_NODES={node03}"

结果如下:

1
2
3
4
5
6
7
8
9
As a root user, execute the following script(s):
1. /u01/app/oracle/product/12.2.0/dbhome_1/root.sh
Execute /u01/app/oracle/product/12.2.0/dbhome_1/root.sh on the following nodes:
[node03]
.................................................. 100% Done.
Successfully Setup Software.

根据提示,使用root用户,在node03上执行/u01/app/oracle/product/12.2.0/dbhome_1/root.sh即可

安装instance

node01上,oracle用户,执行如下命令:

1
dbca -silent -addInstance -nodeName node03 -gdbName orcl.myCluster.com -instanceName orcl3 -sysDBAUserName sys -sysDBAPassword Hand1234

结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
[oracle@node01 ~]$ dbca -silent -addInstance -nodeName node03 -gdbName orcl.myCluster.com -instanceName orcl3 -sysDBAUserName sys -sysDBAPassword Hand1234
Adding instance
1% complete
2% complete
6% complete
13% complete
20% complete
26% complete
33% complete
40% complete
46% complete
53% complete
66% complete
Completing instance management.
76% complete
100% complete
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/orcl/orcl.log" for further details.

注:

  • 安装过程中,出现找不到磁盘组的情况
  • 后把u01文件夹按之前的授权重新授权,重启服务器出现原来的RAC无法启动的问题,报错为CRS-0184 Cannot Communicate With The Crs Daemon in Oracle
  • 通过在三个节点(其实只需在无法启动的节点)执行安装grid时和oracle生成的root.sh文件之后,能够正常启动数据库
  • 安装第三个节点的实例能找到ASM磁盘

验证

集群层

  • crsctl check crs
  • crsctl check cluster -all
  • crsctl status res -t

应用层

  • srvctl status nodeapps
  • srvctl status asm
  • srvctl status listener
  • srvctl status instance -d orcl -i orcl1,orcl2,orcl3

最后

记得恢复/etc/resolv.conf !!!

问题总结

ORA-01618: redo thread 3 is not enabled - cannot mount

在可以使用的节点上检查redo日志组
1
2
3
4
5
6
7
8
SQL> set lines 200
SQL> select * from v$Log;
GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARCHIVED STATUS FIRST_CHANGE# FIRST_TIME NEXT_CHANGE# NEXT_TIME CON_ID
---------- ---------- ---------- ---------- ---------- ---------- --------- ------------------------------------------------ ------------- --------------- ------------ --------------- ----------
1 1 4429 209715200 512 1 NO CURRENT 3711801002 21-MAR-18 1.8447E+19 0
2 1 4428 209715200 512 1 NO ACTIVE 3711526576 21-MAR-18 3711801002 21-MAR-18 0
3 2 4623 209715200 512 1 NO CURRENT 3711753735 21-MAR-18 1.8447E+19 0
4 2 4622 209715200 512 1 NO INACTIVE 3711406786 21-MAR-18 3711753735 21-MAR-18 0
在可以使用的节点上添加thread 3日志
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SQL> alter database add logfile thread 3('+DATA','+DATA') size 200m;
Database altered.
SQL> alter database add logfile thread 3('+DATA','+DATA') size 200m;
Database altered.
SQL> select * from v$Log;
GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARCHIVED STATUS FIRST_CHANGE# FIRST_TIME NEXT_CHANGE# NEXT_TIME CON_ID
---------- ---------- ---------- ---------- ---------- ---------- --------- ------------------------------------------------ ------------- --------------- ------------ --------------- ----------
1 1 4429 209715200 512 1 NO CURRENT 3711801002 21-MAR-18 1.8447E+19 0
2 1 4428 209715200 512 1 NO INACTIVE 3711526576 21-MAR-18 3711801002 21-MAR-18 0
3 2 4623 209715200 512 1 NO CURRENT 3711753735 21-MAR-18 1.8447E+19 0
4 2 4622 209715200 512 1 NO INACTIVE 3711406786 21-MAR-18 3711753735 21-MAR-18 0
5 3 0 209715200 512 2 YES UNUSED 0 0 0
6 3 0 209715200 512 2 YES UNUSED 0 0 0
6 rows selected.
SQL> alter database enable thread 3;
Database altered.

ORA-30013: undo tablespace ‘UNDOTBS1’ is currently in use

在可用节点查看现有undo表空间
1
2
3
4
5
6
select tablespace_name,
file_name,
AUTOEXTENSIBLE,
bytes / 1024 / 1024 size_mb
from dba_data_files
where tablespace_name like '%UNDO%';
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
TABLESPACE_NAME
------------------------------------------------------------------------------------------
FILE_NAME
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AUTOEXTEN SIZE_MB
--------- ----------
UNDOTBS1
+DATA/SRMCLOUD/DATAFILE/undotbs1.278.963232783
YES 3805
UNDOTBS2
+DATA/SRMCLOUD/DATAFILE/undotbs2.273.963232811
YES 3075
TABLESPACE_NAME
------------------------------------------------------------------------------------------
FILE_NAME
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
AUTOEXTEN SIZE_MB
--------- ----------
UNDOTBS3
+DATA/SRMCLOUD/DATAFILE/undotbs3.579.971361293
YES 3805

发现是有undotbs3的表空间,因此只需要修改第三节点的undo即可

在可用节点修改第三节点的undo表空间
1
2
3
SQL> alter system set undo_tablespace='UNDOTBS3' sid='srmcloud3';
System altered.
SQL>