EMCC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.18.10)(PORT = 1521))
(LOAD_BALANCE = yes)
(FAILOVER = ON)
(CONNECT_DATA =
(SERVICE_NAME = EMCC)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 20))
)
)
EMCCSTBY =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.18.17)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME= EMCCSTBY)
)
)
下面两个TNS EMCCFAN EMCCFO是可行配置,如果切换后需要shutdown 主库 或者利用最下面得trigger
EMCCFAN=
(DESCRIPTION = (LOAD_BALANCE = no)(FAILOVER = yes)(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.18.10)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = EMCC))(ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.18.17)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = EMCCSTBY))
)
#正常RAC 各实例随机连接,不会STBY,switchover 后需要关闭EMCC,否则mount下也是连EMCC
#PLSQL DEV下自动切换select 无报错,直接SQLPLUS不行,报错ORA-03114
EMCCFO=
(DESCRIPTION_LIST=
(FAILOVER=true)
(LOAD_BALANCE=false)
(DESCRIPTION=
(ADDRESS= (PROTOCOL=TCP) (HOST= .10) (PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=EMCC))
)
(DESCRIPTION=
(ADDRESS= (PROTOCOL=TCP) (HOST= .17) (PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=EMCCSTBY))
)
)
#oci 下配???
EMCCOCI=
(DESCRIPTION_LIST=
(LOAD_BALANCE=off) (FAILOVER=on)
(DESCRIPTION= (CONNECT_TIMEOUT=15)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
(ADDRESS_LIST= (LOAD_BALANCE=on)
(ADDRESS= (PROTOCOL=TCP) (HOST= .10) (PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=EMCC)))
(DESCRIPTION= (CONNECT_TIMEOUT=15)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
(ADDRESS_LIST= (LOAD_BALANCE=on)
(ADDRESS= (PROTOCOL=TCP) (HOST= .17) (PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=EMCCSTBY))))
#正常情况RAC得EMCC和 EMCCSTBY 随机连接,不适合
EMCCTAF =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = .10)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = EMCC)
)
)
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = .17)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = EMCCSTBY)
)
)
)
The following two cases of ASAP HA were tested on production, successfully -
Case 1: Node1 - Instance shutdown
ASAP was using Node 1 to connect to DB. On Node 1 instance shutdown, ASAP successfully switched to Node 2 instance. Though ASAP switch-over was not immediate (nearly a minute) but it was successful.
Case 2: Node 2- OS shutdown (Hardware failure case. Also, similar to bringing listener down)
ASAP was using Node 2 to connect to DB. On Node 2 OS shutdown, ASAP successfully switched to Node 1 instance. Though ASAP switch-over was not immediate (nearly a minute) but it was successful.
PS: ASAP is using two VIPs for DB connection -
DB_CONNECT=(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xx.xx.xx.xx)(PORT =
String thinConn = "jdbc:oracle:thin:@( DESCRIPTION = ( ADDRESS_LIST = (ADDRESS = (PROTOCOL=TCP)(HOST=host1)(PORT=1521) )
(ADDRESS = (PROTOCOL=TCP)(HOST=host2)(PORT=1521) )
(LOAD_BALANCE=true)(FAILOVER=true)
)
( ADDRESS_LIST = (ADDRESS = (PROTOCOL=TCP)(HOST=host3)(PORT=1521) )
(ADDRESS = (PROTOCOL=TCP)(HOST=host4)(PORT=1521) )
(LOAD_BALANCE=true)(FAILOVER=true)
)
(CONNECT_DATA=(SERVICE_NAME=orcl)(SERVER=DEDICATED)))\n";
If the database is configured for failover using DataGuard, the failover entries in the tnsnames.ora may look like following:
(DESCRIPTION=
(ADDRESS_LIST=
(ADDRESS=(PROTOCOL=TCP)(HOST=host.domain)(PORT=XXXX))
(ADDRESS=(PROTOCOL=TCP)(HOST=backuphost.domain))(PORT=XXXX)))
(load_balance=no)(failover=on)(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=my_db.domain)))
Notes:
The question is if this DataGuard feature may be used for the FMW Metadata Repository schemas and if so, how?
Concern here is that the ASAP switch-over was not immediate (nearly a minute) but it was successful.
Can we reduce this failover time ?
From the diags, it can be seen that below parameters were set in ASAP.cfg.
DB_RETRY_INTERVAL to 1 sec
DB_RETRY_NUMBER to 5 times
USE_OCI_EVENTS to 1 (enabled)
These are the parameters that a user can use from ASAP application end, and on which the user has control over. Since DB_RETRY_INTERVAL is already set to the minimum value and also USE_OCI_EVENTS is enabled, this makes the failover automatically engaged by the OCI in the presence of a failure (as a result it improves the response time of the application in case of a failure.). There are no any other parameters in the ASAP application which can help further in the RAC DB Failover case.
However, if ASAP processes are taking time to restart after failover, the user may try setting RESTART_DELAY (The time delay, in seconds, that the Control server waits between attempts to restart an ASAP application, Default = 60). Other related parameter is RESTART_ATTEMPTS (The number of times the Control server attempts to restart an ASAP server, Default = 5).
-------------------------------------------------------------------------------------------------------------------
A JDBC/thin application connecting to a Data Guard configuration uses the following connect string:
jdbc:oracle:thin:@(DESCRIPTION = (LOAD_BALANCE = no)(FAILOVER = yes)(ADDRESS = (PROTOCOL = TCP)(HOST = host1)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = serv1))(ADDRESS = (PROTOCOL = TCP)(HOST = host2)(PORT = 1521))(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = serv2)))
In the initial configuration, host1 and serv1 represent the primary host and database service and host2 and serv2 represent the standby host and database service.
At this stage the JDBC connection works fine and connects to the primary database.
After the switchover is completed the 2 databases are now reversed. Host2 and serv2 are the primary and host1 and serv1 are the standby.
The JDBC/thin application can no longer connect due to error: ORA-01033: ORACLE initialization or shutdown in progress
It should be noted that the SQL*Plus connection still works:
sqlplus system/
SQL*Plus: Release 12.1.0.2.0 Production on Wed Mar 16 09:23:34 2016
Copyright (c) 1982, 2014, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
It should be noted that in this configuration the two databases are using different services: serv1 and serv2. The trigger-based solution presented in this document would not work for a database configuration using a single service name.
The issue is triggered by the Data Guard switchover. During this process, the first database, which was the primary and the second database which was the standby will swap roles.
After the switchover the second database is now the primary and is in OPEN READ WRITE mode. The first database is now the standby and is in MOUNT mode.
Following the Data Guard switchover, both the primary and standby database services are started.
The JDBC driver will keep trying to connect to the first database on the connect string list and hit the ORA-1033 for the standby database with is in MOUNT mode.
This is JDBC/thin specific and not reproducible with SQL*Plus apparently because the Oracle Net is configured with TAF which will ensure that only the primary database is used.
Only run the database service for the primary database. The service for the standby database should be stopped.
This configuration can be automated using AFTER STARTUP ON DATABASE triggers.
这样不行啊,远程readonly 什么得都连接不上了,dataguard redo transfer 会不会出问题? 最好switch over期间 还是shutdown吧
for example:
The database on host1 could be configured with the following trigger:
CREATE TRIGGER CHECK_SERV1_START AFTER STARTUP ON DATABASE
DECLARE
V_ROLE VARCHAR(30);
BEGIN
SELECT DATABASE_ROLE INTO V_ROLE FROM V$DATABASE;
IF V_ROLE = 'PRIMARY' THEN
DBMS_SERVICE.START_SERVICE('SERV1);
ELSE
DBMS_SERVICE.STOP_SERVICE(SERV1');
END IF;
END;
/
The database on host2 could be configured with the following trigger:
CREATE TRIGGER CHECK_SERV2_START AFTER STARTUP ON DATABASE
DECLARE
V_ROLE VARCHAR(30);
BEGIN
SELECT DATABASE_ROLE INTO V_ROLE FROM V$DATABASE;
IF V_ROLE = 'PRIMARY' THEN
DBMS_SERVICE.START_SERVICE('SERV2);
ELSE
DBMS_SERVICE.STOP_SERVICE(SERV2');
END IF;
END;
/
-------------------------------------------------------------------------------------------------------------------
How to configure automatic fail-over for JDBC client?
Configuring Automatic Failover for JDBC Clients
===
Prerequisites:
===
» The Universal Connection Pool (UCP) is enabled (UCP 12.1.0.2 or later).
» The application uses service names to connect to the database.
» Oracle Notification Service (ONS) is configured and available on the node where JDBC is running.
» The Java Virtual Machine (JVM) in which your JDBC instance is running must have oracle.ons.oraclehome set to
point to your ORACLE_HOME.
Configuration:
===
1. Enable Fast Connection Failover (FCF) and configure the JDBC application to connect to all ONS daemons for
both the primary and standby clusters using the setONSConfiguration property. The setONSConfiguration
property should point to all primary and standby ONS daemons. For example:
pds.setONSConfiguration("nodes=
2. By default the JDBC application will randomly pick three hosts from the setONSConfiguration property and create
connections to those three ONS daemons. This default must be changed so that connections are made to all
ONS daemons. This is done by setting the following property when the JDBC application is invoked to the total
number of ONS daemons in the configuration. For example (this will show that the number 4 below is just an example):
java -Doracle.ons.maxconnections=4
3. The JDBC client must set the oracle.net.ns.SQLnetDef.TCP_CONNTIMEOUT_STR property. This property
enables the JDBC client to quickly traverse an ADDRESS_LIST in the event of a failure. For example, if the client
attempts to connect to a host that is unavailable, the connection attempt will be bounded to the time specified by
the SQLnetDef.TCP_CONNTIMEOUT_STR property after which the client attempts to connect to the next host in
the ADDRESS_LIST. The behavior continues for each host in the ADDRESS_LIST until a connection is made.
Setting the property to a value of 3 seconds will suffice in most environments. It is important to note that the
SQLnetDef.TCP_CONNTIMEOUT_STR property should be set on the data source and not on the Universal
Connection Pool.
Properties prop = new Properties();
prop.put(oracle.net.ns.SQLnetDef.TCP_CONNTIMEOUT_STR, ""+3000); // 3000ms
pds.setConnectionProperties(prop);
4. Set the following to get the correct behavior from SCAN load balancing:
// need to set oracle.jdbc.thinForceDNSLoadBalancing
prop.put("oracle.jdbc.thinForceDNSLoadBalancing","true");
5. Configure JDBC clients to use a connect descriptor that includes an address list that in turn includes the SCAN
address for each site and connects to an existing service. Do not configure both TAF and JDBC FCF when using
JDBC thick clients.
The following URL searches both primary and standby sites looking for the appropriate service with very little
overhead. This URL configuration is the recommended approach:
PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
pds.setUser("
pds.setPassword("
String dbURL =
"jdbc:oracle:thin:@" +
"(DESCRIPTION=" +
"(FAILOVER=on)" +
"(ADDRESS_LIST=" +
"(LOAD_BALANCE=on)" +
"(CONNECT_TIMEOUT=3)(RETRY_COUNT=3)" +
"(ADDRESS=(PROTOCOL=TCP)(HOST=
"(ADDRESS=(PROTOCOL=TCP)(HOST=
"(CONNECT_DATA=(SERVICE_NAME=
System.out.println("Url=" + dbURL);
pds.setURL(dbURL);
The following URL should be used if it is very rare for the primary to ever run on the secondary site and you wish
to have connections connect as fast as possible:PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
pds.setUser("
pds.setPassword("
String dbURL =
"jdbc:oracle:thin:@" +
"(DESCRIPTION_LIST=" +
“(LOAD_BALANCE=off)” +
"(FAILOVER=on)" +
"(DESCRIPTION=" +
"(CONNECT_TIMEOUT=3)(RETRY_COUNT=3)" +
"(ADDRESS_LIST=" +
"(LOAD_BALANCE=on)" +
"(ADDRESS=(PROTOCOL=TCP)(HOST=
"(CONNECT_DATA=(SERVICE_NAME=
"(DESCRIPTION=" +
"(ADDRESS_LIST=" +
"(LOAD_BALANCE=on)" +
"(ADDRESS=(PROTOCOL=TCP)(HOST=
"(CONNECT_DATA=(SERVICE_NAME=
System.out.println("Url=" + dbURL);
pds.setURL(dbURL);
Note that if a switchover or failover occurs the above URL will force all connections to go through the old
before using the
When a new connection is made using the above URL’s the following logic is used:
» Oracle Net contacts DNS and resolves
» Oracle Net randomly picks one of the three IP address and attempts to make a connection. All three IP addresses will be tried a total of four times (initial attempt plus RETRY_COUNT in the above example).
» If the connection to primary site is unsuccessful, it then contacts DNS and resolves
» The same sequence is performed for the standby
The following list provides additional information about the Oracle Net parameters used in the above alias:
» LOAD_BALANCE is ON by default for DESCRIPTION_LIST only. This parameter by default is OFF for an address list within a DESCRIPTION. Setting this ON for a SCAN-based address implies that new connections will be randomly assigned to one of the 3 SCAN-based IP addresses resolved by DNS.
» The default value for the FAILOVER parameter is ON for an address list within a DESCRIPTION. This impacts the 3 SCAN IP addresses the same way as if those 3 IP addresses were listed explicitly in the connect descriptor.
This means that if the initial connection requests to the first randomly-assigned SCAN IP address fails, the connection will failover to another SCAN IP address, and will continue to do so, till it iterates the complete address list. Note that this parameter is relevant only to new connections. Failover of existing connections is handled by TAF, which is controlled by the separate FAILOVER_MODE parameter.
» The CONNECT_TIMEOUT parameter is the time to connect to the database instance providing the requested service, and includes the time to establish a TCP connection to the listener. The timeout interval is applicable for each ADDRESS in an ADDRESS_LIST, and each IP address to which a host name is mapped. Set the CONNECT_TIMEOUT parameter to the maximum amount of time (in seconds) to wait for a response from an address before skipping to the next address. A setting of three seconds is recommended and is acceptable in
most cases. Do not set this parameter to fewer than three seconds.
The equivalent global parameter in sqlnet.ora is SQLNET.OUTBOUND_CONNECT_TIMEOUT. If the same timeout value is sufficient for all connect strings, it would be simpler to set the global parameter. Otherwise, a separate setting can be done for each connect string.
» The RETRY_COUNT parameter specifies the number of times an address list is traversed before the new connection attempt is terminated. The default value is 0. With respect to SCAN, with FAILOVER = on, setting this RETRY_COUNT parameter to a value of 2, for example, means the three SCAN IP addresses are traversed thrice (i.e. 3*3=9 connect attempts), before the connection is terminated
» When the connection request initially comes in, the first randomly assigned IP address tries to service that request, followed by the two remaining IP addresses. (This behavior is controlled by the FAILOVER parameter.)
» The retries then kick in and the list of three IP addresses is tried two more times. RETRY_COUNT is only supported at DESCRIPTION level in connect string, but not at global (i.e. sqlnet.ora) level.
Information taken from the next document:
o https://www.oracle.com/technetwork/database/availability/client-failover-2280805.pdf
Client Failover Best Practices
for Highly Available Oracle Databases