Fast-Start Failover in Oracle 11g
Fast-Start Failover (FSFO) is a feature of the Oracle Data Guard broker that automatically fails over to a pre-selected standby database when the primary database crashes.
FSFO relies on an observer, a component that runs on a machine separate from the primary and standby hosts. The observer consumes very few resources and needs only Oracle client or database software installed at its site, plus TNS connectivity to the primary and the standby database. As the name suggests, the observer monitors the availability of the primary database and initiates a fast-start failover when both the observer and the target standby database lose connectivity with the primary.
Before we move on to configuring FSFO, there are some FSFO properties that we need to know about:
FastStartFailoverLagLimit: FSFO can be used only in MaxAvailability or MaxPerformance mode. In MaxPerformance mode, this property specifies, in seconds, how far the target standby may lag behind the primary for a fast-start failover to still be allowed, which bounds the permissible data loss. The default value is 30 seconds. In MaxAvailability mode the property is not used, because FSFO ensures that there is no data loss.
FastStartFailoverThreshold: The number of seconds that the observer and the target standby database wait, after losing contact with the primary, before initiating the failover. The default value is 30 seconds.
FastStartFailoverPmyShutdown: If set to TRUE and the primary database is stalled for more than FastStartFailoverThreshold seconds, the primary database shuts itself down.
FastStartFailoverTarget: The db_unique_name of the standby database that is the target of the fast-start failover.
FastStartFailoverAutoReinstate: If set to TRUE, the former primary database is automatically reinstated as a standby after the fast-start failover.
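To make the relationship between the protection mode and the lag limit concrete, here is a minimal Python sketch. It is only a toy model, not broker code: the class `FsfoProperties` and the helper `lag_permits_failover` are hypothetical names introduced for illustration, and treating "synchronized" as a lag of exactly zero seconds is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class FsfoProperties:
    """Illustrative container for the broker properties described above."""
    target: str                  # FastStartFailoverTarget (db_unique_name)
    threshold_s: int = 30        # FastStartFailoverThreshold, default per the text
    lag_limit_s: int = 30        # FastStartFailoverLagLimit, default per the text
    pmy_shutdown: bool = True    # FastStartFailoverPmyShutdown
    auto_reinstate: bool = True  # FastStartFailoverAutoReinstate


def lag_permits_failover(mode: str, standby_lag_s: float, props: FsfoProperties) -> bool:
    """Hypothetical check: does the standby's lag allow a fast-start failover?"""
    if mode == "MaxAvailability":
        # The lag limit is not used in this mode; the toy model requires a
        # fully caught-up standby, which is what guarantees zero data loss.
        return standby_lag_s == 0
    if mode == "MaxPerformance":
        # Failover is allowed only while the lag stays within the limit.
        return standby_lag_s <= props.lag_limit_s
    raise ValueError("FSFO supports only MaxAvailability or MaxPerformance")


props = FsfoProperties(target="srpstb")
print(lag_permits_failover("MaxPerformance", 12, props))
print(lag_permits_failover("MaxPerformance", 45, props))
```

With the default lag limit of 30 seconds, a standby that is 12 seconds behind would still qualify in MaxPerformance mode, while one that is 45 seconds behind would not.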
Fast-start failover also imposes a few restrictions while it is enabled:
1. You cannot change the protection mode of the broker configuration, nor the redo transport mode (LogXptMode) of the primary and the target standby database.
2. You cannot perform a switchover or a manual failover to a standby database that is not the fast-start failover target.
3. The broker configuration cannot be removed while FSFO is enabled, nor can the target standby database be removed from it.
4. FSFO is not triggered if the primary database was shut down without the ABORT option, or if the observer is not running.
Now let’s move on to the steps involved in enabling FSFO. This post assumes that a broker configuration already exists; its creation is not outlined here.
The steps required to configure the Data Guard broker are described here: https://shivanandarao-oracle.com/2013/07/10/data-guard-broker-configuration/
The configuration used here is named “dgtest” and runs in MaxAvailability mode.
Environment:
Primary Site:
DB Name : srprim
DB Unique Name : srprim
Connect Identifier : srprim
Hostname : ora1-1
Standby Site:
DB Name : srprim
DB Unique Name : srpstb
Connect Identifier : srpstb
Hostname : ora1-2
Observer Site:
Hostname : ora1-3
The configuration is enabled and its status is as below.
DGMGRL> show configuration;
Configuration - dgtest
  Protection Mode: MaxAvailability
  Databases:
    srprim - Primary database
    srpstb - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
Make sure that both the primary and the standby database have Flashback Database enabled and a fast recovery area (FRA) configured; these are prerequisites for FSFO.
Now let’s try to enable FSFO.
DGMGRL> ENABLE FAST_START FAILOVER;
Error: ORA-16651: requirements not met for enabling fast-start failover
Failed.
This fails, of course, because some of the prerequisite properties mentioned earlier still need to be set.
Set FastStartFailoverTarget to the target database (srpstb) for the primary database (srprim).
DGMGRL> edit database srprim set property 'FastStartFailoverTarget'='srpstb';
Property "FastStartFailoverTarget" updated
DGMGRL>
Similarly, set FastStartFailoverTarget to srprim for the database srpstb. This setting takes effect after a role transition, when srpstb is acting as the primary database and srprim as the standby.
DGMGRL> edit database srpstb set property 'FastStartFailoverTarget'='srprim';
Property "FastStartFailoverTarget" updated
DGMGRL>
Verify the property set above.
DGMGRL> show database srprim 'FastStartFailoverTarget';
  FastStartFailoverTarget = 'srpstb'
Now set the FastStartFailoverThreshold property to 60 seconds. This is how long the observer waits, after losing contact with the primary, before initiating the failover.
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = 60;
Property "faststartfailoverthreshold" updated
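When the same three property changes have to be repeated for several configurations, it can help to generate the DGMGRL statements rather than type them. The following is a minimal sketch under that assumption; `fsfo_setup_commands` is a hypothetical helper, and it only emits the statements already used above with the names substituted.

```python
from typing import List


def fsfo_setup_commands(primary: str, standby: str, threshold_s: int) -> List[str]:
    """Build the DGMGRL statements used in this post for a primary/standby pair."""
    if threshold_s <= 0:
        raise ValueError("threshold must be a positive number of seconds")
    return [
        # Each database names the other as its fast-start failover target,
        # so the setup remains valid after a role transition.
        f"edit database {primary} set property 'FastStartFailoverTarget'='{standby}';",
        f"edit database {standby} set property 'FastStartFailoverTarget'='{primary}';",
        f"edit configuration set property FastStartFailoverThreshold = {threshold_s};",
    ]


for statement in fsfo_setup_commands("srprim", "srpstb", 60):
    print(statement)
```

The printed lines can be pasted into a DGMGRL session connected to the broker configuration.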
The observer runs continuously and does not return the DGMGRL prompt until it is stopped from another DGMGRL session.
For this reason, it is preferable to run the observer in the background using “nohup”.
To run the observer in the background, connect to the broker configuration from the observer site and pass the “start observer” command to DGMGRL in a single nohup invocation.
[oracle@ora1-3 ~]$ nohup dgmgrl sys/oracle@srprim "start observer" &
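If the observer is started from a script rather than an interactive shell, the same effect as nohup with a trailing ampersand can be sketched in Python. This is an illustration only: `observer_argv` and `launch_detached` are hypothetical helpers, the log file path is an assumption, and the example reuses the connect string from this post. Keep in mind that a password passed on the command line is visible in process listings.

```python
import subprocess
from typing import List


def observer_argv(connect_string: str) -> List[str]:
    """Same invocation as above: dgmgrl <connect string> "start observer"."""
    return ["dgmgrl", connect_string, "start observer"]


def launch_detached(argv: List[str], log_path: str) -> subprocess.Popen:
    """Start argv in its own session, appending all output to log_path."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,   # the observer never reads from the terminal
            stdout=log,
            stderr=subprocess.STDOUT,   # keep errors in the same log file
            start_new_session=True,     # detach from the terminal, like nohup
        )
    finally:
        # The child holds its own copy of the file descriptor.
        log.close()
```

A call such as `launch_detached(observer_argv("sys/oracle@srprim"), "observer.log")` on the observer host would start the observer and return immediately, leaving its output in observer.log.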
To run the observer in the foreground instead, connect to the broker configuration from the observer site and run the “start observer” command at the DGMGRL prompt.
[oracle@ora1-3 ~]$ dgmgrl sys/oracle@srprim
DGMGRL for Linux: Version 11.2.0.2.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> start observer;
Observer started
Once the observer is running, FSFO can be enabled. To do so, connect to the broker configuration and execute the “enable fast_start failover” command.
DGMGRL> enable fast_start failover;
Enabled.
DGMGRL>
DGMGRL> show configuration;
Configuration - dgtest
  Protection Mode: MaxAvailability
  Databases:
    srprim - Primary database
    srpstb - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
The output above shows that FSFO is enabled; the (*) marks srpstb as the fast-start failover target.
Let’s get the details of the properties set for FSFO.
DGMGRL> show fast_start failover;
Fast-Start Failover: ENABLED
  Threshold: 60 seconds
  Target: srpstb
  Observer: ora1-3.mydomain
  Lag Limit: 30 seconds (not in use)
  Shutdown Primary: TRUE
  Auto-reinstate: TRUE
Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile YES
    Corrupted Dictionary YES
    Inaccessible Logfile NO
    Stuck Archiver NO
    Datafile Offline YES
  Oracle Error Conditions:
    (none)
The output shows that FastStartFailoverThreshold is set to 60 seconds, FastStartFailoverTarget is srpstb, and the observer is running on host “ora1-3”. FastStartFailoverLagLimit has its default value of 30 seconds but is not in use, because the protection mode is MaxAvailability. Most importantly, FastStartFailoverPmyShutdown and FastStartFailoverAutoReinstate are both set to TRUE.
FSFO can also be triggered by certain additional, configurable conditions. I have not specified any, so the values shown above are the defaults.
By default, FSFO occurs if the controlfile is corrupted, if the dictionary is corrupted, or if a datafile goes offline due to a write error. In addition, you can enable other conditions such as Stuck Archiver (the archiver process is unable to archive redo because of a write error or a full disk) and Inaccessible Logfile (LGWR is unable to write to the redo logs because of a write error).
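For monitoring scripts it can be handy to turn the “show fast_start failover” output into a data structure. The sketch below is one way to do that, assuming the layout shown in this post; `parse_fsfo` is a hypothetical helper and `SAMPLE` is simply the output above pasted into a string, so other DGMGRL versions may need adjustments.

```python
from typing import Dict, Tuple

SAMPLE = """\
Fast-Start Failover: ENABLED
  Threshold: 60 seconds
  Target: srpstb
  Observer: ora1-3.mydomain
  Lag Limit: 30 seconds (not in use)
  Shutdown Primary: TRUE
  Auto-reinstate: TRUE
Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile YES
    Corrupted Dictionary YES
    Inaccessible Logfile NO
    Stuck Archiver NO
    Datafile Offline YES
  Oracle Error Conditions:
    (none)
"""


def parse_fsfo(text: str) -> Tuple[Dict[str, str], Dict[str, bool]]:
    """Split the output into 'key: value' settings and YES/NO health conditions."""
    settings: Dict[str, str] = {}
    health: Dict[str, bool] = {}
    in_health = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "Health Conditions:":
            in_health = True
            continue
        if line == "Oracle Error Conditions:":
            in_health = False
            continue
        if in_health:
            # The last word is the flag, everything before it is the name.
            name, _, flag = line.rpartition(" ")
            if flag in ("YES", "NO"):
                health[name.strip()] = flag == "YES"
            continue
        key, sep, value = line.partition(":")
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings, health


settings, health = parse_fsfo(SAMPLE)
print(settings["Threshold"], settings["Target"])
print(sorted(name for name, enabled in health.items() if enabled))
```

The second print lists the health conditions that are currently enabled, which for the defaults are the three conditions marked YES above.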
Query the primary database to check the FSFO and the observer status.
SQL> select FS_FAILOVER_STATUS,FS_FAILOVER_OBSERVER_PRESENT from v$database;
FS_FAILOVER_STATUS     FS_FAIL
---------------------- -------
SYNCHRONIZED           YES
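The two columns queried above can be combined into a simple readiness check. The following sketch assumes that the values have already been fetched from v$database by some other means; `fsfo_ready` is a hypothetical helper, and the choice of which status values count as "ready" (SYNCHRONIZED for MaxAvailability, TARGET UNDER LAG LIMIT for MaxPerformance) is my own assumption for illustration.

```python
# Status values treated as "a failover could happen right now" in this sketch.
READY_STATES = {"SYNCHRONIZED", "TARGET UNDER LAG LIMIT"}


def fsfo_ready(fs_failover_status: str, observer_present: str) -> bool:
    """True if the status is a ready state and the observer is connected."""
    status_ok = fs_failover_status.strip().upper() in READY_STATES
    observer_ok = observer_present.strip().upper() == "YES"
    return status_ok and observer_ok


print(fsfo_ready("SYNCHRONIZED", "YES"))
```

For the query result shown above (SYNCHRONIZED, YES) the check passes; if the observer were stopped, or the standby fell out of sync, it would not.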
To simulate a fast-start failover, I crash the primary database by shutting it down with the “abort” option so that it loses connectivity with the observer and the standby database.
SQL> shut abort
ORACLE instance shut down.
Checking the status of the configuration now, Oracle reports that the primary database is unavailable and that it cannot determine its status.
DGMGRL> show configuration;
Configuration - dgtest
  Protection Mode: MaxAvailability
  Databases:
    srprim - Primary database
    srpstb - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
ORA-01034: ORACLE not available
ORA-16625: cannot reach database "srprim"
DGM-17017: unable to determine configuration status
The observer does not trigger the failover immediately, because it has to wait for FastStartFailoverThreshold seconds (60 in this configuration).
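The waiting behaviour can be pictured with a small simulation. This is a toy model only, not the observer's actual algorithm (which is internal to Oracle and also involves the target standby): `failover_time` is a hypothetical function that takes a series of (time, reachable) probe results and reports when the primary has been continuously unreachable for the threshold.

```python
from typing import Iterable, Optional, Tuple


def failover_time(probes: Iterable[Tuple[float, bool]], threshold_s: float) -> Optional[float]:
    """Return the first probe time at which the outage has lasted threshold_s seconds."""
    down_since: Optional[float] = None
    for t, reachable in probes:
        if reachable:
            # Any successful contact resets the countdown.
            down_since = None
        elif down_since is None:
            down_since = t
        if down_since is not None and t - down_since >= threshold_s:
            return t
    return None


# Probes every 10 seconds; the primary stops answering at t=30.
probes = [(t, t < 30) for t in range(0, 110, 10)]
print(failover_time(probes, threshold_s=60))
```

In this model a brief outage that ends before the threshold elapses never leads to a failover, which is the purpose of FastStartFailoverThreshold.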
After waiting a few seconds, check the configuration status again. The failover is in progress, as seen below.
DGMGRL> show configuration;
Configuration - dgtest
  Protection Mode: MaxAvailability
  Databases:
    srpstb - Primary database
    srprim - (*) Physical standby database (disabled)
Fast-Start Failover: ENABLED
Configuration Status:
ORA-16610: command "FAILOVER TO srpstb" in progress
DGM-17017: unable to determine configuration status
The output below shows that the fast-start failover has occurred and that “srpstb” is now the primary database.
DGMGRL> show configuration;
Configuration - dgtest
  Protection Mode: MaxAvailability
  Databases:
    srpstb - Primary database
      Warning: ORA-16817: unsynchronized fast-start failover configuration
    srprim - (*) Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated
Fast-Start Failover: ENABLED
Configuration Status:
WARNING
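A script that watches the configuration after a failover mainly needs to know which ORA- codes appear in the output. The sketch below shows one way to pick them out; `ora_codes` and `needs_reinstate` are hypothetical helpers, and `AFTER_FAILOVER` is the relevant part of the output above pasted into a string.

```python
import re
from typing import List

AFTER_FAILOVER = """\
srpstb - Primary database
  Warning: ORA-16817: unsynchronized fast-start failover configuration
srprim - (*) Physical standby database (disabled)
  ORA-16661: the standby database needs to be reinstated
Fast-Start Failover: ENABLED
Configuration Status:
WARNING
"""


def ora_codes(text: str) -> List[str]:
    """Return every ORA-nnnnn code in the text, in order of appearance."""
    return re.findall(r"ORA-\d{5}", text)


def needs_reinstate(text: str) -> bool:
    """ORA-16661 is the message shown above for a standby awaiting reinstatement."""
    return "ORA-16661" in ora_codes(text)


print(ora_codes(AFTER_FAILOVER))
print(needs_reinstate(AFTER_FAILOVER))
```

Matching on the error code rather than on the message text keeps the check independent of the exact wording of the message.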
Reinstating Former Primary database as New Standby database:
Now that the failover has occurred, I’d like to reinstate the former primary database “srprim”. Since FastStartFailoverAutoReinstate was set to TRUE, the observer will reinstate the former primary database automatically once it is back up.
The steps below show how to reinstate a former primary database as a new standby database manually, as would be needed if FastStartFailoverAutoReinstate were set to FALSE.
Connect to the former primary database “srprim” and mount it.
[oracle@ora1-1 ~]$ sqlplus sys/oracle@srprim as sysdba
SQL*Plus: Release 11.2.0.2.0 Production on Thu Oct 15 19:30:46 2015
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.
Total System Global Area 943669248 bytes
Fixed Size 2232128 bytes
Variable Size 641728704 bytes
Database Buffers 293601280 bytes
Redo Buffers 6107136 bytes
Database mounted.
Now connect to the broker configuration through the new primary database and execute “reinstate database '<former primary database>'”.
In my case it is: “reinstate database 'srprim';”
[oracle@ora1-2 ~]$ dgmgrl sys/oracle@srpstb
DGMGRL for Linux: Version 11.2.0.2.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> reinstate database 'srprim';
Reinstating database "srprim", please wait...
Reinstatement of database "srprim" succeeded
DGMGRL>
Let’s check the configuration.
DGMGRL> show configuration;
Configuration - dgtest
  Protection Mode: MaxAvailability
  Databases:
    srpstb - Primary database
    srprim - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
DGMGRL>
Good to see that srprim has been converted to the new standby database.
COPYRIGHT
© Shivananda Rao P, 2012 to 2018. Unauthorized use and/or duplication of this material without express and written permission from this blog’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Shivananda Rao and http://www.shivanandarao-oracle.com with appropriate and specific direction to the original content.