Thursday, January 15, 2015

PowerShell - Automatic Reboot AlwaysOn Server if AlwaysOn Availability group is ready to failover

Recently, we have encountered Secondary replica was in Resolving State and took 3 minutes to take over primary role when we forced reboot primary server from Task scheduler.

I have read the following troubleshooting page from MS http://support.microsoft.com/kb/2833707. There were 3 cases that can cause secondary replica in RESOLVING state. In Case 3, 

"... In order to automatically fail over, all availability databases that are defined in the availability group must be in a SYNCHRONIZED state between the primary replica and the secondary replica. When an automatic failover occurs, this synchronization condition must be met in order to make sure that there is no data loss. Therefore, if one availability database in the availability group in the synchronizing or not synchronized state, automatic failover will not successfully transition the secondary replica into the primary role...."

As we decided we will continue to schedule to reboot the servers on monthly basis scheduled reboot, we will need to make sure AG group is in Synchronized or fail over ready state.

To do so, I created a powershell script based on the following query. The query will check if AG group is ready to fail over.

Select database_name, is_failover_ready from sys.dm_hadr_database_replica_cluster_states where replica_id in (select replica_id from sys.dm_hadr_availability_replica_states)

In my script, I verified Secondary SQL server status in case they are paused/stopped. 
Enjoy script....

## The following PS script check the other replica is ready for failover. If it is ready, the current replica will be rebooted.
## The script can be used for rebooting primary replica or secondary replica automaically. It can be scheduled in Task Scheduler.

##Other Replica Node

$OtherReplica ='Server2'

#Local Host/Server which is going to reboot
$CurrentReplica='Server1'
#$env:COMPUTERNAME

#First, Check Other Replica is online by Pinging the server wait for 4 replies.
if(Test-Connection -ComputerName $OtherReplica -count 4 -ea 0 -Quiet)
 {
       
       # If the other server replies ping, check the status of SQL server
             
       Write 'The Other Server is alive'

       # Get SQL server status of the other server
       $SQLserverState = Get-WmiObject win32_service -ComputerName $OtherReplica -Filter "Name = 'MSSQLSERVER'"|Select State
                   
         If($SQLserverState.State -eq 'Stopped')
         {
                   Write 'Do Not Reboot' -ea Stop
         }
         
         If($SQLserverState.State -eq 'Paused')
          {
                     Write 'Do Not Reboot' -ea Stop
          }

        #If SQL server status of the other server is running, get the failover status from database
        If($SQLserverState.State -eq 'Running')
         {
                           
                    
                   $DBconn = new-object System.Data.SqlClient.SqlConnection("server=$OtherReplica;Trusted_Connection=true");
                   $DBQry  = 'Select min(cast(is_failover_ready as int)) from sys.dm_hadr_database_replica_cluster_states where replica_id in (select replica_id from sys.dm_hadr_availability_replica_states)'
                   $DBconn.Open()
                   $DBcmd = new-object System.Data.SqlClient.SqlCommand ($DBQry, $DBconn);
                   $Reader = $DBcmd.ExecuteReader()
                   while($Reader.Read()){
                   $FailOverStatus = $Reader.GetValue($1)           
                   }
                   
                   # If $FailOverStatus is 1, it is ready to failover. If $FailOverSatus is 0, it is not ready to failover
                
                   If($FailOverStatus -eq 1)
                   {

                    Write 'Server is Ready to Reboot...Rebooting...'
                    Restart-Computer -ComputerName $CurrentReplica -Force
                   }
                     If($FailOverStatus -eq 0)
                   {

                    Write 'Do Not Reboot' -ea Stop
                   }

                   $DBconn.Close()
           }

}
  else
  {
       Write 'Server is offline'
  }


2 comments:

  1. Many thanks for the post. It's very helpful. I've changed few things for myself:
    1. Invoke-Sqlcmd -ServerInstance $OtherReplica -Database master -Query $DBQry saves few lines of code instead of $DBcmd = new-object System.Data.SqlClient.SqlCommand ($DBQry, $DBconn);
    2. (Get-Service -Name "MSSQLSERVER" -ComputerName $OtherReplica).Status instead of Get-WmiObject win32_service -ComputerName $OtherReplica -Filter "Name = 'MSSQLSERVER'"|Select State

    ReplyDelete

How to add a Database to AlwaysOn Availability Group with four different options

To add a database to an existing AlwaysOn availability group, MS has given us four options to choose from Automatic seeding Full database an...