Fixing SAP Connectivity issue after AOAG Automatic failover
We have Alwayson Availability Group configuration for our SAP systems. We faced SAP connectivity issue when application team tried to do a failover/HA testing and connect to database post failover, they were failed to connect to the current primary replica which was secondary replica before failover.
Automatic Failover Worked Well but Application failed to connect
I have an AOAG configuration with Automatic failover between two replicas in my environment. Last week, we had High Availability testing for all applications. Everything was working fine if we were doing failover from Management Studio. Databases were failing over and failing back successfully.
Issue occurred when application team tried to connect to database server after failover. They were failed to connect to the secondary replica after every failover but applications were working fine from primary replica. Below are the error details that are coming to application team during establishing database connection post failover to current primary replica (earlier secondary replica).
[Microsoft][ODBC Driver 11 for SQL Server][SQL Server]Invalid object name ‘SVERS’.
When we saw at the database level, everything was working fine. Secondary replica was successfully transitioning as primary replica after failover and databases were accessible from failover node but application was not able to connect to this replica. We checked all respective logins associated with the application and everything was there on secondary replica.
One thing we have noticed during troubleshooting that SQL login was became orphaned after each failover and we did resolved orphaned user issue manually post each failover. But there is no benefit of automatic failover if you need to establish database connection manually after each failover.
After spending some time, we fixed this issue. Application was not able to connect to database post failover because one of the SQL login which was used by application has different SID on secondary replica than its primary replica. I recreated this login on secondary replica with same SID which was in primary replica and this has fixed the issue. Here, I will explain step by step method to make you understand the issue and its solution.
Resync the SID from Primary Replica
First check the current configuration of this AOAG. Run below command to get the AOAG details. We can see the configuration is running into automatic failover mode along with replica names.
SELECT replica_server_name, availability_mode_desc, failover_mode_desc
If you do failover between replicas then it will not throw any error in SSMS or database level. But if you will ask application team to connect to database server using Listener post failover, you will get error.
You will face error because SID of the login used by application is different on both replica servers. You can also get the error when the login used by application to establish database connection is not present on secondary replica. You need to sync the login, password and SID of this login to the secondary replica. Follow this tip to fix the issue that are coming due to any of the given reason.
The SID identifies the security context of the login and is unique within the server instance. If the login is created from a Windows user or group, it is given the Windows SID of the source principle; the Windows SID is unique within the domain. If the login is created as a legacy-style SQL Server login that requires a password, the server will generate a SID. Now check the SID of the login which is used to connect to database on primary replica.
--Replace your identified login with xyz. select name,sid from syslogins where name='xyz'
Now run same command on secondary replica. You can see SID is different on both replicas for same login. Ideally everything should be same and in sync with primary replica that are associated with AOAG.
Now we need to delete this login from secondary replica and recreate it using same SID which is present on primary replica. Next, drop the login either using SSMS or T-SQL. SSMS way is given below:
- Connect to target SQL Server Instance.
- Expand the security folder.
- Expand the Logins folder.
- Right click and choose delete on the identified login which needs to be deleted.
- Click on Ok button of the login deletion window.
If you want to drop this login using T-SQL command then open a query window and execute below command.
--change the name of xyz with your login name which you want to delete. DROP Login 'xyz'
You might get multiple dependency errors during dropping this login because there is possibility, this login might own few objects, agent jobs or databases on this instance. I got Error 15170 during dropping this login which i fixed in attached tip to remove this login. You can also read below articles if you are getting different kind of DROP Login failed issues:
- How to fix Error 15174: Login owns one or more databases.
- How to fix Error 15173: Revoke the permissions before dropping the login.
- How to fix error 15138 & Error 3729: DROP login failed
- How to fix error 15141: The server principal owns one or more endpoint(s) and cannot be dropped
- How to fix error 15434: Could not drop login as the user is currently logged in
Once login will be dropped, go ahead and create same login with same password and SID which are used on primary replica. Copy the SID from image 2 when you check the SID for this login on primary replica and use that during CREATE Login statement. If you don’t have this login on secondary replica yet then also you can follow same process to create the login.
--copy the SID from primary replica. CREATE Login xyz WITH password = 'password@123', CHECK_Policy=Off, SID = 0xC2Cabcdefghijk123hghj6C3
We can see command executed successfully. Now issue will be fixed if you don’t had this login on secondary replica and you just created it for very first time. But you might get orphaned user issue if you had this login earlier with different SID on this instance. Now we need to fix the orphaned users on the secondary replica.
Fix Orphaned Users
Now you need to run below command to check the orphaned users on your database. Make sure that your database is working as primary otherwise you would not be able to run this command. You can failover and then run this command if database will become primary.
--change the dbname USE DBNAME Go sp_change_users_login 'report'
If you will get your user name which was mapped to the identified login then we need to fix this orphaned user. Run below command to fix this issue.
--change the dbname --Map database user xyz to login xyz. USE DBName; GO EXEC sp_change_users_login 'Update_One', 'xyz', 'xyz'; GO
Now orphaned user issue is also resolved and we have fixed the application connectivity issue that was reported after automatic failover of AOAG. You can do failover testing again to make sure application connectivity is working fine after AOAG failover.
- How to Fix SQL Error 1005: A Comprehensive Guide - April 9, 2023
- How to Fix SQL Server Error 207 – Invalid Column Name - April 9, 2023
- How to Fix SQL Error 1045: Resolving Access Denied Issues - April 8, 2023