Configure multiple-subnet Always On Availability Groups and failover cluster instances by modifying CIB


In the Windows world, a Windows Server Failover Cluster (WSFC) natively supports multiple subnets and handles multiple IP addresses via an OR dependency on the IP address. On Linux, there is no OR dependency, but there is a way to achieve a proper multi-subnet natively with Pacemaker, as shown by the following. You cannot do this by simply using the normal Pacemaker command line to modify a resource. You need to modify the cluster information base (CIB). The CIB is an XML file with the Pacemaker configuration.

Reference:https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-configure-multiple-subnet?view=sql-server-ver15

Here is an example to create a SQL Server Linux Availability group in 4 nodes in 3 subnets in RHEL 7.6

 

If you are already familiar with the AG Group setup process, please just jump to step 16.


1.Register your subscription on for all servers (red1,red2,red3 and red4 in this case)

subscription-manager register

 

2.List all available subscription, pick the one with High Availabiilty , notedown the pool id

subscription-manager list --available --all

 

3.Register the subscription for all nodes (red1,red2,red3 and red4 in this case)

sudo subscription-manager attach  --pool=xxxxx

 

4.Enable the repository(red1,red2,red3 and red4 in this case)

sudo subscription-manager repos --enable=rhel-ha-for-rhel-7-server-rpms

 

5.Install Pacemaker packages on all nodes. (red1,red2,red3 and red4 in this case)

sudo yum install pacemaker pcs fence-agents-all resource-agents

 

6.Install SQL Server resource agent (red1,red2,red3 and red4 in this case)

sudo yum install mssql-server-ha

 

7. Set the password for the default user that is created when installing Pacemaker and Corosync packages. All the password should be exactly same (red1,red2,red3 and red4 in this case)

sudo passwd hacluster

8.Update /etc/hosts file in all servers, add IP and node name. All the servers should have the same entries. 

9. Run following commands to Enable and start pcsd service and Pacemaker in all nodes. (red1,red2 and red3 and red4  in this case)

sudo systemctl enable pcsd

sudo systemctl start pcsd

sudo systemctl enable pacemaker

 You also need to expose all related ports before running step 10.

10.Run following commands to Create Cluster in primary replica node (red1 in this case)

sudo pcs cluster auth red1 red2 red3 red4 -u hacluster -p YouPasswordUsedinStep7

sudo pcs cluster setup --name sqlcluster1  red1 red2  red3 red4

sudo pcs cluster start --all

sudo pcs cluster enable --all

 

11. Run following command to Enable cluster feature in all nodes(red1,red2 , red3 and red4 in this case)

sudo /opt/mssql/bin/mssql-conf set hadr.hadrenabled  1

sudo systemctl restart mssql-server

Create AG and Listener

1.Run following queries in red1 to create certificate

use master

go

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**';

go

CREATE CERTIFICATE dbm_certificate WITH SUBJECT = 'dbm';

go

BACKUP CERTIFICATE dbm_certificate TO FILE = '/var/opt/mssql/data/dbm_certificate.cer'

WITH PRIVATE KEY (

FILE = '/var/opt/mssql/data/dbm_certificate.pvk',

ENCRYPTION BY PASSWORD = '**<Private_Key_Password>**'

);

 

2.Run following commands in red1 to copy the certificate to rest of the servers(red2,red3 and red4 in this case)

cd /var/opt/mssql/data

scp dbm_certificate.* root@red2:/var/opt/mssql/data/

scp dbm_certificate.* root@red3:/var/opt/mssql/data/     

scp dbm_certificate.* root@red4:/var/opt/mssql/data/     

3.Give permission to the mssql user to access the certificate files in rest of the servers(red2,red3 and red4 in this case) 

cd /var/opt/mssql/data

chown mssql:mssql dbm_certificate.* 

 

4.Run following T-SQL queries to create the certificate in rest of the nodes by restoring the certificate backup file (red2,red3 and red4 in this case)

use master

go

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**<Master_Key_Password>**'

go

CREATE CERTIFICATE dbm_certificate

   FROM FILE = '/var/opt/mssql/data/dbm_certificate.cer' 

   WITH PRIVATE KEY (

   FILE = '/var/opt/mssql/data/dbm_certificate.pvk',

   DECRYPTION BY PASSWORD = '**<Private_Key_Password>**'

)

 

5.Create endpoint in all servers (red1,red2,red3 and red4)

 

CREATE ENDPOINT [Hadr_endpoint]

    AS TCP (LISTENER_PORT = 5022)

    FOR DATABASE_MIRRORING (

    ROLE = ALL,

    AUTHENTICATION = CERTIFICATE dbm_certificate,

    ENCRYPTION = REQUIRED ALGORITHM AES

);

ALTER ENDPOINT [Hadr_endpoint] STATE = STARTED;

 

6.Run following query in primary replica (red1) to create Availability group(Please note, it works for SQL 2019. If you are using SQL 2017, you need to change AVAILABILITY_MODE of one the replica to ASYNCHRONOUS_COMMIT)

CREATE AVAILABILITY GROUP [ag1]

    WITH (DB_FAILOVER = ON, CLUSTER_TYPE = EXTERNAL)

    FOR REPLICA ON

    N'red1'

        WITH (

        ENDPOINT_URL = N'tcp://red1:5022',

        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,

        FAILOVER_MODE = EXTERNAL,

        SEEDING_MODE = AUTOMATIC) ,

    N'red2'

        WITH (

        ENDPOINT_URL = N'tcp://red2:5022',

        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,

        FAILOVER_MODE = EXTERNAL,

        SEEDING_MODE = AUTOMATIC),

    N'red3'

        WITH (

        ENDPOINT_URL = N'tcp://red3:5022',

        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,

        FAILOVER_MODE = EXTERNAL,

        SEEDING_MODE = AUTOMATIC),

        N'red4'

        WITH (

        ENDPOINT_URL = N'tcp://red4:5022',

        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,

        FAILOVER_MODE = EXTERNAL,

        SEEDING_MODE = AUTOMATIC)

ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE;--grant create any database permission

 

7. Join the AG group, run the following T-SQL queries in all the secondary servers (red2,red3 and red4 in this case)

ALTER AVAILABILITY GROUP [ag1] JOIN WITH (CLUSTER_TYPE = EXTERNAL);

ALTER AVAILABILITY GROUP [ag1] GRANT CREATE ANY DATABASE

 

8.Run following T-SQL Queries to create database and add it to AG group in primary replica (red1 in this case).

CREATE DATABASE [db1];

ALTER DATABASE [db1] SET RECOVERY FULL;

BACKUP DATABASE [db1] TO DISK = N'/var/opt/mssql/data/db1.bak';

BACKUP log [db1] TO DISK = N'/var/opt/mssql/data/db1.trn';

GO

ALTER AVAILABILITY GROUP [ag1] ADD DATABASE [db1];

 

9.Create SQL login pacemaker in all servers (red1,red2,red3 and red4 in this case).

CREATE LOGIN [pacemakerLogin] with PASSWORD= N'ComplexP@$$w0rd!'

GO

ALTER SERVER ROLE [sysadmin] ADD MEMBER [pacemakerLogin]

 

10.Run following bash command in red1

sudo pcs property set stonith-enabled=false

 

11. In all SQL Server Linux servers , run following bash commands to save the credentials for the SQL Server login.(red1,red2,red3 and red4) (The password is as same as the one used in step 9)

echo 'pacemakerLogin' >> ~/pacemaker-passwd

echo 'ComplexP@$$w0rd!' >> ~/pacemaker-passwd

sudo mv ~/pacemaker-passwd /var/opt/mssql/secrets/passwd

sudo chown root:root /var/opt/mssql/secrets/passwd

sudo chmod 400 /var/opt/mssql/secrets/passwd # Only readable by root

 

12.Create availability group resource at cluster level, run following command on any one of the nodes (just in one server and run just one time).

sudo pcs resource create ag_cluster1 ocf:mssql:ag ag_name=ag1 meta failure-timeout=60s master notify=true

##check the status

13.Run following bash command in primary replica red1 to create one virtual IP resources. The resource name is 'vip1', and IP address is 192.168.2.111

              sudo pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=192.168.2.111

##check the status

 

14. Create Availability group listener for Availability group ag1. Run following T-SQL query in primary replica (red1 in this case).

ALTER AVAILABILITY GROUP [ag1]

ADD LISTENER 'aglistener' (WITH IP

(

('192.168.2.111','255.255.255.0'),

('192.168.4.111','255.255.255.0'),

('192.168.5.111','255.255.255.0')

),PORT = 1433);

 

15. Run following bash commands to create constraints:

sudo pcs constraint colocation add vip1 ag_cluster1-master INFINITY with-rsc-role=Master

sudo pcs constraint order promote ag_cluster1-master then start vip1


 

16.Run following bash command to export the CIB.(you can run the command in any node)

sudo pcs cluster cib <filename>

       

17.You will find following similar entries

<primitive class="ocf" id="vip1" provider="heartbeat" type="IPaddr2">

        <instance_attributes id="vip1-instance_attributes">

          <nvpair id="vip1-instance_attributes-ip" name="ip" value="192.168.2.111"/>

        </instance_attributes>

        <operations>

          <op id="vip1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>

          <op id="vip1-start-interval-0s" interval="0s" name="start" timeout="20s"/>

          <op id="vip1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>

        </operations>

      </primitive>

 

18.Here is the modified version

<primitive class="ocf" id="vip1" provider="heartbeat" type="IPaddr2">

        <instance_attributes id="vip1-instance_attributes">

          <rule id="Subnet1-IP" score="INFINITY" boolean-op="or">

            <expression id="Subnet1-Node1" attribute="#uname" operation="eq" value="red1"/>

            <expression id="Subnet1-Node2" attribute="#uname" operation="eq" value="red2"/>

          </rule>

          <nvpair id="vip1-instance_attributes-ip" name="ip" value="192.168.2.111"/>

        </instance_attributes>

        <instance_attributes id="vip1-instance_attributes2">

          <rule id="Subnet2-IP" score="INFINITY">

            <expression id="Subnet2-Node1" attribute="#uname" operation="eq" value="red3"/>

          </rule>

          <nvpair id="vip1-instance_attributes-ip2" name="ip" value="192.168.4.111"/>

        </instance_attributes>

        <instance_attributes id="vip1-instance_attributes3">

          <rule id="Subnet3-IP" score="INFINITY">

            <expression id="Subnet3-Node1" attribute="#uname" operation="eq" value="red4"/>

          </rule>

          <nvpair id="vip1-instance_attributes-ip3" name="ip" value="192.168.5.111"/>

        </instance_attributes>

        <operations>

          <op id="vip1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>

          <op id="vip1-start-interval-0s" interval="0s" name="start" timeout="20s"/>

          <op id="vip1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>

        </operations>

      </primitive>

 

19. Run following command to import the modified CIB and reconfigure Pacemaker.

sudo pcs cluster cib-push <filename>


Here are the takeaway points:

1).All nodes in same subnet should be in the same <Instance_attributes>


2).If there are more than one servers in the subnet, the keyword ‘boolean-op="or"’ is a must


3).The IP address of Alwayson Listener is addressed in <nvpair> .


4).The value of id property does not matter, you can specify any value as long as the value is  unique


Optional, you can create three entries for the three IP addresses in the DNS server.


 

Failover

===

pcs resource move resourceName nodeName


Here is a screenshot of failover test

Here is an screenshot of using SQLCMD to connect the AGListener that failover from node1 to node4



Comments

popular posts

SQL Server Polybase in Failover cluster instance

Tutorial: Create SQL Cluster(FCI) on RHEL

Realcase: Failed to upgrade SQL Server 2016 SP2 CU11. (Installation success or error status: 1648)

How to find SQL Server Replication related jobs and T-SQL statements

SQL Server Extents, PFS, GAM, SGAM and IAM and related corruptions

Tutorial: Add a node to SQL cluster on RHEL

SQL Server GAM corruption samples

You may fail to backup log or restore log after TDE certification/key rotation

Password is required when adding a database to AG group if the database has a master key