Managing HA Groups

If you set up your HA groups as recommended, using auto-recovery, they require very little direct maintenance. You can perform the following tasks without pausing your applications:

>Adding/Removing an HA Group Member

>Manually Recovering a Failed HA Group Member -- If you declined to use auto-recovery, you must manually recover group members whenever they fail

>Replacing an HA Group Member -- If an HSM fails permanently, or is re-initialized, the member partition cannot be recovered

>Deleting an HA Group

Adding/Removing an HA Group Member

You can add a new member to an HA group at any time using LunaCM, even if your application is running. Cryptographic objects are replicated to the new partition, and operations are scheduled to it according to the load-balancing algorithm (see Load Balancing).

Likewise, you can remove a member at any time, and currently-scheduled operations will fail over to the rest of the group members (see Failover).

NOTE   If you remove the partition that was used to create the group, the HA group serial number changes to reflect this. This is to prevent another HA group from being assigned the same serial number as the original. If your application queries the HA group serial number, it must redirect operations to the new serial.
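One way an application can tolerate this serial number change is to locate the HA virtual slot by its label rather than by a stored serial number. The following is a minimal sketch, not a Luna API: SlotInfo, resolve_ha_slot, and all values are hypothetical, and it assumes the HA group label stays the same when the serial number changes. In a real client the slot list would come from your PKCS#11 library.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SlotInfo:
    """Hypothetical snapshot of one slot as seen by the client."""
    slot_id: int
    label: str
    serial: str


def resolve_ha_slot(slots: Iterable[SlotInfo], group_label: str) -> Optional[SlotInfo]:
    """Return the slot whose label matches the HA group label, or None."""
    for slot in slots:
        if slot.label == group_label:
            return slot
    return None


# Hypothetical values: the group serial differs before and after the
# original member is removed, but the lookup by label still succeeds.
before = [SlotInfo(5, "myHAgroup", "11000000001")]
after = [SlotInfo(5, "myHAgroup", "11000000002")]

old = resolve_ha_slot(before, "myHAgroup")
new = resolve_ha_slot(after, "myHAgroup")
serial_changed = old.serial != new.serial
```

Resolving the slot on each connection, instead of caching the serial number, means the application picks up the new serial without a configuration change.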

Prerequisites

The new member partition must:

>be assigned to the client and visible in LunaCM

>be initialized with the same domain string/red domain iKey as the other partitions in the group

>have the Crypto Officer role initialized with the same credentials as the other partitions in the group

>be activated (multifactor quorum-authenticated)

NOTE   V1 partitions: If you add an application partition with an existing SMK to an HA group, the primary member's SMK overwrites the existing SMK of the joining partition.

If a partition's SMK has ever been used to encrypt important SKS objects, save a backup of the SMK before adding that partition to any HA group.

To add an HA group member

1.Open LunaCM on the client workstation and ensure that the new partition is visible.

2.Add the new partition to the HA group by specifying either the slot or the serial number. You are prompted for the Crypto Officer password/challenge secret.

lunacm:> hagroup addmember -group <label> {-slot <slotnum> | -serial <serialnum>}

To remove an HA group member

1.Remove the partition from the group by specifying either the slot or the serial number.

lunacm:> hagroup removemember -group <label> {-slot <slotnum> | -serial <serialnum>}

LunaCM restarts.

NOTE   If you remove the partition that was used to create the group, the HA group serial number changes to reflect this. This is to prevent another HA group from being assigned the same serial number as the original. If your application queries the HA group serial number, it must redirect operations to the new serial.

2.[Optional] Check that the partition was removed from the group.

lunacm:> hagroup listgroups
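The add and remove commands above take the same arguments: the group label plus exactly one of the slot number or the serial number. If you generate these commands from a script, a small helper can enforce that rule before anything is typed at the lunacm:> prompt. This is an illustrative sketch only; hagroup_member_command and the example values are hypothetical, and the helper only builds the command text shown in this section.

```python
from typing import Optional


def hagroup_member_command(action: str, group: str,
                           slot: Optional[int] = None,
                           serial: Optional[str] = None) -> str:
    """Build the text of an 'hagroup addmember' or 'hagroup removemember' command."""
    if action not in ("addmember", "removemember"):
        raise ValueError("action must be 'addmember' or 'removemember'")
    # The syntax is {-slot <slotnum> | -serial <serialnum>}: exactly one of the two.
    if (slot is None) == (serial is None):
        raise ValueError("specify exactly one of slot or serial")
    target = f"-slot {slot}" if slot is not None else f"-serial {serial}"
    return f"hagroup {action} -group {group} {target}"


# Hypothetical group label, slot number, and serial number.
add_cmd = hagroup_member_command("addmember", "myHAgroup", slot=1)
remove_cmd = hagroup_member_command("removemember", "myHAgroup", serial="1234567890")
```

Here add_cmd is "hagroup addmember -group myHAgroup -slot 1" and remove_cmd is "hagroup removemember -group myHAgroup -serial 1234567890".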

Manually Recovering a Failed HA Group Member

Thales recommends using auto-recovery for all HA group configurations (see Configuring HA Auto-Recovery). If you do not enable auto-recovery and a member partition fails, or if the recovery retry count expires before the partition comes back online, you must recover the partition manually using LunaCM. You do not need to pause your application(s) to perform a manual recovery; the HA group handles load-balancing and automatically replicates any new or changed keys to the recovered member.

To perform a manual recovery of a failed HA group member

1.[Optional] Ensure that the failed member is available and visible in LunaCM by addressing the problem that caused the failure. Display the HA group to see the failed members. You are prompted for the Crypto Officer password/challenge secret.

lunacm:> hagroup listgroups

2.If you are using a multifactor quorum-authenticated partition, log in to the partition as Crypto Officer and present the black CO iKey.

lunacm:> slot set -slot <slotnum>

lunacm:> role login -name co

3.Execute the manual recovery command, specifying the HA group label.

lunacm:> hagroup recover -group <label>

If you have an application running on the HA group, the failed members will be recovered the next time an operation is scheduled. Load-balancing and key replication are automatic.

4.If you do not currently have an application running, you can manually synchronize the contents of the HA group.

CAUTION!   Never use manual synchronization if you have an application running. The HA group performs this automatically. Using this command on an HA group that is running an application could create conflicting key versions.

lunacm:> hagroup synchronize -group <label>
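If your recovery procedure is scripted, the caution above can be encoded as a guard so that the synchronize command is never produced while an application is using the group. The sketch below is hypothetical: manual_sync_command is not part of LunaCM, and how you determine application_running is left to your own environment.

```python
def manual_sync_command(group: str, application_running: bool) -> str:
    """Return the 'hagroup synchronize' command text, or refuse if an application is running."""
    if application_running:
        # The HA group synchronizes automatically while an application runs;
        # a manual synchronization could create conflicting key versions.
        raise RuntimeError("manual synchronization refused: an application is running")
    return f"hagroup synchronize -group {group}"


# Hypothetical group label; no application is running in this example.
sync_cmd = manual_sync_command("myHAgroup", application_running=False)
```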

Replacing an HA Group Member

Sometimes an HSM failure is permanent (from the perspective of the HA group). For example, if the HSM is re-initialized, the member partition is erased and must be recreated. In this case, you can recreate a partition on the same HSM or another HSM, and deploy the new member to the group. You do not need to pause your application to replace an HA group member.

Prerequisites

The Crypto Officer must complete this procedure, but any new member partition must first be created by the HSM SO, and initialized by the Partition SO. All the prerequisites listed in Configuring a High-Availability Group must be met.

NOTE   V1 partitions: If you add an application partition with an existing SMK to an HA group, the primary member's SMK overwrites the existing SMK of the joining partition.

If a partition's SMK has ever been used to encrypt important SKS objects, save a backup of the SMK before adding that partition to any HA group.

To replace an HA group member

1.[Optional] Display the HA group to see the failed member. You are prompted for the Crypto Officer password/challenge secret.

lunacm:> hagroup listgroups

2.Prepare the new HA group member, whether that means creating a new partition on the original HSM or configuring a new Luna USB HSM 7, and assign the new partition to the HA client. Ensure that the new member partition and the HSM on which it resides meet the prerequisites outlined in Configuring a High-Availability Group, and that the partition is visible in LunaCM.

3.Add the new partition to the HA group by specifying either the slot or the serial number. You are prompted for the Crypto Officer password/challenge secret.

lunacm:> hagroup addmember -group <label> {-slot <slotnum> | -serial <serialnum>}

The new partition is now an active member of the HA group. If you have an application currently running, cryptographic objects are automatically replicated to the new member and it is assigned operations according to the load-balancing algorithm.

4.Remove the old partition from the group by specifying the serial number.

lunacm:> hagroup removemember -group <label> -serial <serialnum>

LunaCM restarts.

5.[Optional] If you do not currently have an application running, you can manually synchronize the contents of the HA group.

CAUTION!   Never use manual synchronization if you have an application running. The HA group performs this automatically. Using this command on an HA group that is running an application could create conflicting key versions.

lunacm:> hagroup synchronize -group <label>

6.[Optional] If you intend to have the new partition serve as a standby member, see Setting an HA Group Member to Standby.
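The order of the replacement steps matters: the new member is added before the old one is removed, and manual synchronization is included only when no application is running. As a rough sketch of that ordering, the hypothetical helper below returns the LunaCM command text for each step. The function name and example values are assumptions, and the helper does not prepare the new partition (step 2) or handle the LunaCM restart that follows the removal.

```python
from typing import List


def replacement_plan(group: str, new_serial: str, old_serial: str,
                     application_running: bool) -> List[str]:
    """List the LunaCM commands for replacing an HA group member, in order."""
    plan = [
        "hagroup listgroups",                                         # see the failed member
        f"hagroup addmember -group {group} -serial {new_serial}",     # add the replacement first
        f"hagroup removemember -group {group} -serial {old_serial}",  # then remove the old member
    ]
    if not application_running:
        # Manual synchronization is only safe when no application is running.
        plan.append(f"hagroup synchronize -group {group}")
    return plan


# Hypothetical group label and serial numbers.
plan_idle = replacement_plan("myHAgroup", "2222222222", "1111111111", application_running=False)
plan_live = replacement_plan("myHAgroup", "2222222222", "1111111111", application_running=True)
```

With an application running, plan_live stops after the removemember command, because the HA group replicates objects to the new member automatically.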

Deleting an HA Group

Use LunaCM to delete an HA group from your configuration.

NOTE   This procedure only removes the HA group virtual slot; the member partitions and all their contents remain intact. Only the HSM SO can delete individual partitions.

To delete an HA group

1.Stop any applications currently using the HA group.

2.Delete the group by specifying its label (see hagroup listgroups).

lunacm:> hagroup deletegroup -label <label>