Manually Recovering a Failed HA Group Member
Thales recommends using auto-recovery for all HA group configurations (see Configuring HA Auto-Recovery). If you do not enable auto-recovery and a member partition fails, or if the recovery retry count expires before the partition comes back online, you must recover the partition manually using LunaCM. You do not need to pause your application(s) to perform a manual recovery; the HA group handles load-balancing and automatically replicates any new or changed keys to the recovered member.
To perform a manual recovery of a failed HA group member
1. [Optional] Ensure that the failed member is available and visible in LunaCM by addressing the problem that caused the failure. Display the HA group to see the failed members (hagroup listgroups). You are prompted for the Crypto Officer password/challenge secret.
lunacm:>hagroup listgroups
lunacm:> hagroup listgroups
If you would like to see synchronization data for group myHAgroup,
please enter the password for the group members. Sync info
not available in HA Only mode.
Enter the password: ********
HA auto recovery: disabled
HA recovery mode: activeBasic
Maximum auto recovery retry: 0
Auto recovery poll interval: 60 seconds
HA logging: disabled
Only Show HA Slots: yes
HA Group Label: myHAgroup
HA Group Number: 1154438865287
HA Group Slot ID: 5
Synchronization: enabled
Group Members: 154438865287, 1238700701509
Needs sync: no
Standby Members: <none>
Slot #    Member S/N       Member Label    Status
======    ==========       ============    ======
------    154438865287     par0            alive
------    1238700701509    ------------    down
2. If you are using a PED-authenticated partition with auto-activation disabled, or if the partition was down for longer than two hours, log in to the partition as Crypto Officer and present the black CO PED key.
lunacm:>slot set -slot <slotnum>
lunacm:>role login -name co
3. Execute the manual recovery command, specifying the HA group label (hagroup recover).
lunacm:>hagroup recover -group <label>
lunacm:> ha recover -g myHAgroup
Signal sent to HA Group "myHAgroup" to recover.
Command Result : No Error
If you have an application running on the HA group, the failed members are recovered the next time an operation is scheduled. Load-balancing and key replication are automatic.
4. If you do not currently have an application running, you can manually synchronize the contents of the HA group (hagroup synchronize).
CAUTION! Never use manual synchronization if you have an application running. The HA group performs this automatically. Using this command on an HA group that is running an application could create conflicting key versions.
lunacm:>hagroup synchronize -group <label>
lunacm:> hagroup synchronize -group myHAgroup
Enter the password: ********
Synchronization completed.
Command Result : No Error
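The procedure above can be summarized as a small planning sketch. The Python below is illustrative only and is not part of LunaCM or any Thales tooling: it parses a member table like the one shown in step 1 to find members reported as down, then builds the ordered list of LunaCM commands from steps 2 through 4. The function names (parse_members, down_members, recovery_commands), the assumption that the Status column contains only "alive" or "down", the assumption that member labels contain no spaces, and the use of the -group option with hagroup recover (by analogy with hagroup synchronize) are all assumptions made for this example. The sketch only produces command strings; it does not run LunaCM or handle passwords or PED keys.

```python
import re

# Member table mirroring the sample "hagroup listgroups" output in step 1.
SAMPLE = """\
Slot # Member S/N Member Label Status
====== ========== ============ ======
------ 154438865287 par0 alive
------ 1238700701509 ------------ down
"""

# Assumed row layout: slot, numeric serial number, label, status.
# Header and separator rows do not match because their second field is not numeric.
ROW = re.compile(r"^\s*(\S+)\s+(\d+)\s+(\S+)\s+(alive|down)\s*$")


def parse_members(text):
    """Return (serial, label, status) tuples for each member row found."""
    members = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            _slot, serial, label, status = match.groups()
            members.append((serial, label, status))
    return members


def down_members(text):
    """Return the serial numbers of members whose status is 'down'."""
    return [serial for serial, _label, status in parse_members(text)
            if status == "down"]


def recovery_commands(group_label, application_running, login_slot=None):
    """Build the LunaCM commands for steps 2 to 4, in order.

    login_slot is a hypothetical parameter: pass the slot number only when a
    Crypto Officer login is required (step 2); otherwise leave it as None.
    """
    commands = []
    if login_slot is not None:
        # Step 2: select the partition slot and log in as Crypto Officer.
        commands.append(f"slot set -slot {login_slot}")
        commands.append("role login -name co")
    # Step 3: signal the HA group to recover its failed members.
    commands.append(f"hagroup recover -group {group_label}")
    if not application_running:
        # Step 4: manual synchronization, only when no application is running.
        commands.append(f"hagroup synchronize -group {group_label}")
    return commands


if __name__ == "__main__":
    print(down_members(SAMPLE))  # ['1238700701509']
    for command in recovery_commands("myHAgroup", application_running=False):
        print("lunacm:> " + command)
```

Note that the synchronize command is omitted whenever application_running is true, which reflects the caution in step 4: with an application running, the HA group replicates keys on its own.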