HA Troubleshooting

If you encounter problems with an HA group, refer to this section.

Administration Tasks on HA Groups

Do not attempt to run administrative tasks on an HA group virtual slot (such as changing the CO password or altering partition policies). These virtual slots are intended for cryptographic operations only. It is not possible to use an HA group to make administrative changes to all partitions in the group simultaneously.

Unique Object IDs (OUID)

If two applications using the same HA group modify the same object using different members, the object fingerprint may conflict.

Client-Side Limitations

New features, new cryptographic mechanisms added by a firmware update, or previously usable mechanisms that become restricted for security reasons can affect the operation of an HA group when the Client version is older than the firmware. Luna Clients are "universal" in the sense that they work fully with current and earlier Luna HSMs/partitions, as well as with cloud crypto solutions (DPoD), but a Client version cannot be aware of HSM versions that had not yet been developed when that Client was released.

Client-Side Failures

Any client failure (such as an operating system problem) that does not involve corruption or removal of files should resolve itself when the client is rebooted.

If the client workstation seems to be working fine otherwise, but you have lost visibility of the HSMs in LunaCM or your client, try the following remedies:

>verify that the Thales drivers are running, and retry

>reboot the client workstation

>restore your client configuration from backup

>re-install Luna HSM Client and re-configure the HA group
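The remedies above can be read as an escalation ladder: try the least invasive step, re-check visibility, and move on only if the HSMs are still missing. The sketch below illustrates that pattern, assuming the remedies are tried in the order listed. It is not part of Luna HSM Client; `restore_visibility`, `hsms_visible`, and the remedy callables are hypothetical placeholders for whatever checks and actions you perform by hand.

```python
def restore_visibility(hsms_visible, remedies):
    """Apply remedies in order, stopping once the HSMs are visible again.

    hsms_visible: hypothetical callable returning True when the HSMs
                  can be seen from the client.
    remedies:     list of (name, action) pairs, least invasive first.
    Returns the names of the remedies that were actually applied.
    """
    applied = []
    for name, action in remedies:
        if hsms_visible():
            break  # problem solved, no need to escalate further
        action()
        applied.append(name)
    return applied
```

The returned list records how far up the ladder you had to go, which is useful detail to include in a support request.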

Failures Between the HSM Appliance and Client

The only failure likely to occur between a Luna Network HSM (or multiple HSMs) and a client computer coordinating an HA group is a network failure. In that case, the salient factor is whether the failure occurred near the client or near one (or more) of the Luna Network HSM appliances.

If the failure occurs near the client, and you have not set up port bonding on the client, the client loses sight of all HA group members and the application fails. The application resumes according to its own timeouts and error-handling capabilities, and HA resumes automatically if the members reappear within the recovery window that you set.

If the failure occurs near a Luna Network HSM member of the HA group, then that member disappears from the group until the network failure is cleared, but the client can still see other members, and normal failover occurs.
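The two cases above can be told apart by how many HA members the client can still reach. The following is a minimal sketch of that reasoning only, not Luna client code; the function name `classify_network_failure` and the member labels are hypothetical.

```python
def classify_network_failure(configured, reachable):
    """Classify an outage by which HA members the client can still reach.

    configured: labels of all members in the HA group (hypothetical names).
    reachable:  labels of the members that currently respond.
    """
    configured = set(configured)
    reachable = set(reachable) & configured
    if reachable == configured:
        return "healthy"
    if not reachable:
        # No member is visible: consistent with a failure near the client.
        return "all members lost"
    # Some members are still visible: the failure is near the missing ones,
    # and the remaining members carry the load (normal failover).
    missing = sorted(configured - reachable)
    return "failover without " + ", ".join(missing)
```

For example, a group of `hsm1` and `hsm2` where only `hsm1` responds is classified as a failover case, while an empty `reachable` set points to the client side of the network.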

Effect of PED Operations

PED operations can block some cryptographic operations, so a member of an HA group that is performing a PED operation could appear to the HA group as a failed member. When the PED operation is complete, the HA failover and recovery logic is invoked to return the member to normal operation.
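The behavior described above can be pictured as a simple state model: a member that is busy with a PED operation is treated as failed, and is returned to the group once the operation ends. This is a simplified illustration under that assumption, not the actual HA implementation; the `Member` class and `poll` function are hypothetical.

```python
class Member:
    """Hypothetical HA group member used only for this illustration."""

    def __init__(self, label):
        self.label = label
        self.ped_busy = False   # True while a PED operation is in progress
        self.in_group = True    # True while the HA group treats it as active


def poll(member):
    """One health check as seen by the HA group.

    A member blocked by a PED operation looks failed and is dropped;
    once the operation is complete, recovery returns it to the group.
    """
    if member.ped_busy:
        member.in_group = False   # failover: member appears failed
    elif not member.in_group:
        member.in_group = True    # recovery: member returns to normal operation
    return member.in_group
```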