
Administration & Maintenance - HA & Load Balancing

 

HA Recovery

HA recovery is either the hands-off resumption of service by failed HA group members or, if "autorecovery" has not been switched on, the manual reintroduction of a failed member. Some reasons for a member to drop out of the group are:

- the appliance loses power (but regains power in less than the 2 hours that the HSM preserves its activation state)

- the network link from the unit is lost and then regained.

HA recovery takes place if:

If all HA nodes fail (no links from the client), no recovery is possible.

The HA recovery logic in the library makes its first attempt at recovering a failed member when your application makes a call to its HSM (the group). That is, an idle client does not start the recovery-attempt process.

On the other hand, a busy client would notice a slight pause every minute as the library attempts to recover a dropped HA group member (or members). The attempts continue until the member has been reinstated or until the retry limit (timeout) has been reached, at which point the library stops trying. Therefore, set the number of retries to suit your normal conditions (for example, the kinds and durations of network interruptions you experience).
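The behavior described above can be illustrated with a small simulation. This is only a sketch of the logic as this page describes it, not the client library's actual code: the names HAGroupModel, Member, FakeClock, probe, and max_retries are hypothetical, and the 60-second interval is taken from the "every minute" statement above.

```python
from dataclasses import dataclass
from typing import Callable

RETRY_INTERVAL_S = 60  # "every minute", per the description above


class FakeClock:
    """Hypothetical controllable clock so the model can be exercised without waiting."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


@dataclass
class Member:
    name: str
    probe: Callable[[], bool]  # hypothetical: returns True once the member is reachable
    failed: bool = False
    attempts: int = 0
    next_attempt_at: float = 0.0


class HAGroupModel:
    """Toy model of the recovery behavior described in the text (an assumption)."""

    def __init__(self, members, max_retries: int, clock: Callable[[], float]) -> None:
        self.members = list(members)
        self.max_retries = max_retries
        self.clock = clock

    def mark_failed(self, name: str) -> None:
        for m in self.members:
            if m.name == name:
                m.failed = True
                m.attempts = 0
                # The first recovery attempt is due on the next application call.
                m.next_attempt_at = self.clock()

    def call(self):
        """Stands in for any application call to the HSM (the group)."""
        if all(m.failed for m in self.members):
            raise RuntimeError("all HA members have failed; no recovery is possible")
        now = self.clock()
        for m in self.members:
            due = now >= m.next_attempt_at
            if m.failed and due and m.attempts < self.max_retries:
                # Recovery is only ever attempted from inside an application call,
                # so an idle client never reaches this point.
                m.attempts += 1
                m.next_attempt_at = now + RETRY_INTERVAL_S
                if m.probe():
                    m.failed = False
        return [m.name for m in self.members if not m.failed]


if __name__ == "__main__":
    clock = FakeClock()
    link = {"up": False}
    group = HAGroupModel(
        [Member("hsm1", lambda: True), Member("hsm2", lambda: link["up"])],
        max_retries=3,
        clock=clock,
    )
    group.mark_failed("hsm2")
    clock.advance(600)   # idle client: nothing is attempted during this time
    print(group.call())  # first attempt happens here; the link is still down
    link["up"] = True
    clock.advance(60)
    print(group.call())  # second attempt, one minute later, reinstates hsm2
```

In this model, a member that is still unreachable after max_retries attempts stays out of the group until it is reintroduced manually, which is why the retry count should reflect the length of the interruptions you normally see.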

HA Autorecovery vs Manual Recovery

Autorecovery is not on by default. It must be explicitly enabled with the vtl haAdmin -autorecovery command.

Use manual recovery whenever you have multiple processes or clients sharing a partition. Using automatic recovery with multiple processes sharing a partition could lead to a collision.

 

For practical steps to replace a failed HA group member, see "HA Replacing a Failed Luna SA".
