High Availability (HA) Overview

You can use the SafeNet Luna HSM client to combine multiple devices, or partitions, into a single logical group known as a high availability (HA) group. When you create an HA group, it appears in the client as a virtual HA slot. Any application that uses the virtual HA slot can access cryptographic services as long as at least one member of the HA group remains functional and connected to the application server. In addition, the client load-balances among the HA group members: many cryptographic commands are distributed automatically across the group, which enables linear performance gains for many applications.

How HA is Implemented

The HA and load-balancing functionality is implemented in the SafeNet Luna HSM client, which uses the cloning function to replicate and synchronize content across HA group members. There is no direct connection between the members of an HA group; all communication between members is managed by the client. The HSMs and appliances play no part in managing the group and, apart from being instructed to clone objects to certain HSMs during a synchronization operation, are unaware that they might be configured in an HA group. The advantage of this approach is that it allows you to configure HA groups on a per-application (or per-slot) basis.

To create an HA group, you must first register your client with each HSM you want to include in the HA group. You then use the client-side administration commands to define the HA group and set any desired configuration options. Available options and operations include:

>Setting automatic or manual recovery mode

>Setting some HSMs as standby members

>Performing various manual synchronization and recovery operations

Once the HA group is defined, the SafeNet Luna HSM client presents it as a virtual slot, which is a consolidation of all the physical HSMs in the HA group. Any operations that access the slot are automatically distributed among the group members to provide load balancing, and all key material is automatically replicated and synchronized across all members of the HA group.
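The failover and load-distribution behavior described above can be pictured with a small simulation. The following is a toy model in Python, not the SafeNet Luna HSM client or its API: the Member and VirtualSlot classes are hypothetical, and the round-robin scheduling is an assumption made for illustration, since this overview does not specify how the client selects a member.

```python
class MemberUnavailable(Exception):
    """Raised by a simulated member that is offline."""


class Member:
    """Hypothetical stand-in for one physical HSM in the HA group."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.handled = 0  # number of operations this member served

    def perform(self, operation):
        if not self.online:
            raise MemberUnavailable(self.name)
        self.handled += 1
        return f"{operation} handled by {self.name}"


class VirtualSlot:
    """Toy client-side virtual slot: round-robin with failover (assumed policy)."""

    def __init__(self, members):
        self.members = list(members)
        self._next = 0  # round-robin cursor

    def perform(self, operation):
        # Try each member at most once, starting at the cursor.
        for offset in range(len(self.members)):
            index = (self._next + offset) % len(self.members)
            try:
                result = self.members[index].perform(operation)
            except MemberUnavailable:
                continue  # skip the failed member and try the next one
            self._next = (index + 1) % len(self.members)
            return result
        raise RuntimeError("no HA group member is available")


slot = VirtualSlot([Member("hsm-a"), Member("hsm-b"), Member("hsm-c")])
slot.members[1].online = False  # simulate one member failing
results = [slot.perform("decrypt") for _ in range(4)]
served_by = [r.split()[-1] for r in results]
print(served_by)
```

In this model the application only ever calls the virtual slot. The failed member is skipped and the remaining members share the work, which mirrors the statement that service continues as long as at least one member is reachable.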

Example: Database Encryption

This section walks through a sample use case that illustrates how some of the HA logic applies to a specific application: transparent database encryption.

Typical Database Encryption Key Architecture

Database engines typically use a two-layered key architecture. At the top layer is a master encryption key (MEK) that is the root of data protection. Losing this key is equivalent to losing the database, so the key must be highly durable. At the second layer are table keys used to protect table-spaces and/or columns. These table keys are stored with the database as blobs encrypted by the MEK. This architecture maps to the following operations on the HSM:

1. Initial generation of a master key for each database.

2. Generation and encryption of table keys with the master key.

3. Decryption of table keys when the database needs to access encrypted elements.

4. Generation of a new master key during a re-key, followed by re-encryption of all table keys with the new master key.

5. Generation and encryption of new table keys for storage in the database (often done in a software module).

The HSM is not involved in the use of table keys. Instead, it provides strong protection for the MEK, which in turn protects the table keys. Users must follow backup procedures to ensure that their MEK is as durable as the database itself. Refer to the backup section of this manual for proper backup procedures.
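The two-layer hierarchy and the re-key operation listed above can be sketched in a few lines of Python. This is an illustration only, under loud assumptions: the wrap and unwrap functions are hypothetical toy helpers built from the standard library so that the example runs anywhere, they are not suitable for protecting real data, and they are not the HSM's wrapping mechanism. In a real deployment the master key never leaves the HSM and these operations are performed inside it.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    # Toy keystream from SHA-256 in counter mode (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(mek, table_key):
    """Toy stand-in for 'encrypt a table key with the master key'."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(mek, nonce, len(table_key))
    body = bytes(a ^ b for a, b in zip(table_key, stream))
    tag = hmac.new(mek, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap(mek, blob):
    """Toy stand-in for 'decrypt a table key with the master key'."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mek, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was not wrapped by this master key")
    stream = _keystream(mek, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


# Step 1: generate the master key (inside the HSM in a real deployment).
mek_v1 = secrets.token_bytes(32)

# Step 2: generate table keys and store them as blobs encrypted by the MEK.
table_keys = {name: secrets.token_bytes(32) for name in ("customers", "orders")}
stored_blobs = {name: wrap(mek_v1, key) for name, key in table_keys.items()}

# Step 3: decrypt a table key when the database needs it.
orders_key = unwrap(mek_v1, stored_blobs["orders"])

# Step 4: re-key by generating a new MEK and re-encrypting every table key.
mek_v2 = secrets.token_bytes(32)
stored_blobs = {
    name: wrap(mek_v2, unwrap(mek_v1, blob)) for name, blob in stored_blobs.items()
}
```

After the re-key, the stored blobs can be opened only with the new master key, which is why the durability of each new MEK matters as much as that of the database.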

HSM High Availability with Database Encryption

When the HSMs are configured as an HA group, the database’s master key is automatically and transparently replicated to all the members when the key is created or re-keyed. If an HA group member is offline or fails during the replication, it does not immediately receive a copy of the key. Instead, the HA group proceeds after replicating to all of the active members. Once the member rejoins the group, the HSM client automatically replicates the new master key to the recovered member.
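The replication and recovery sequence described here can be modeled with a short simulation. The sketch below is a toy model in Python, not the client's implementation: the Member and HAGroup classes, the key labels, and the choice of the first active member as the primary are all assumptions made for illustration.

```python
class Member:
    """Hypothetical stand-in for one HSM in the HA group."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.keys = {}  # label -> placeholder for key material


class HAGroup:
    """Toy model of client-managed replication across group members."""

    def __init__(self, members):
        self.members = list(members)

    def active(self):
        return [m for m in self.members if m.online]

    def create_key(self, label, material):
        # Create the key on the first active member (assumed primary),
        # then copy it to every other member that is currently active.
        active = self.active()
        if not active:
            raise RuntimeError("no active HA group member")
        primary = active[0]
        primary.keys[label] = material
        for member in active[1:]:
            member.keys[label] = primary.keys[label]
        return [m.name for m in active]

    def recover(self, member):
        # When a member rejoins, copy over any keys it missed while offline.
        member.online = True
        for other in self.active():
            if other is member:
                continue
            for label, material in other.keys.items():
                member.keys.setdefault(label, material)


a, b, c = Member("hsm-a"), Member("hsm-b"), Member("hsm-c")
group = HAGroup([a, b, c])

group.create_key("mek-v1", "material-1")              # reaches all three members
c.online = False                                       # hsm-c goes offline
replicated_to = group.create_key("mek-v2", "material-2")  # re-key proceeds without it
missed_rekey = "mek-v2" not in c.keys                  # hsm-c has no copy yet
group.recover(c)                                       # hsm-c rejoins and is synchronized
print(replicated_to, missed_rekey, sorted(c.keys))
```

The window between the re-key and the recovery is the period in which the new master key exists on fewer members than usual.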

With this in mind, before every re-key event the user should ensure that the HA group has sufficient redundancy. A re-key will succeed as long as at least one HA group member is available, but proceeding with too few HSMs results in an availability risk. For example, proceeding with only one HSM puts the new master key at risk, because it exists on only a single HSM. Even with sufficient redundancy, SafeNet recommends maintaining an offline backup of a database’s master key.
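A pre-flight check of this kind could be scripted before a re-key is started. The following Python sketch assumes a hypothetical helper, check_rekey_redundancy, that is given the online status of each member by whatever monitoring is in place. The default threshold of two active members is an assumption for illustration, not a SafeNet recommendation.

```python
def check_rekey_redundancy(member_status, minimum_active=2):
    """Return (ok, active_names) for a planned re-key.

    member_status maps a member name to True (online) or False (offline).
    minimum_active is an assumed policy threshold chosen by the operator.
    """
    active = sorted(name for name, online in member_status.items() if online)
    if not active:
        # With no member available, the re-key cannot succeed at all.
        raise RuntimeError("no HA group member is available")
    return len(active) >= minimum_active, active


ok, active = check_rekey_redundancy(
    {"hsm-a": True, "hsm-b": False, "hsm-c": True}
)
risky_ok, risky_active = check_rekey_redundancy({"hsm-a": True, "hsm-b": False})
print(ok, active, risky_ok, risky_active)
```

In the second call the re-key would still succeed, but the check reports that the new master key would exist on a single HSM only.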

HSM Load Balancing with Database Encryption

While a database is up and running, the master key exists on all members in the HA group. As such, requests to encrypt or decrypt table keys are distributed across the entire group. The load-balancing feature can therefore deliver improved performance and scalability when the database requires a large number of accesses to the table keys. That said, most deployments will not need much load balancing, because the typical database deployment has only a small number of table keys.

When table keys are re-keyed, new keys are generated in the HSM and encrypted for storage in the database. Within an HA group, these keys are generated on the primary HSM and then, even though they exist on the HSM for only a moment, they are replicated to the entire HA group as part of the availability logic. These events are infrequent enough that this extra replication has minimal impact.

Conclusion

The SafeNet high availability and load-balancing features provide a set of tools to scale applications and manage the availability of cryptographic services without compromising the integrity of cryptographic keys. A broad range of deployment options is supported, allowing solution architects to achieve the required availability in a manner that optimizes cost and performance without compromising the assurance of the solution.