Performance

For repetitive operations, like a high volume of signings using the same key, an HA group can expand SafeNet Luna Network HSM performance in linear fashion as HA group members are added. HA groups of 32 members have undergone long-term, full-throttle testing, with excellent results.

Keep in mind that adding more SafeNet Luna Network HSM appliances to an HA group does not guarantee unlimited performance improvement. For best overall performance, each HA group member should be driven near its individual performance "sweet spot", around 30 simultaneous threads per HSM. If you assemble an HA group that is considerably larger than your server(s) can drive, you might not achieve full performance from all members.

The best approach is an HA group sized to match the capability of the application servers that will drive it and the expected load, with an additional unit to provide capacity for bursts of traffic and for redundancy.
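As a rough illustration of the sizing guidance above, the following Python sketch estimates a group size from the number of simultaneous threads your application servers can sustain. The function name and its inputs are hypothetical, and the only figures taken from this section are the approximate 30-threads-per-HSM sweet spot and the single additional unit; treat the result as a starting point for testing, not a guarantee.

```python
import math

# From the text: each HSM performs best at around 30 simultaneous threads.
THREADS_PER_HSM = 30


def suggested_group_size(driver_threads: int, spare_units: int = 1) -> int:
    """Estimate an HA group size (hypothetical helper, rule of thumb only).

    driver_threads: total simultaneous threads the application servers
                    can sustain against the HA group (assumed input).
    spare_units:    extra members for traffic bursts and redundancy.
    """
    if driver_threads < 1:
        raise ValueError("driver_threads must be at least 1")
    # Members needed so that no member is driven past its sweet spot.
    members_for_load = math.ceil(driver_threads / THREADS_PER_HSM)
    return members_for_load + spare_units


if __name__ == "__main__":
    # 120 threads -> 4 members for the load, plus 1 additional unit.
    print(suggested_group_size(120))
```

If the computed size leaves the last member well below its sweet spot, a smaller group may serve the same load equally well; measure with your own workload before committing to a size.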

Maximizing Performance

SafeNet Luna Network HSMs used in an HA group can improve performance for asymmetric single-part operations. Gigabit Ethernet connections are recommended to maximize performance. For example, we have seen the throughput of asymmetric single-part operations as much as double in a two-member group in a controlled laboratory environment (without crossing subnet boundaries, and without competing traffic or other latency-inducing factors).

Multi-part operations are not load-balanced by the SafeNet HA functionality, because of the overhead that would be needed to replicate context for each part of a multi-part operation.

Single-part cryptographic operations are load-balanced by the SafeNet HA functionality under most circumstances. Load-balancing these operations provides both scalability (better net throughput of operations) and redundancy by supporting transparent fail-over.

Performance is Dependent on the Type of Operation

Performance is also affected by the kind of operation you are performing. HA improves performance most when all HSM operations are performed on keys and material that already reside within the HSM. This changes if part of the operation involves importing and unwrapping keys; it is instructive to compare what happens when such HSM operations are performed with and without HA.

With HA

>One encryption (to wrap the key)

>One decryption in the HSM (to unwrap the key)

>Object creation on the HSM (the unwrapped key is created and stored as a key object)

>Key replication happens for HA:

    >RSA 4096-bit operation used to derive a shared secret between the HSMs

    >Encryption of the key on the primary HA member using the shared secret

    >Decryption of the key on the secondary HA member HSM using the shared secret

    >Object creation on the second HA member

>One encryption (uses the unwrapped key object to encrypt the data)

Without HA

>One encryption (to wrap the key)

>One decryption in the HSM (to unwrap the key)

>Object creation on the HSM (the unwrapped key is created and stored as a key object)

>One encryption (uses the unwrapped key object to encrypt the data)

From the above it is apparent that, with HA, many more operations are performed. Most significant in the above case are the RSA 4096-bit operation and the additional object creation performed. Those two operations are by far the slowest operations in the list, and so this type of task would have much better performance without HA.
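The comparison can be made concrete by writing both step lists out and counting them. The sketch below only restates the lists from this section in Python; the variable and function names are illustrative, and no timing data is implied.

```python
# Steps from the "Without HA" list in this section.
WITHOUT_HA = [
    "encrypt: wrap the key",
    "decrypt: unwrap the key in the HSM",
    "create object: store the unwrapped key on the HSM",
    "encrypt: use the unwrapped key object on the data",
]

# Extra sub-steps listed under "Key replication happens for HA".
HA_REPLICATION = [
    "RSA 4096-bit: derive a shared secret between the HSMs",
    "encrypt: key on the primary member using the shared secret",
    "decrypt: key on the secondary member using the shared secret",
    "create object: store the key on the second member",
]

# With HA, replication runs after object creation and before the
# final data encryption.
WITH_HA = WITHOUT_HA[:3] + HA_REPLICATION + WITHOUT_HA[3:]


def extra_steps(with_ha, without_ha):
    """Return the steps that appear only in the HA sequence."""
    return [step for step in with_ha if step not in without_ha]


if __name__ == "__main__":
    print(f"without HA: {len(WITHOUT_HA)} steps, with HA: {len(WITH_HA)} steps")
    for step in extra_steps(WITH_HA, WITHOUT_HA):
        print("extra with HA:", step)
```

The step count doubles, and two of the added steps (the RSA 4096-bit operation and the second object creation) are the ones identified above as the slowest.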

By contrast, if the task had made use of objects already within the HSM, then at most a single synchronization would have propagated the objects to all HA members, and all subsequent operations would have seen a performance boost from HA operation. The crucial consideration is whether the objects being manipulated are constant or are constantly being replaced.
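One way to reason about this trade-off is a toy break-even model, sketched below in Python. It is an assumption-laden illustration and not a description of how the HSM schedules work: it assumes a one-time replication cost per key, identical per-operation cost on every member, and ideal linear scaling across members. The function name, parameters, and sample numbers are all hypothetical, and the cost units are arbitrary.

```python
def break_even_operations(replication_cost: float,
                          op_cost: float,
                          members: int) -> float:
    """Operations per key above which HA comes out ahead (toy model).

    Modelled total cost for n operations on one key:
      without HA: n * op_cost
      with HA:    replication_cost + n * op_cost / members
    The second line assumes ideal linear scaling, which is an
    idealization; real results depend on network and workload.
    """
    if members < 2:
        raise ValueError("an HA group needs at least two members")
    if op_cost <= 0 or replication_cost < 0:
        raise ValueError("costs must be positive")
    # Cost saved on each operation by spreading work over the group.
    saving_per_op = op_cost * (1 - 1 / members)
    return replication_cost / saving_per_op


if __name__ == "__main__":
    # Hypothetical numbers: replication costs 50 units, one operation
    # costs 1 unit, and the group has two members.
    print(break_even_operations(replication_cost=50.0, op_cost=1.0, members=2))
```

In this model, a key that is used for only a handful of operations before being replaced never recovers its replication cost, whereas a long-lived key recovers it quickly. That matches the distinction drawn above between objects that are constant and objects that are constantly being replaced.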

HA and FindObjects

How your application uses the C_FindObjects function to search for objects in a virtual HA slot can have a significant impact on your application's performance. See Application Object Handles for more information.