Please Note:

Clusters and Nodes

Note

This feature is not included in Community Edition and requires a valid Virtual CipherTrust Manager license. To activate your instance with a trial evaluation, or a term or perpetual license, see Licensing.

Overview

Clusters and nodes are the resources used to create and manage CipherTrust Manager clustering.

A cluster is a group of connected CipherTrust Manager appliances that share data. A cluster node is an individual appliance within the cluster. A cluster configuration is needed to start grouping CipherTrust Manager appliances together. Application Administrators, such as admin, can perform cluster operations in the root domain.

The main purpose of clustering is to support High Availability. When clustered, all cluster nodes continuously synchronize their databases with each other. Any node in the cluster may be contacted for any operation. A cluster supports any combination of appliance models; both virtual and physical appliances can join the same cluster. However, all cluster nodes must be the same firmware version during normal operation, and this is enforced during the join operation. Upgrading CipherTrust Manager firmware version for nodes in a cluster requires a special sequence.

Nodes in a cluster communicate over port 5432. Members of the cluster must have bi-directional access to each other on port 5432.

Note

For proper cluster functionality, every node must be synchronized to the same time. Otherwise, communication between nodes fails. You must manually configure the same NTP server for each node. The NTP server configuration is not replicated, so you must configure each node independently.

Cluster Disconnection

If a cluster node is disconnected or powers down, an alarm is set and syslog messages are sent.

When this happens, you should first monitor the node and perform basic troubleshooting to determine if the disconnection is temporary, and the node will eventually resynchronize to the cluster.

If these actions don't resolve the connection, you should consider the node permanently disconnected and remove the node from the cluster.

Initial Monitoring and Troubleshooting

First determine if the disconnection is temporary and recoverable. When the node comes back it synchronizes with other members of the cluster. It can take a few minutes to a few hours for data to fully synchronize.

To monitor cluster node status during this time, there are multiple options.

As an Application Administrator such as adminview the cluster summary in ksctl.
As an Application Administrator view the status of all nodes in ksctl.
As an Application Administrator view cluster error information in ksctl.
As an Application Administrator view the status of an individual cluster node in ksctl or in the GUI.
If the above statuses display errors, obtain more detailed information to troubleshoot. As ksadmin in an SSH session, use kscfg cluster-report to generate a detailed information on each node. Move the resulting tar.bz2 file to a different machine and unpack it. View the following data files in edb-lasso-report-<UUID>/postgresql/dbs/kylo/bdr:
- node_summary.data
- worker_errors.data
- monitor_group_raft_details.data
- node_slots.data
- node_replication_rates.data

If a node lists its state as "down" persistently, and the dataLag value in the node status results keeps increasing over time, those are indications that the connection will not repair itself, and you should consider the node permanently disconnected.

Recover from Permanent Disconnection

In this scenario, we recommend that you remove the disconnected node from the cluster. This prevents the remaining nodes from continually storing catch-up changes for the missing node, which would eventually fill up their local volumes.

Issue the ksctl cluster nodes delete command to one remaining node.
That node then sends the same command to the rest of the cluster.
Reset the disconnected node.
```
kscfg system reset -y
```
Warning
Refer to section Services Reset for important information on using this command.
Join the disconnected node back to the cluster.

Cluster Size and Performance

The maximum number of cluster nodes is 20. Joining a 21st node will fail.

For optimal performance and stability, the recommended cluster size is 4-6 nodes. If you desire a larger cluster size, be sure to plan carefully for these scalability considerations:

Good connectivity is required between every node. As each node is continually connected to every other node, network issues can cause synchronization problems between nodes, creating maintenance and troubleshooting overhead.
Upgrading firmware for a larger cluster can be complex and time-consuming.
- Each node must be upgraded individually.
- For online in-place upgrade, running different CipherTrust Manager firmware versions within a cluster is only supported for one version back, and only during the upgrade window.

What is not clustered?

Most things replicate across the cluster. However, the following items are specific to an individual node, and are not clustered:

Backup files
Backup keys
Debug, KMIP Activity, and NAE Activity logs
NTP configuration
HSM configuration
Instance name
Virtual CipherTrust Manager license
Note
Licenses for connector applications are replicated. This is because the 'Connector Lock Code', unlike the 'Key Manager Lock Code', is cluster-wide.
Interface certificates
Proxy settings

CipherTrust Manager internally uses a flexible security architecture to allow clustering instances with both shared and distinct 'root of trust' configurations.

If the member node and the new node share the same HSM partition, it is recommended to pass the flag --shared-hsm-partition during nodes create or fulljoin, and to set each node to use the same Root of Trust (RoT) key using ksctl rot-keys rotate --id <root-of-trust-id>. This configuration further increases the security of cluster join procedure by ensuring that CipherTrust Manager secrets and master keys are protected by an HSM-secured RoT key during the join process, when these objects are transferred between nodes for the first time.

When the new node uses a distinct HSM partition or does not use any HSM, it is still possible to join it to an existing cluster by not specifying the --shared-hsm-partition flag.

Caution

Each CipherTrust Manager node in the cluster uses its own root of trust configuration to protect the Key Encryption Key (KEK) chain that secures the sensitive data of the cluster. When a new node, not connected to an HSM, joins a CipherTrust Manager cluster with all HSM connected nodes, it becomes the weakest point in the key hierarchy and potentially weakens the overall security of the cluster.

Managing Clusters and Nodes through UI

Cluster management operations are available on the Admin Settings>Cluster page of the UI in the root domain, to Application Administrators such as admin.

Creating a New Cluster Configuration

A cluster configuration is necessary before you can join nodes to the cluster.

Login to the root domain as an Application Administrator, such as admin. The CipherTrust Manager you log into becomes the first node of the new cluster.
Navigate to Admin Settings > Cluster.
Click Manage Cluster and select Add cluster from the dropdown options.
Review the Introduction describing the effects of creating a cluster configuration.
Click Next.
Provide the Privately accessible IP Address or Hostname. This is the IP address or hostname used for data replication between cluster nodes.
Review the Public address connectors will use to access this node value and change it if necessary. In most cases, this is the same value as the Privately accessible IP Address or Hostname. This is the Fully Qualified Domain Name or public IP address of the CipherTrust Manager you are logged into. This is the value which external clients use to communicate with the CipherTrust Manager.
Click Add Cluster.
The current CipherTrust Manager is listed as a Host in the table, with a Status of ready.

Joining a Node to a Cluster

Joining involves operations on two CipherTrust Manager instances.

A joining node - This is a node that you wish to add into the cluster. Some services restart on the joining node during the join, making the node inaccessible. As well, its data is reset and synchronized with the cluster.
A member node - This is a node that is already in the cluster, and which can exchange certificates with the joining node.

Starting in version 2.12, CipherTrust Manager enforces that the joining node must be the same firmware version as the nodes already in the cluster. For example, if you attempt to join a node running 2.11 firmware to a cluster of nodes with 2.12 firmware, the join operation fails.

Caution

Restoring a backup to the node immediately before the join is unnecessary and strongly discouraged, as joining a cluster resets and synchronizes the node's data. If you have recently restored a backup on the CipherTrust Manager, make sure the restore operation is complete, all services are up and running, and that at least 10 minutes have passed. Then, you can join the CipherTrust Manager to a cluster. If you join a cluster before or soon after a restore is complete, this can result in other nodes in the cluster failing to reboot.

To join a node to a cluster

Login to the root domain as an Application Administrator, such as admin to a member node.
Navigate to Admin Settings > Cluster.
Click Manage Cluster and select Add node from the dropdown options.
Provide connection information for the joining node.
- Public URL of the new node - The general public hostname, IP address, or FQDN of the joining node.
- Connector Public address of the new node - The hostname, IP address, or FQDN of the joining node accessible by connector clients.
If you are using a separate network for cluster nodes to communicate with each other:
1. Uncheck the Cluster address is the same as the Public URL hostname option
2. Provide the Cluster network address of the new node.
Note
The IP address or FQDN you provide here is used by the nodes to communicate with each other. For example, in AWS deployments, if the nodes are on the same VPC, the internal IP address of the member and joining node should be used, not a public IP.
Additionally, to successfully join a cluster, a CipherTrust Manager's hostname can consist only of lowercase letters, numbers, and hyphens. The default hostname of ciphertrust meets these criteria. The ksadmin user can view and change the hostname with the kscfg utility.
To successfully join a cluster using an FQDN, uppercase letters are not allowed.
If you wish to use a shared HSM root of trust configuration, enable the Shared HSM Partition checkbox.
Click Add node.
The current node now attempts to connect to the joining node. 60 connection attempts are allowed. Another tab opens in the broswer for the joining node's Cluster page.
Switch to the other window or tab to the joining node to accept the request. Login as an Application Administrator, such as admin.
A popup Join Node to Cluster is displayed.
Review the displayed information.
- The origin of the request indicates the IP address or hostname of the cluster node making the request.
- The IP address/hostname and port for cluster operations on the joining node are displayed.
If the information is correct, click Join to confirm.
Screens on both CipherTrust Manager demonstrate the joining node passing a certificate signing request to the member node to sign, the member node passing the signed certificate back, and then the join operation succeeding.
The joining node restarts, and the cluster replicates data to the node.
When the operation is complete, the joining node is listed as a Host in the table, with a Status of ready.

Caution

The join operation might take some time to complete, depending on network speed and database size. Completion times of 30 minutes are not unusual.

While a join is underway, do not restart the system, as this can cause the join to fail, and the node to disappear from the cluster. There is no need to manually restart the system after join is completed as well.

You can always view the node status to check progress while a join is underway. During the join, the node displays a joining status and indicates progress through five stages: creating, bootstrapping, initial sync, catching up, and completing; the down status can also appear for a short time. If there is a failure during joining, the node status is killed or unknown, or a persistent down status.

Removing a Node from a Cluster

To remove a node from a cluster, first remove that node from the cluster then delete the cluster configuration on the removed node. Deleting the cluster configuration allows a node to rejoin the original cluster, join another cluster, or create a new cluster.

Login to the root domain as an Application Administrator, such as admin, to a node in the cluster which will remain in the cluster.
Note
You cannot remove the node you are logged into from the cluster.
Navigate to Admin Settings > Cluster.
In the table, find the node you want to remove. Click the overflow icon () corresponding to the desired node and click Remove Node.
A confirmation pop-up displays.
Click Remove to confirm.
Login to the root domain as an Application Administrator, such as admin, to the newly removed node.
Navigate to Admin Settings > Cluster.
Click Manage Cluster and select Delete cluster configuration from the dropdown options.
A confirmation pop-up displays.
Click Delete to confirm. This operation restarts the CipherTrust Manager and can take up to 5 minutes.

Re-joining a Node to a Cluster

Before rejoining it is recommended to reset the node.

To reset the node enter the command in an SSH session as the ksadmin user:

$ kscfg system reset -y

Warning

Refer to section Services Reset for important information on using this command.

Join the node as described in Joining a Node to a Cluster.

Deleting a Cluster

Remove all but one node from the cluster, taking care to delete the cluster configuration on each removed node.
Login to the root domain as an Application Administrator, such as admin, to the last remaining node in the cluster.
Navigate to Admin Settings > Cluster.
Click Manage Cluster and select Delete cluster configuration from the dropdown options.
A confirmation pop-up displays.
Click Delete to confirm. This operation can take up to 5 minutes.

Viewing Cluster Node Status

You can view information on the general health of cluster nodes through the web console UI. More detailed replication health information is available through the "ksctl cluster summary" CLI command.

Login to the root domain as an Application Administrator, such as admin, to the last remaining node in the cluster.
Navigate to Admin Settings > Cluster.
View the cluster nodes listed in the table.
Each Host lists a Status Code and Status. As well there is a visual depiction of nodes and their statuses below the table, with hexagons representing individual nodes.

The possible statuses are:

nodes are in different states across databases
not clustered
ready
joining: creating (1/5)
joining: bootstrapping (2/5)
joining: initial sync (3/5)
joining: catching up (4/5)
joining: completing (5/5)
killed
down
unknown

Managing Clusters and Nodes through ksctl CLI

Use the ksctl command line interface for managing clusters and nodes. The relevant commands are:

ksctl cluster csr
ksctl cluster delete
ksctl cluster error
ksctl cluster fulljoin
ksctl cluster info
ksctl cluster join
ksctl cluster new
ksctl cluster nodes list
ksctl cluster nodes get
ksctl cluster nodes create
ksctl cluster nodes delete

Note

All global flags can be configured using a configuration file. Otherwise, you will need to specify the URL, username and password for each call as demonstrated in a few examples below.

Check the Status of a Cluster

To check the status of a cluster enter the command:

ksctl cluster info

This returns the following response:

{
  "nodeID": "",
  "status": {
    "code": "none",
    "description": "not clustered"
  },
}

Create a New Cluster Configuration

Make the following call on a running appliance (you will need to insert the hostname/IP of the appliance). When running cluster new, --host and --url are most likely going to be identical except that --host will not have the protocol. To create a new cluster enter the command:

ksctl cluster new --host=localHostName --url=urlOfCurrentNode --user=username --password=Password_1

This returns the following response:

{
  "nodeID": "ab40e178-5f1d-4f03-8b26-7ca378f74988",
  "status": {
    "code": "r",
    "description": "ready"
  },
  "nodeCount": 1
}

Show the Status of All Nodes in the Cluster

This command shows the status of the nodes in the cluster from the current node's point of view. This means that you do not see detailed replication health information about the cluster node the command is run on. To return full information for every node, including the current node, use the cluster summary command.

Enter the following command on any node:

ksctl cluster nodes list

This returns the following response:

{
  "skip": 0,
  "limit": 256,
  "total": 1,
  "resources": [
    {
      "nodeID": "c05ead39-94ac-4459-a371-9915e3e16ebf",
      "status": {
        "code": "r",
        "description": "ready"
      },
      "host": "kylo_pg_1",
      "isThisNode": true
    }
  ]
}

The possible statuses are:

nodes are in different states across databases
not clustered
ready
joining: creating (1/5)
joining: bootstrapping (2/5)
joining: initial sync (3/5)
joining: catching up (4/5)
joining: completing (5/5)
killed
down
unknown

Show the Status of a Single Node in the Cluster

To show the status of a single node in a cluster enter the command:

ksctl cluster nodes get --id=c05ead39-94ac-4459-a371-9915e3e16ebf

This returns the following response:

{
  "nodeID": "c05ead39-94ac-4459-a371-9915e3e16ebf",
  "status": {
    "code": "r",
    "description": "ready"
  },
  "host": "kylo_pg_1",
  "isThisNode": true
}

Show a summary of all nodes in the cluster

To show a summary of all nodes in a cluster, run:

ksctl cluster summary

This returns the following response:

{
    "skip": 0,
    "limit": 0,
    "total": 3,
    "resources": {
        "2dda1a9e4de4438b9fab1c7efc95bea8": {
            "clusterSummary": {
                "2dda1a9e4de4438b9fab1c7efc95bea8": {
                    "clusterErrors": null,
                    "nodeInfo": {
                        "host": "54.224.193.194",
                        "isThisNode": true,
                        "nodeID": "2dda1a9e4de4438b9fab1c7efc95bea8",
                        "port": 5432,
                        "publicAddress": "54.224.193.194",
                        "status": {
                            "code": "r",
                            "description": "ready"
                        }
                    },
                    "summary": "GOOD"
                },
                "f374bb3eab384692843e8dc60a9c2da2": {
                    "clusterErrors": null,
                    "nodeInfo": {
                        "applyRate": 2500,
                        "catchupInterval": "00:00:00",
                        "connectTime": "2022-08-24T05:40:23.706218Z",
                        "connected": true,
                        "flushLSN": "0/50D6ED0",
                        "flushLag": "00:00:00.000873",
                        "flushLagBytes": 0,
                        "flushLagSize": "0 bytes",
                        "host": "54.144.33.167",
                        "isThisNode": false,
                        "nodeID": "f374bb3eab384692843e8dc60a9c2da2",
                        "port": 5432,
                        "publicAddress": "54.144.33.167",
                        "replayLSN": "0/50D6ED0",
                        "replayLag": "00:00:00.000873",
                        "replayLagBytes": 0,
                        "replayLagSize": "0 bytes",
                        "replicationBlocked": false,
                        "sentLSN": "0/50D6ED0",
                        "sentLagBytes": 0,
                        "sentLagSize": "0 bytes",
                        "state": "streaming",
                        "status": {
                            "code": "r",
                            "description": "ready"
                        },
                        "uptime": "00:03:42.332005",
                        "writeLSN": "0/50D6ED0",
                        "writeLag": "00:00:00.000873",
                        "writeLagBytes": 0,
                        "writeLagSize": "0 bytes"
                    },
                    "summary": "GOOD"
                },
                "f50adf6919c047f0b83c2bf73553f962": {
                    "clusterErrors": null,
                    "nodeInfo": {
                        "applyRate": 2504,
                        "catchupInterval": "00:00:00",
                        "connectTime": "2022-08-24T05:28:49.73947Z",
                        "connected": true,
                        "flushLSN": "0/50D6ED0",
                        "flushLag": "00:00:00.001327",
                        "flushLagBytes": 0,
                        "flushLagSize": "0 bytes",
                        "host": "18.234.75.110",
                        "isThisNode": false,
                        "nodeID": "f50adf6919c047f0b83c2bf73553f962",
                        "port": 5432,
                        "publicAddress": "18.234.75.110",
                        "replayLSN": "0/50D6ED0",
                        "replayLag": "00:00:00.001327",
                        "replayLagBytes": 0,
                        "replayLagSize": "0 bytes",
                        "replicationBlocked": false,
                        "sentLSN": "0/50D6ED0",
                        "sentLagBytes": 0,
                        "sentLagSize": "0 bytes",
                        "state": "streaming",
                        "status": {
                            "code": "r",
                            "description": "ready"
                        },
                        "uptime": "00:15:16.298753",
                        "writeLSN": "0/50D6ED0",
                        "writeLag": "00:00:00.001327",
                        "writeLagBytes": 0,
                        "writeLagSize": "0 bytes"
                    },
                    "summary": "GOOD"
                }
            },
            "lastUpdated": "2022-08-24T05:44:06.112975Z"
        },
        "f50adf6919c047f0b83c2bf73553f962": {
            "clusterSummary": {
                "2dda1a9e4de4438b9fab1c7efc95bea8": {
                    "clusterErrors": [
                        {
                            "errorMessage": "could not connect to the postgresql server in replication mode: timeout expired\n",
                            "errorTime": "2022-08-24T06:20:36.959429Z"
                        }
                    ],
                    "nodeInfo": {
                        "applyRate": 945,
                        "catchupInterval": "00:00:00",
                        "connected": false,
                        "flushLSN": "0/3E1A700",
                        "flushLag": "00:00:00.00079",
                        "flushLagBytes": 0,
                        "flushLagSize": "0 bytes",
                        "host": "54.224.193.194",
                        "isThisNode": false,
                        "nodeID": "2dda1a9e4de4438b9fab1c7efc95bea8",
                        "port": 5432,
                        "publicAddress": "54.224.193.194",
                        "replayLSN": "0/3E1A700",
                        "replayLag": "00:00:00.00079",
                        "replayLagBytes": 0,
                        "replayLagSize": "0 bytes",
                        "replicationBlocked": false,
                        "sentLSN": "0/3E1A700",
                        "sentLagBytes": 0,
                        "sentLagSize": "0 bytes",
                        "state": "streaming",
                        "status": {
                            "code": "d",
                            "description": "down"
                        },
                        "writeLSN": "0/3E1A700",
                        "writeLag": "00:00:00.00079",
                        "writeLagBytes": 0,
                        "writeLagSize": "0 bytes"
                    },
                    "summary": "BAD:node is inactive"
                },
                "f374bb3eab384692843e8dc60a9c2da2": {
                    "clusterErrors": [
                        {
                            "errorMessage": "could not connect to the postgresql server in replication mode: timeout expired\n",
                            "errorTime": "2022-08-24T06:20:36.973754Z"
                        }
                    ],
                    "nodeInfo": {
                        "applyRate": 936,
                        "catchupInterval": "00:00:00",
                        "connected": false,
                        "flushLSN": "0/3E1A700",
                        "flushLag": "00:00:00.000856",
                        "flushLagBytes": 0,
                        "flushLagSize": "0 bytes",
                        "host": "54.144.33.167",
                        "isThisNode": false,
                        "nodeID": "f374bb3eab384692843e8dc60a9c2da2",
                        "port": 5432,
                        "publicAddress": "54.144.33.167",
                        "replayLSN": "0/3E1A700",
                        "replayLag": "00:00:00.000856",
                        "replayLagBytes": 0,
                        "replayLagSize": "0 bytes",
                        "replicationBlocked": false,
                        "sentLSN": "0/3E1A700",
                        "sentLagBytes": 0,
                        "sentLagSize": "0 bytes",
                        "state": "streaming",
                        "status": {
                            "code": "d",
                            "description": "down"
                        },
                        "writeLSN": "0/3E1A700",
                        "writeLag": "00:00:00.000856",
                        "writeLagBytes": 0,
                        "writeLagSize": "0 bytes"
                    },
                    "summary": "BAD:node is inactive"
                },
                "f50adf6919c047f0b83c2bf73553f962": {
                    "clusterErrors": null,
                    "nodeInfo": {
                        "host": "18.234.75.110",
                        "isThisNode": true,
                        "nodeID": "f50adf6919c047f0b83c2bf73553f962",
                        "port": 5432,
                        "publicAddress": "18.234.75.110",
                        "status": {
                            "code": "r",
                            "description": "ready"
                        }
                    },
                    "summary": "GOOD"
                }
            },
            "lastUpdated": "2022-08-24T06:21:05.581794Z"
        }
    }
}

The response displays:

summary of each node
cluster error information, if any
human readable consolidated summary
last updated information

Show all cluster errors

To show a summary of cluster error information, if any, run:

ksctl cluster errors

Join a Node to a Cluster

You can join a node to a cluster in one command or three commands.

During cluster join, the node's data is reset and synchronized with the cluster. This means that restoring a backup to the node is unnecessary.

Note

The ksctl commands to join a node require an IP address, hostname, or FQDN of both the cluster member and of the joining node. These IP addresses and hostnames are used by the nodes to talk to each other, and not by the CLI to talk to the nodes. For example, in AWS deployments, if the nodes are on the same VPC, the internal IP address of the member and joining node should be used, not a public IP.

Additionally, to successfully join a cluster, a CipherTrust Manager's hostname can consist only of lowercase letters, numbers, and hyphens. The default hostname of ciphertrust meets these criteria. The ksadmin user can view and change the hostname with the kscfg utility.

To successfully join a cluster using an FQDN, uppercase letters are not allowed.

Caution

If allowlist is enabled on any node in the cluster, before adding the new node to the cluster, ensure that the new node's IP address is added to the allowlist of nodes where allowlist is active. Refer to cluster allowlist for details.

One Step Cluster Join Using fulljoin

It is simplest to use the fulljoin command to join a node. For the cluster fulljoin command, you are required to input a member's IP address, hostname, or FQDN, the joining node's IP address, hostname, or FQDN, and either the joining node's configuration file, or its username, password, and URL.

To complete the process using the fulljoin command with a configuration file enter the command:

ksctl cluster fulljoin --member=<member_IP_or_hostname_or_FQDN> --newnodehost=<joining_IP_or_hostname_or_FQDN> --newnodeconfig=<config_File_of_Joining_Node>.

As with the individual commands, you must wait until the join operation has completed and the node status is ready before joining another node.

Response:

When you add a node to a cluster, the existing data of the node is deleted. Are you sure want to join? [y/N]

Attempting to get the CSR from the joining node...
Finished getting the CSR from the joining node...
Attempting to get the Certificate and CA Chain from the member node...
Finished getting the Certificate and CA Chain from the member node...
Attemping to join the new node to the member's cluster...

{
  "nodeID": "",
  "status": {
    "code": "creating",
    "description": "joining: creating (1/5)"
  }
}

When a node joins a cluster, the node adopts the credentials of the cluster. Would you like to write the new cluster credentials

to the provided configuration file? [y/N]

y

An example of the cluster fulljoin command without using a configuration file:

ksctl cluster fulljoin --member=<member_IP_or_hostname_or_FQDN> --newnodehost=<joining_IP_or_hostname_or_FQDN> --newnodepass=<joining_node_password> --newnodeuser=<joining_node_user> --newnodeurl=<joining_node_URL>

Response:

When you add a node to a cluster, the existing data of the node is deleted. Are you sure want to join? [y/N]

y

Attempting to get the CSR from the joining node...
Finished getting the CSR from the joining node...
Attempting to get the Certificate and CA Chain from the member node...
Finished getting the Certificate and CA Chain from the member node...
Attemping to join the new node to the member's cluster...

{
  "nodeID": "",
  "status": {
    "code": "creating",
    "description": "joining: creating (1/5)"
  }
}

Three Step Cluster Join

Get the csr from the joining node by entering the command ksctl cluster csr
Get the certificate and CA chain from a member node by entering the command ksctl cluster nodes create
Run the join command from the joining node by entering the command ksctl cluster join
Caution
If you have recently restored a backup on the CipherTrust Manager, make sure the restore operation is complete and all services are up and running. Then, you can join the CipherTrust Manager to a cluster. If you join a cluster before restore is complete, this can result in other nodes in the cluster failing to reboot.
Caution
The join operation might take some time to complete, depending on network speed and database size. Completion times of 30 minutes are not unusual.
While a join is underway, do not restart the system, as this can cause the join to fail, and the node to disappear from the cluster. There is no need to manually restart the system after join is completed as well.
You can always view the node status to check progress while a join is underway. During the join, the node displays a joining status and indicates progress through five stages: creating, bootstrapping, initial sync, catching up, and completing; the down status can also appear for a short time. If there is a failure during joining, the node status is killed or unknown, or a persistent down status.
Check that the join operation has completed and that the node status is ready before joining another node.

Remove a Node from a Cluster

Remove the node from the cluster. A node cannot remove itself, so you must call this on some other node in the cluster:
```
ksctl cluster nodes delete --id=ebab0738-6e09-4b0d-8c99-850d7f24dfac
```
Delete the cluster configuration on the removed node:
```
ksctl cluster delete
```
Note
If ksctl cluster delete doesn't work for any reason it is always possible to perform a full system reset to ensure any left over data is removed from the node. A node can be reset using the ksctl services reset command.

Warning

Refer to section Services Reset for important information on using this command.

Rejoin a Node to a Cluster

Before rejoining it is recommended to reset the node.

To reset the node enter the command:

$ ksctl services reset

Warning

Refer to section Services Reset for important information on using this command.

Join the node as described in join a node to a cluster.

Create a New Cluster from a Removed Node

Ensure the node have been successfully removed from the cluster and that ksctl cluster delete has been performed, as described in remove a node from a cluster.

Create a new cluster as described in create a new cluster.

Cluster Allowlist

The CipherTrust Manager introduces a feature to configure allowlist for the cluster. The allowlist is a list of IPs that are allowed to connect to the cluster port.

To enable this feature, you need to add at least one IP to the allowlist. By default, this feature is disabled.

Note

Only IPs (IPv4, IPv6) are allowed in the allowlist.

Caution

When a public IP is used in the "--host" field while creating a cluster, that IP must be added to the allowlist of the current node.

Configure Cluster Allowlist

Note

The cluster allowlist can only be configured through kscfg.
The CipherTrust Manager GUI only allows to view allowlist for each node, if enabled. To do so:
1. Log on to CipherTrust Manager GUI.
2. Go to Admin Settings > Cluster.
3. In the Cluster window, click the ellipsis on the right side of a node and select View Node Allowlist.
  It displays the allowed IP list for that specific node.

To configure, SSH to CipherTrust Manager with ksadmin user.

The kscfg commands are as follows:

add: adds IPv4 and IPv6 address to cluster allowlist
list: lists all IPs in the cluster allowlist
remove: removes IPv4 and IPv6 address from the cluster allowlist

Add IPs to allowlist

To add an IP to allowlist, run:

Syntax

kscfg cluster allowlist add --ip <node-ip>

Example Request

kscfg cluster allowlist add --ip 3.94.160.118

There will be no response if node is added successfully in a cluster.

Note

You can add only one node IP at a time in the cluster allowlist.

Fetch list of IPs added to allowlist

To get the list of IPs added to allowlist, run:

Syntax

kscfg cluster allowlist list

Example Request

kscfg cluster allowlist list

Example Response

[
"3.94.160.118"
]

Remove IPs from allowlist

To remove an IP from allowlist, run:

Syntax

kscfg cluster allowlist remove --ip <node-ip>

Example Request

kscfg cluster allowlist remove --ip 3.94.160.118

There will be no response if node is removed successfully from the cluster.

Clusters and Nodes

Overview

Cluster Disconnection

Initial Monitoring and Troubleshooting

Recover from Permanent Disconnection

Cluster Size and Performance

What is not clustered?

Managing Clusters and Nodes through UI

Creating a New Cluster Configuration

Joining a Node to a Cluster

Removing a Node from a Cluster

Re-joining a Node to a Cluster

Deleting a Cluster

Viewing Cluster Node Status

Managing Clusters and Nodes through ksctl CLI

Check the Status of a Cluster

Create a New Cluster Configuration

Show the Status of All Nodes in the Cluster

Show the Status of a Single Node in the Cluster

Show a summary of all nodes in the cluster

Show all cluster errors

Join a Node to a Cluster

One Step Cluster Join Using fulljoin

Three Step Cluster Join

Remove a Node from a Cluster

Rejoin a Node to a Cluster

Create a New Cluster from a Removed Node

Cluster Allowlist

Configure Cluster Allowlist

Add IPs to allowlist

Fetch list of IPs added to allowlist

Remove IPs from allowlist

On this page

Suggest A Change

Clusters and Nodes

Overview

Cluster Disconnection

Initial Monitoring and Troubleshooting

Recover from Permanent Disconnection

Cluster Size and Performance

What is not clustered?

Sharing HSM Root of Trust Configuration

Managing Clusters and Nodes through UI

Creating a New Cluster Configuration

Joining a Node to a Cluster

Removing a Node from a Cluster

Re-joining a Node to a Cluster

Deleting a Cluster

Viewing Cluster Node Status

Managing Clusters and Nodes through ksctl CLI

Check the Status of a Cluster

Create a New Cluster Configuration

Show the Status of All Nodes in the Cluster

Show the Status of a Single Node in the Cluster

Show a summary of all nodes in the cluster

Show all cluster errors

Join a Node to a Cluster

One Step Cluster Join Using fulljoin

Three Step Cluster Join

Remove a Node from a Cluster

Rejoin a Node to a Cluster

Create a New Cluster from a Removed Node

Cluster Allowlist

Configure Cluster Allowlist

Add IPs to allowlist

Fetch list of IPs added to allowlist

Remove IPs from allowlist

On this page