Deploying CCC in High Availability Configurations
This page presents a sample configuration for deploying CCC in a high availability (HA) setup. CCC offers an open and flexible architecture that supports various configuration options, including high availability. This guide covers optimizing your configuration for HA, setting up file sharing, validating the HA configuration, installing the server OS and configuring the network, setting up the PostgreSQL servers in HA mode, and setting up and configuring the CCC application servers. Follow these steps to ensure a robust and resilient CCC deployment:
Evaluate system scalability and resilience
The figure below illustrates a comprehensive CCC configuration designed for high availability. Identify the key components and evaluate how they contribute to the system's resilience and scalability. Compare this setup with your current infrastructure to gauge its suitability for your operational needs, and consider implementation and scalability challenges to ensure a robust deployment aligned with your organization's requirements.
To use the HA configuration, you need to use an external database.
Optimize your configuration for HA
By adding redundant and standby components to the basic configuration, you can create an HA system that offers the following features:
Server redundancy
A true high-availability system must continue operating without interruption if one of the system components fails. To achieve this capability, it is recommended to set up the following components:
- Two worker nodes running CCC servers
- One master node for Kubernetes
- Two PostgreSQL database servers
- Two NFS servers
Load balancing
The Kubernetes master node handles load balancing by forwarding requests to all CCC servers. Each request is taken in sequence and routed to one of the worker nodes running CCC servers. If a CCC server goes down, the request is seamlessly forwarded to the next available server, with no impact on the end user. Kubernetes is configured to support persistent (or sticky) sessions, ensuring that all requests for a specific user session are directed to the same CCC server. In the event of a failover, the user will be prompted to log in to another active server, and all subsequent requests will be directed to the new server for the remainder of the session.
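As an illustration only, the following sketch shows how ClientIP session affinity can be inspected and enabled on a Kubernetes Service; the Service name ccc-service is a placeholder and may differ in your deployment.
# Check whether session affinity is already enabled on the Service.
kubectl get service ccc-service -o jsonpath='{.spec.sessionAffinity}'
# Enable ClientIP (sticky) session affinity if it is not set.
kubectl patch service ccc-service -p '{"spec":{"sessionAffinity":"ClientIP"}}'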
Data replication
It is recommended to configure the PostgreSQL database for streaming replication between the active and standby database servers. This ensures uninterrupted access to the data stored in the CCC database in the event of a failure. Similarly, configure replication of the CCC data shared over NFS between the active and standby NFS servers, so that access to that data also continues in case of a failure. Any HA implementation should include these or similar features.
Failover protection
Kubernetes automatically redirects users to an active CCC server if one of the CCC servers on a worker node goes down. To manage database failover, you can install an application like keepalived on each database server. This allows you to use a virtual IP address to identify the active database server. If the active server goes down, the standby server assumes the virtual IP address and becomes the active server. Additionally, notification of a failover is necessary to re-synchronize the databases when the failed server is brought back online. Similarly, NFS server failover is managed by installing keepalived on each NFS server. This setup uses a virtual IP address to identify the active NFS server. If the active server goes down, the standby server takes over the virtual IP address and becomes the active server.
Operations in progress during a failover may fail.
Set up file sharing
To set up file sharing among the CCC servers, follow these steps:
Run the enableNFSSharing.sh script on the selected NFS server. The script is available on a CCC server at the following path: CCC_packages/kubernetes/enableNFSSharing.sh. Execute it with the command ./enableNFSSharing.sh <NFSOption> <IPAddress(s)>. Valid values for <NFSOption> are 1 for NFS Server and 2 for NFS Client.
If you run the enableNFSSharing.sh script without any arguments or with incorrect arguments, the following message is displayed: Usage: enableNFSSharing NFSOption[1: For NFS Server 2: For NFS Client] IPAddress. Ensure you enter valid IP addresses to avoid errors.
Navigate to the folder on the NFS server where the enableNFSSharing.sh script is located and run the following command: ./enableNFSSharing.sh 1 Server.List. Here, Server.List is a space-separated list of all CCC servers to be set up in HA mode as NFS clients. For example: ./enableNFSSharing.sh 1 20.10.10.10 30.10.10.10.
Run the following command on each CCC server to set up the NFS client: ./enableNFSSharing.sh 2 IP. Here, IP is the IP address of the NFS server, or the virtual IP address of the NFS server cluster if CCC is set up in high availability. For example: ./enableNFSSharing.sh 2 10.10.10.10. After both scripts have run, you can verify the share with the sketch below.
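The following is a minimal verification sketch. It assumes the clients mount the share at /home/ccc/packages (the path used later during uninstallation) and only confirms that the export is visible and mounted; it is not part of the enableNFSSharing.sh script.
# On each CCC server (NFS client): list the exports published by the NFS server.
showmount -e <NFS_server_or_virtual_IP>
# Confirm that the shared directory is mounted locally.
mount | grep /home/ccc/packages
# Optionally confirm that the mount is writable.
touch /home/ccc/packages/.nfs_write_test && rm -f /home/ccc/packages/.nfs_write_test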
Unmount NFS directory during CCC uninstall
To unmount the NFS directory during uninstallation, the CCC administrator should run the following command as the root user:
umount -f -l /home/ccc/packages
This step should be performed only if CCC is set up in HA mode and should be executed on all CCC servers that have been set up as NFS clients.
Validate HA configuration
This section provides an overview of the HA configuration validated by the engineering team.
Master node
The Kubernetes master node serves as the control plane for the CCC cluster on Kubernetes. It manages requests, maintains the desired number of replicas, handles load balancing, ensures failover protection, manages persistent volumes, and configures network settings for the cluster.
Worker nodes (Redundant CCC application servers)
CCC applications are deployed on two separate worker nodes. These nodes receive requests from the master node, which distributes them to individual CCC servers. In case one server becomes unavailable, requests are automatically redirected to the other available server.
PostgreSQL servers
PostgreSQL is installed on two distinct Linux workstations and configured with streaming replication. Each database server is equipped with keepalived for failover and notification mechanisms.
Install server OS and configure network
An HA deployment requires seven separate servers that exchange data over the network to operate as a unified system. When configuring CCC in HA, you need to specify the network address (IP address or hostname) of specific servers in the deployment. To simplify deployment and avoid potential misconfigurations, follow these steps:
Install CentOS 7: Begin by downloading CentOS 7 from the CentOS Download Page. Opt for the Minimal installation option to reduce security risks by installing only the necessary software. After installation, reboot the system using systemctl reboot and then run yum update to ensure you have the latest updates.
Configure IP Addresses or Hostnames: After the installation, configure static IP addresses or hostnames for each server. This step is crucial for identifying and accessing servers within the deployment network. Refer to the CentOS documentation for detailed guidance on configuring network settings; a minimal nmcli sketch is also provided after the table below.
Record Server Details: Once OS installation and network configuration are complete on each server, record the IP addresses or hostnames in the table below. This table serves as a reference for subsequent configuration tasks, ensuring accurate setup and management of your CCC HA deployment.
Ensure you reserve two additional IP addresses: one for the database cluster and one for the NFS cluster. These IP addresses need to be specified during database configuration.
Server | Alias | IP Address/Host Name |
---|---|---|
Primary PostgreSQL Server | db_primary_IP_or_hostname | [Fill in IP or Hostname] |
Standby PostgreSQL Server | db_standby_IP_or_hostname | [Fill in IP or Hostname] |
Keepalived Database Cluster VIP | keepalived_db_virtual_IP | [Fill in IP] |
Primary NFS Server | nfs_primary_IP_or_hostname | [Fill in IP or Hostname] |
Standby NFS Server | nfs_standby_IP_or_hostname | [Fill in IP or Hostname] |
Keepalived NFS Cluster VIP | keepalived_nfs_virtual_IP | [Fill in IP] |
CCC Server 1 (Worker Node 1) | ccc1_IP_or_hostname | [Fill in IP or Hostname] |
CCC Server 2 (Worker Node 2) | ccc2_IP_or_hostname | [Fill in IP or Hostname] |
Kubernetes Master Node | Master_IP_or_hostname | [Fill in IP or Hostname] |
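For the network-configuration step above, the following is a minimal sketch of assigning a static IPv4 address with nmcli on CentOS 7; the connection name, prefix length, gateway, and DNS values are placeholders that you must adapt to your environment.
# List the connections and identify the one to modify (for example, "eth0" or "System eth0").
nmcli connection show
# Assign a static address, gateway, and DNS server to the chosen connection.
nmcli connection modify <connection_name> ipv4.method manual ipv4.addresses <IP_address>/24 ipv4.gateway <gateway_IP> ipv4.dns <dns_IP>
# Re-activate the connection so the new settings take effect.
nmcli connection up <connection_name>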
Configure and set up PostgreSQL server in HA mode
This section describes how to set up a PostgreSQL HA cluster configuration consisting of a primary PostgreSQL server and a standby PostgreSQL server using streaming replication. Follow these steps to set up your PostgreSQL HA cluster:
This procedure assumes the use of PostgreSQL 14. For more detailed information, refer to the PostgreSQL documentation at PostgreSQL 14 Documentation.
Install PostgreSQL
You will need two standalone Linux servers: one for the primary server and one for the standby server. The tested and documented configuration uses CentOS 7 and PostgreSQL 14. Other operating systems may work, although they are not tested and may have different paths for some components.
Configure the primary PostgreSQL database server
To configure the primary PostgreSQL database server:
Log in as root to the server you identified as the primary PostgreSQL server and set the permissions for the session:
su root
umask 0022
Edit the PostgreSQL configuration file at /var/lib/pgsql/14/data/postgresql.conf to uncomment and update the following entries, which configure PostgreSQL for streaming replication:
listen_addresses = '<db_primary_IP_or_hostname>,<keepalived_db_virtual_IP>'
ssl = on
wal_level = replica
archive_mode = on
archive_command = 'cp %p /tmp/%f'
max_wal_senders = 3
wal_keep_size = 512MB
listen_addresses is a comma-separated list of the addresses the server will respond to. Ensure it includes the keepalived virtual IP address for the database cluster and any other required servers. Note that the checkpoint_segments and wal_keep_segments parameters from older PostgreSQL releases do not exist in PostgreSQL 14: WAL retention is controlled by wal_keep_size (the 512MB above corresponds to the former 32 segments of 16MB each), and checkpoint sizing is controlled by max_wal_size, whose default is normally sufficient.
Access to the database is controlled by the pg_hba.conf file, which is explained in the next step.
Edit the /var/lib/pgsql/14/data/pg_hba.conf file and add entries for the standby server and any other necessary hosts:
host replication all <standby_server_IP>/32 md5
host all all <network_address/mask> md5
Apply the configuration changes:
systemctl restart postgresql-14
Log in to PostgreSQL and create a replication user:
sudo -u postgres psql
CREATE ROLE replication WITH REPLICATION PASSWORD '<password>' LOGIN;
The standby server is initialized from a base backup of the primary, taken with pg_basebackup. In this procedure, pg_basebackup is run from the standby server, as described in the next section, so no separate backup step is required on the primary itself.
Configure the standby PostgreSQL database server
To configure the standby PostgreSQL database server:
Log in as root and set permissions:
su root
umask 0022
Stop the PostgreSQL service:
systemctl stop postgresql-14
Remove the existing data:
rm -rf /var/lib/pgsql/14/data/*
Take a base backup of the primary server directly into the standby's data directory using pg_basebackup (you will be prompted for the replication role's password):
sudo -u postgres pg_basebackup -h <db_primary_IP_or_hostname> -D /var/lib/pgsql/14/data -P -U replication --wal-method=stream
PostgreSQL 12 and later no longer use a recovery.conf file. Instead, add the following lines to /var/lib/pgsql/14/data/postgresql.conf on the standby server:
primary_conninfo = 'host=<db_primary_IP_or_hostname> port=5432 user=replication password=<password>'
promote_trigger_file = '/tmp/postgresql.trigger.5432'
Then create an empty standby.signal file in the data directory to put the server into standby mode:
su - postgres -c "touch /var/lib/pgsql/14/data/standby.signal"
Start PostgreSQL:
systemctl start postgresql-14
Check the replication status on the primary server:
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
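As an optional sanity check (a sketch, not a required step), you can query just the key columns on the primary and confirm that the standby appears with state streaming:
# Run on the primary; expect one row per standby with state = 'streaming'.
sudo -u postgres psql -c "SELECT client_addr, state, sent_lsn, replay_lsn, sync_state FROM pg_stat_replication;"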
Configure client authentication and SSL
To configure client authentication and SSL:
Open the /var/lib/pgsql/14/data/pg_hba.conf file and add the following lines at the beginning of the #IPv4 local connections section to allow connections from the CCC hosts:
host replication replicator <db_standby_IP_or_hostname>/32 md5
hostssl lunadirectordb lunadirector <ccc1_IP_or_hostname>/32 md5
hostssl lunadirectordb lunadirector <ccc2_IP_or_hostname>/32 md5
If both CCC servers are in the same subnet, you can instead add a single line that allows access from all devices in that subnet: hostssl lunadirectordb lunadirector <subnet>/24 md5.
Save and close the configuration file. Then restart the PostgreSQL service to apply the changes:
systemctl restart postgresql-14.service
Create a self-signed SSL certificate. This ensures secure connections between the CCC application servers and the PostgreSQL servers:
cd /var/lib/pgsql/14/data
openssl req -new -text -out server.req -nodes
openssl rsa -in privkey.pem -out server.key
rm -f privkey.pem
openssl req -x509 -in server.req -text -key server.key -out server.crt
chmod og-rwx server.key
chown postgres:postgres server.key
systemctl restart postgresql-14
When prompted for the certificate attributes, the only important attribute is the Common Name (CN), which must be set to the virtual IP of the database cluster.
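If you prefer to avoid the interactive prompts, the following is an equivalent non-interactive sketch; the 10-year validity is an assumption you may want to adjust, and the Common Name is set to the database cluster virtual IP as required above.
cd /var/lib/pgsql/14/data
# Generate the private key and a self-signed certificate in one step.
openssl req -new -x509 -days 3650 -nodes -text -out server.crt -keyout server.key -subj "/CN=<keepalived_db_virtual_IP>"
chmod og-rwx server.key
chown postgres:postgres server.key
systemctl restart postgresql-14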
Set up users and configure firewall
To set up users and configure firewall:
Create the replication and CCC users. In addition, create the CCC database and assign ownership:
su - postgres -c "psql postgres postgres -c \"CREATE USER replicator REPLICATION LOGIN PASSWORD 'dbpass';\""
su - postgres -c "psql postgres postgres -c \"CREATE USER lunadirector encrypted PASSWORD 'password';\""
su - postgres -c "psql postgres postgres -c \"CREATE DATABASE lunadirectordb OWNER lunadirector;\""
Remember the password for the lunadirector user. It will be needed later when configuring CCC.
Configure firewall. Allow the CCC servers to access the PostgreSQL database on port 5432:
iptables -I INPUT 2 -p tcp -m tcp --dport 5432 -j ACCEPT
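Note that rules added with the iptables command are not persistent across reboots by default on a minimal CentOS 7 installation. The following is a minimal sketch for persisting them, assuming you use the iptables-services package rather than firewalld:
# Install the iptables service scripts if they are not already present.
yum install -y iptables-services
# Save the current rules so they are restored at boot, and enable the service.
service iptables save
systemctl enable iptables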
Test the PostgreSQL database cluster
To verify that streaming replication is configured correctly, follow these steps to create a table on the primary database and confirm it is replicated on the standby database:
Execute the following command to create a table named test on the primary database:
su - postgres -c "psql postgres postgres -c \"CREATE TABLE test (name char(10));\""
Verify Replication on the Standby Database:
a. Log in to the standby database server as root:
su root
b. Start PostgreSQL:
systemctl start postgresql-14
c. Connect to the PostgreSQL database:
su - postgres
psql -d postgres
d. List the tables in the database:
\dt *.*
If streaming replication is configured correctly, the test table should be listed in the output. If it is not, check your configuration and try again.
Remove the test table from the primary database using the following command:
su - postgres -c "psql postgres postgres -c \"DROP TABLE test;\""
Execute the following command to attempt creating a table named test on the standby database:
su - postgres -c "psql postgres postgres -c \"CREATE TABLE test (name char(10));\""
Ensure the command fails with the following error:
ERROR: cannot execute CREATE TABLE in a read-only transaction
This error confirms that the standby database is in read-only mode, as expected.
Set up keepalived on the PostgreSQL servers
You must install the keepalived software on each PostgreSQL server to manage failover to the standby server in the event of an outage on the primary server. Once keepalived is installed and configured, if the primary server goes down, the standby server takes over as the new primary server, and the old primary server becomes the standby server. Keepalived allows you to configure a virtual IP address for the database cluster, so that database failover is transparent. For more information, refer to the keepalived documentation, available at keepalived.org. Here are the installation and configuration steps:
Install keepalived on both PostgreSQL database servers:
yum install keepalived
Edit the /etc/keepalived/keepalived.conf file on the primary server:
! Configuration File for keepalived
global_defs {
    notification_email {
    }
    notification_email_from <email_address>
    smtp_server <smtp_server_IP_or_hostname>
    smtp_connect_timeout 30
    router_id CCC_DB_MONITOR
}
vrrp_instance VI_1 {
    state MASTER
    interface <eth0 | eth1 | eth2 | ...>
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        <db_cluster_virtual_IP> dev <eth0 | eth1 | eth2 | ...>
    }
}
Replace the placeholders with appropriate values:
- <email_address>: The email address for notification messages.
- <smtp_server_IP_or_hostname>: The IP address or hostname of the SMTP server.
- <db_cluster_virtual_IP>: The virtual IP address for the database cluster.
- <eth0 | eth1 | eth2 | ...>: The network interface to bind the virtual IP address to. Use the current interface unless a different one is preferred.
Replace all existing content in the file with the provided configuration.
Edit the /etc/keepalived/keepalived.conf file on the standby server:
! Configuration File for keepalived
global_defs {
    notification_email {
    }
    notification_email_from <email_address>
    smtp_server <smtp_server_IP_or_hostname>
    smtp_connect_timeout 30
    router_id CCC_DB_MONITOR
}
vrrp_instance VI_1 {
    state BACKUP
    interface <eth0 | eth1 | eth2 | ...>
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        <db_cluster_virtual_IP> dev <eth0 | eth1 | eth2 | ...>
    }
    notify_master /root/<path_to_notify_master_script>
    notify_backup /root/<path_to_notify_backup_script>
}
Replace the placeholders with appropriate values:
- <email_address>: The email address for notification messages.
- <smtp_server_IP_or_hostname>: The IP address or hostname of the SMTP server.
- <db_cluster_virtual_IP>: The virtual IP address for the database cluster.
- <eth0 | eth1 | eth2 | ...>: The network interface to bind the virtual IP address to.
- <path_to_notify_master_script>: The path to the notify master script.
- <path_to_notify_backup_script>: The path to the notify backup script.
Replace all existing content in the file with the provided configuration. A minimal sketch of a notify script is provided after these steps.
Configure the firewall (iptables) to allow multicast on both PostgreSQL database servers:
iptables -I INPUT -i <eth0 | eth1 | eth2 | ...> -d 224.0.0.0/8 -j ACCEPT
Replace <eth0 | eth1 | eth2 | ...> with the network interface used for keepalived (VRRP) traffic.
Start keepalived on both PostgreSQL database servers, beginning with the primary:
systemctl start keepalived
Restart PostgreSQL on both PostgreSQL database servers, beginning with the primary:
systemctl restart postgresql-14
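The notify scripts referenced in the standby configuration are site-specific and are not supplied with CCC. The following is only a minimal sketch of what a notify_master script might do, assuming the promote_trigger_file setting shown earlier: it logs the transition and creates the trigger file so that the local PostgreSQL standby is promoted. A notify_backup script typically handles re-synchronizing the old primary as a standby, which depends on your environment and is not shown here.
#!/bin/bash
# Hypothetical minimal notify_master script for keepalived.
# Log the transition, then promote the local PostgreSQL standby by creating the trigger file.
logger -t keepalived-notify "This node is now MASTER for the PostgreSQL cluster; promoting local standby"
su - postgres -c "touch /tmp/postgresql.trigger.5432"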
Test keepalived
You can verify that keepalived is working by performing the following tasks on both database servers:
View the logs in /var/log/messages and check for any messages related to keepalived. This can help you identify whether there are any issues or errors in the setup.
Run the following command to see if the virtual IP is bound where you expect it to be. In normal operation, the virtual IP should be bound to the primary database server only:
ip addr show <eth0 | eth1 | eth2 | ...>
Replace <eth0 | eth1 | eth2 | ...> with the network interface you are using for the virtual IP.
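Optionally, you can perform a controlled failover drill (a sketch only, assuming you can tolerate a brief interruption, and keeping in mind that if the standby's notify script promotes PostgreSQL you will need to re-synchronize the databases afterwards):
# On the primary database server: stop keepalived to simulate a failure.
systemctl stop keepalived
# On the standby database server: the virtual IP should now appear on its interface.
ip addr show <eth0 | eth1 | eth2 | ...>
# On the primary database server: restart keepalived when the test is complete.
systemctl start keepalived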
Set up and configure CCC application server
Setting up and configuring the CCC application servers involves performing the following tasks on each server used to host the CCC application. To set up and configure the CCC servers:
Install CCC using Kubernetes. Follow the instructions in the Install CCC using Kubernetes guide. During the installation, provide the database cluster virtual IP address (the keepalived VIP) of your PostgreSQL HA setup in the configuration file (config-map.yaml).
Replicate the CCC application. Run the following command to scale the CCC application deployment:
kubectl scale --replicas=2 deployment ccc-deployment
This command runs two CCC instances, one on each of the worker nodes; a verification sketch is provided at the end of this procedure.
Follow CCC container logs. To monitor the CCC container logs, run the following command on the master node:
kubectl logs -f <pod-name>
Access CCC. Once the installation is complete on both pods, you can log in to CCC using one of the following options:
- https://<master_node_IP>
- https://<master_hostname>:30036
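To verify the deployment, the following optional sketch checks that the two replicas were scheduled on different worker nodes and that the web interface responds; it assumes a self-signed certificate (hence -k) and the NodePort 30036 shown above.
# The NODE column should show one CCC pod on each worker node.
kubectl get pods -o wide
# Expect an HTTP response (for example, 200 or a redirect to the login page).
curl -k -I https://<master_hostname>:30036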