Automating and Reducing the Duration of Initial Data Transformation
CTE for Teradata allows Teradata customers to run their database services on top of a high-performing encrypted storage infrastructure, solving many security and compliance concerns. Teradata systems are usually mission critical and cannot afford to be offline for significant periods of time. The problem, therefore, is how to convert an existing, unencrypted Teradata system into an encrypted one. Thales calls this conversion process rekeying, or data transformation.
The following steps outline the process that the Teradata operations team should follow to encrypt the platform while minimizing system downtime and operational disruption. The process leverages the fact that a Teradata system is composed of multiple disks, which create a redundant platform, and breaks the conversion into smaller, more manageable, and less disruptive steps. Operators should prepare for data transformation and schedule time to complete it while the Teradata database services are down. The process supports recovery from an unexpected I/O failure while data is being migrated from unencrypted to encrypted. This document describes how to run the transformation using a script that orchestrates the entire process across multiple nodes in a clique and significantly reduces the total data transformation time.
Warning
Thales recommends that you back up the Teradata database before starting the transformation process.
Use the cte-pdisk-idt.sh script to perform the initial rekey using embedded metadata on the Teradata pdisks that will be configured for IDT using external metadata. This reduces the time needed to encrypt the pdisks because the external metadata is a bottleneck when performing rekey on multiple devices concurrently. The script works by saving the last 63 MB of a device to a file, and then configuring the device as IDT with embedded metadata. The last 63 MB is saved because devices configured for IDT with embedded metadata relocate the first 63 MB to the end of the device to make space for the metadata. Upon completion of rekey, the script restores the first 63 MB from the end of the device and the last 63 MB from the saved file before converting the device to IDT with external metadata.
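To make the save-and-restore idea concrete, the following is a minimal sketch of how the trailing bytes of a device could be copied to a file and later written back. It is not the logic of cte-pdisk-idt.sh. The helper names target_size, save_tail, and restore_tail are hypothetical, "63 MB" is assumed to mean 63 × 1024 × 1024 bytes, and the byte-offset flags assume GNU dd. Try it only against a scratch file, never against a live pdisk.

```shell
#!/bin/bash
# Illustration only: this is NOT the logic of cte-pdisk-idt.sh.
# Assumption: "63 MB" means 63 * 1024 * 1024 bytes.
TAIL_BYTES=$((63 * 1024 * 1024))

# target_size <path>: size in bytes of a block device or regular file.
target_size() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        stat -c %s "$1"
    fi
}

# save_tail <target> <backup_file> [bytes]
# Copies the last [bytes] of <target> into <backup_file>.
save_tail() {
    local target=$1 backup=$2 bytes=${3:-$TAIL_BYTES} size
    size=$(target_size "$target") || return 1
    [ "$size" -ge "$bytes" ] || { echo "target smaller than $bytes bytes" >&2; return 1; }
    # skip_bytes/count_bytes make dd treat skip= and count= as byte values (GNU dd).
    dd if="$target" of="$backup" bs=1M status=none \
       iflag=skip_bytes,count_bytes skip=$((size - bytes)) count="$bytes"
}

# restore_tail <target> <backup_file>
# Writes <backup_file> back over the last bytes of <target>.
restore_tail() {
    local target=$1 backup=$2 size bytes
    size=$(target_size "$target") || return 1
    bytes=$(stat -c %s "$backup") || return 1
    [ "$size" -ge "$bytes" ] || return 1
    # conv=notrunc keeps the rest of the target intact; seek_bytes makes seek= a byte offset.
    dd if="$backup" of="$target" bs=1M status=none \
       oflag=seek_bytes seek=$((size - bytes)) conv=notrunc
}
```

For example, save_tail scratch.img scratch.tail followed later by restore_tail scratch.img scratch.tail returns the end of scratch.img to its saved state.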
Note
- This script only works with CTE 7.3.0 and subsequent versions.
- This script only works for the initial rekey.
Getting Help
Recovery from interruption during the preparation step requires manual intervention. Recovery from unexpected failure occurs when the GuardPoints are guarded again. You can then restart the script in the monitor mode to resume monitoring the transformation progress.
If the script is interrupted or the system crashes, contact Thales Support. If possible, include the following information to facilitate recovery:
- Output of the script
- Output of the secfsd -status guard command (if the interruption did not crash the system)
- Contents of the /tmp/pdisks.list file
- Contents of the /tmp/monitor-idt.disks file
- Contents of the /opt/teradata/vormetric/agent/secfs/.sec/conf/sod_config file
- Contents of the /var/opt/teradata/vormetric/vte-metadata-dir directory (including file sizes)
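Gathering these items by hand on a degraded node is easy to get wrong, so a small collection helper may be useful. The sketch below copies the files listed above into one directory and records anything that is missing. The function name collect_cte_diag and the output file names are hypothetical and not part of CTE. The script's own output is not captured here and should be saved separately.

```shell
#!/bin/bash
# collect_cte_diag <output_dir>  (hypothetical helper, not part of CTE)
# Gathers the support information listed above into <output_dir>.
collect_cte_diag() {
    local out=$1 f
    mkdir -p "$out" || return 1

    # GuardPoint status; recorded as unavailable if secfsd is not on PATH.
    if command -v secfsd >/dev/null 2>&1; then
        secfsd -status guard > "$out/secfsd-status-guard.txt" 2>&1 || true
    else
        echo "secfsd not found on PATH" > "$out/secfsd-status-guard.txt"
    fi

    # Copy each file named in the list, noting any that cannot be read.
    for f in /tmp/pdisks.list /tmp/monitor-idt.disks \
             /opt/teradata/vormetric/agent/secfs/.sec/conf/sod_config; do
        if [ -r "$f" ]; then
            cp -- "$f" "$out/" || echo "copy failed: $f" >> "$out/missing.txt"
        else
            echo "missing: $f" >> "$out/missing.txt"
        fi
    done

    # Directory listing including file sizes.
    ls -l /var/opt/teradata/vormetric/vte-metadata-dir \
        > "$out/vte-metadata-dir.ls-l.txt" 2>&1 || true
    return 0
}
```

For example, collect_cte_diag /root/cte-diag leaves everything in /root/cte-diag, ready to attach to a support case.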
Prerequisites
- Your policy for the external metadata directory must allow read/write access for the dd, mv, rm, and stat functions.
- Your policy for the raw device must allow read/write access with apply key effect for the dd function.
Install and Setup
- Install the agent in /opt/teradata/vormetric and set up the external metadata directory /var/opt/teradata/vormetric/vte-metadata-dir on all nodes in the clique.
  - See Install CTE on the Teradata Database Appliance for more information on installing CTE on Teradata.
- Copy the cte-pdisk-idt.sh script from /opt/teradata/vormetric/agent/secfs/.sec/bin/ to /root.
- Shut down Teradata, type:
  tpareset -x -f 'Setting up CTE'
- Generate a list of disks to encrypt, type:
  ls -l /dev/pdisk/dsk* | awk '{ print $(NF) }' > /tmp/all_pdisks.list
  Note
  This command generates a file with a list of disks in the format: /dev/disk/by-id/*.
- Divide /tmp/all_pdisks.list into separate files based on the number of active (non-HSN) nodes in the clique. For example, if the list contains 300 disks and there are 4 active nodes, there should be 4 files with 75 disks in each.
- Copy each file as /tmp/pdisks.list to the corresponding host that will oversee encrypting those disks. For example, if there are 300 disks in a 4-node clique, each node should have a unique list of 75 disks.
- Add a manual IDT GuardPoint on each node for the disks in /tmp/pdisks.list.
  Note
  The script only toggles the GuardPoints in the /tmp/pdisks.list file, so it is safe to add manual GuardPoints for all of the devices on all of the nodes. However, automated metadata distribution does not work if the GuardPoint is present on all nodes, because it expects the node performing the transformation to be the only node with the GuardPoint at this point.
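The step that divides /tmp/all_pdisks.list among the active nodes can be scripted. The following is a minimal sketch that deals the disks out round-robin, so the per-node lists differ in length by at most one entry. The function name split_pdisk_list and the output prefix are hypothetical; the procedure above only requires that each node ends up with its own unique /tmp/pdisks.list.

```shell
#!/bin/bash
# split_pdisk_list <list_file> <node_count> <output_prefix>  (hypothetical helper)
# Writes <output_prefix>00, <output_prefix>01, ... with one disk per line.
split_pdisk_list() {
    local list=$1 nodes=$2 prefix=$3

    [ -r "$list" ] || { echo "cannot read: $list" >&2; return 1; }
    # node_count must be a positive integer.
    case $nodes in
        ''|*[!0-9]*) echo "node count must be a number" >&2; return 1 ;;
    esac
    [ "$nodes" -gt 0 ] || { echo "node count must be at least 1" >&2; return 1; }

    # Line 1 goes to file 00, line 2 to file 01, and so on, wrapping around.
    awk -v n="$nodes" -v p="$prefix" '
        { f = sprintf("%s%02d", p, (NR - 1) % n); print > f }
    ' "$list"
}
```

For example, split_pdisk_list /tmp/all_pdisks.list 4 /tmp/pdisks.node writes /tmp/pdisks.node00 through /tmp/pdisks.node03, each of which would then be copied to one node as /tmp/pdisks.list. These intermediate files are not covered by the cleanup command in this procedure and would need to be removed separately.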
Configure and Initialize
- To configure the devices for IDT on each node, save the last 63 MB, and start the initial rekey, type:
  /root/cte-pdisk-idt.sh -m pre -y -f /tmp/pdisks.list
- To monitor the transformation and wait for it to complete, type:
  /root/cte-pdisk-idt.sh -m mon
- To finish the setup, type:
  /root/cte-pdisk-idt.sh -m post -f /tmp/pdisks.list
  This restores the data to the first and last 63 MB of the device, restores the external metadata, and sets the pdisk symlinks to their associated secvm device.
- To verify that identical metadata is present for all devices on all of the nodes, type the following on each node and compare the results:
  ls -1 /var/opt/teradata/vormetric/vte-metadata-dir | wc -l
  If identical metadata is not present on all of the nodes in the same clique, manually synchronize the contents of the external metadata directory.
- Unguard all of the nodes from the security server and switch to auto-guard GuardPoints.
- Use the Teradata PUT tool to update the pdisk symbolic links permanently.
- Start Teradata to verify that the database can start, type:
  /etc/init.d/tpa start
- Clean up the files created by the script, type:
  rm -f /tmp/all_pdisks.list /tmp/pdisks.list.done /tmp/cte_pdisk_idt_pre.out
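The verification step above compares only file counts. A stricter comparison, sketched below, prints a sorted manifest of file sizes and names that can be saved on each node and compared with diff. The function name metadata_manifest is hypothetical, stat -c assumes GNU coreutils, and treating matching names and sizes as evidence of identical metadata is an assumption rather than a documented CTE guarantee.

```shell
#!/bin/bash
# metadata_manifest [dir]  (hypothetical helper)
# Prints "<size_in_bytes> <file_name>" for each regular file in the
# directory, sorted by name so output from different nodes can be diffed.
metadata_manifest() {
    local dir=${1:-/var/opt/teradata/vormetric/vte-metadata-dir} f
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(stat -c %s "$f")" "${f##*/}"
    done | LC_ALL=C sort -k2
}
```

For example, running metadata_manifest > /tmp/metadata.manifest on each node and comparing the resulting files shows which entries are missing or differ in size on a given node.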
Script Help Output
/root/cte-pdisk-idt.sh -h
Response
Usage /root/cte-pdisk-idt.sh [-h] [-y] [-t #] [-m <mon|pre|post>]
      < [-p </dev/pdisk/dsk#>] </dev/disk/by-id/scsi-...>
        -f <batch_file> >
  -f <batch_file>  File with list of disks to transform
                   File should have one entry per line
                   Format is,
                     /dev/disk/by-id/tdmp-xxx
                     /dev/disk/by-id/tdmp-yyy
                     ...
                   or
                     /dev/pdisk/dsk# -> /dev/disk/by-id/tdmp-xxx
                     /dev/pdisk/dsk# -> /dev/disk/by-id/tdmp-yyy
                     ...
                   Format for the latter option can be generated with,
                     ls -l /dev/pdisk/dsk* | awk '{ print $(NF - 2), \
                     $(NF - 1), $NF }'
  -m mon           Monitors IDT status on disks undergoing transformation started with the '-m pre' option
  -m full          Performs the entire transformation process (default)
  -m pre           Performs steps to set up and begin encryption of a device
  -m post          Performs steps after device has been encrypted to restore borrowed space
                   With the 'full' mode, the script will wait while the device undergoes encryption. The 'pre' mode will start
                   encryption and then terminate the script and wait for a user to rerun the script with the 'post' mode after encryption is complete.
  -t #             Number of threads to use during IDT (default: 16, range: 1 - 60)
  -p               Pdisk to link to after transformation is complete
  -y               Skips user prompts
  -h               Displays this help
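Because the batch file accepts two line formats, a quick check before running the script can catch stray entries. The sketch below flags any line that matches neither format shown in the help output. The function name check_batch_file is hypothetical, and the pattern assumes that dsk# stands for dsk followed by digits and that device paths contain no whitespace; the script's own validation may be different.

```shell
#!/bin/bash
# check_batch_file <batch_file>  (hypothetical helper)
# Returns 0 if every line is either
#   /dev/disk/by-id/<name>
# or
#   /dev/pdisk/dsk<N> -> /dev/disk/by-id/<name>
check_batch_file() {
    local file=$1 bad
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 2; }
    # -v selects lines that do NOT match either format; -n adds line numbers.
    bad=$(grep -n -E -v \
        '^(/dev/pdisk/dsk[0-9]+ -> )?/dev/disk/by-id/[^[:space:]]+$' \
        "$file" || true)
    if [ -n "$bad" ]; then
        echo "unexpected lines in $file:" >&2
        echo "$bad" >&2
        return 1
    fi
    return 0
}
```

For example, check_batch_file /tmp/pdisks.list prints the offending line numbers and returns a non-zero status if the list contains blank lines or paths outside /dev/disk/by-id.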