Configuring Hadoop to Use CTE
- On the HDFS node, type:
/opt/vormetric/DataSecurityExpert/agent/secfs/hadoop/bin/config-hadoop.sh -i
The script prompts you for the following information:
- Hadoop product name (e.g. hdp)
- Hadoop product version (e.g. 2.6.0.2.2.0.0-2041)
- Path to JAVA_HOME used by Hadoop (e.g. /usr/jdk64/jdk1.8.0_40)
Note
Alternatively, you can use the automated installation option:
/opt/vormetric/DataSecurityExpert/agent/secfs/hadoop/bin/config-hadoop.sh -i -p hdp -v 2.6.0.2.2.0.0-2041 -j /usr/jdk64/jdk1.8.0_40
- Verify the configuration using the -s option:
/opt/vormetric/DataSecurityExpert/agent/secfs/hadoop/bin/config-hadoop.sh -s
The output should look similar to the following:
Vormetric-Hadoop Configuration Status
PRODUCT_NAME=hdp
PRODUCT_VERSION=3.0.0.0-2557
HADOOP_HOME=/usr/hdp/current/hadoop-client/sbin/../
HADOOP_VERSION=2.7.1
HADOOP_PRODUCT_VERSION=2.7.1.2.3.0.0-2557
HADOOP_VERSION_MAJOR=2.7
LIBVORHDFS_SO=/usr/hdp/current/hadoop-client/sbin/..//lib/native/libvorhdfs.so
LIBHDFS_SO=/etc/vormetric/hadoop/lib/libhdfs.so
VORMETRIC_HADOOP=/etc/vormetric/hadoop
#-----------vormetric--------------------------------------------
export HADOOP_NAMENODE_OPTS="-javaagent:/etc/vormetric/hadoop/jar/vormetric-hdfs-agent.jar=voragent ${HADOOP_NAMENODE_OPTS}"
export HADOOP_DATANODE_OPTS="-javaagent:/etc/vormetric/hadoop/jar/vormetric-hdfs-agent.jar=voragent ${HADOOP_DATANODE_OPTS}"
#--------------vormetric----------------------------------------
/etc/vormetric/hadoop/lib/libhdfs.so ...ok
/usr/hdp/current/hadoop-client/sbin/..//lib/native/libvorhdfs.so ...ok
/etc/vormetric/hadoop/gen-vor-hadoop-env.sh ...ok
/etc/vormetric/hadoop/vor-hadoop.env ...ok
Looks ok.
Verify secfsd is Running with Hadoop Environment
Use a text editor to view the /etc/init/secfsd-upstart.conf file. The file should contain env entries. To view them, type:
cat /etc/init/secfsd-upstart.conf
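Rather than reading the whole file, you can test for the env entries directly. A minimal sketch; the `^env` pattern assumes upstart-style `env KEY=value` lines, and the helper name is illustrative:

```shell
#!/bin/sh
# Sketch: report whether an upstart job file contains "env" entries.
# The path below is the secfsd upstart job described in this section.
has_env_entries() {
    # Succeeds if the file exists and has at least one line starting with "env".
    [ -f "$1" ] && grep -q '^env' "$1"
}

if has_env_entries /etc/init/secfsd-upstart.conf; then
    echo "secfsd upstart job contains env entries"
else
    echo "env entries not found (file missing or not configured)"
fi
```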
HDFS Name Cache
Obtaining the HDFS file name from the NameNode is network intensive, so the map from HDFS block file name to HDFS file name is cached in a hash table. The following secfs configuration variables are used to support the hash cache. They are provided in case you need to tune the memory management of the name cache for better performance.
- hdfs_cache_entry: Default is 1,024,000 entries, which can cover up to 125TB of HDFS data because the default HDFS block size is up to 128MB (128MB * 1,024,000 = 125TB).
- hdfs_cache_bucket: Default is 10,240.
- hdfs_cache_timeout: Default is 30 minutes.
- hdfs_cache_interval: The interval at which a worker thread wakes up to update cache entries whose timeout has expired. Default is 10 seconds.
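The 125TB sizing claim can be reproduced with shell arithmetic. A minimal sketch using the defaults stated above (1,024,000 entries, 128MB block size), assuming each cache entry maps one full-size HDFS block:

```shell
#!/bin/sh
# Sanity-check the documented name-cache sizing.
HDFS_CACHE_ENTRY=1024000   # default number of cache entries
BLOCK_SIZE_MB=128          # default HDFS block size in MB
COVERED_MB=$((HDFS_CACHE_ENTRY * BLOCK_SIZE_MB))
COVERED_TB=$((COVERED_MB / 1024 / 1024))
echo "hdfs_cache_entry=${HDFS_CACHE_ENTRY} covers up to ${COVERED_TB}TB of HDFS data"
```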
On Linux, you can configure each secfs configuration variable using the voradmin command. For example, to set hdfs_cache_interval to 30 seconds, type:
voradmin secfs config hdfs_cache_interval 30
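If you need to tune several of these variables at once, a short loop can generate the voradmin calls. A sketch, assuming the `voradmin secfs config <variable> <value>` form shown above; the values are the documented defaults, not tuning recommendations, and the loop only echoes each command so you can review it before applying (remove the echo to run them):

```shell
#!/bin/sh
# Sketch: generate voradmin calls for the HDFS name-cache variables.
for setting in \
    "hdfs_cache_entry 1024000" \
    "hdfs_cache_bucket 10240" \
    "hdfs_cache_timeout 30" \
    "hdfs_cache_interval 10"
do
    # Echo only; drop the quotes around echo's argument and the echo
    # itself to actually apply each setting.
    echo "voradmin secfs config $setting"
done
```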
Note
Because HDFS rename is a metadata-only operation inside the HDFS NameNode and does not call into the local file system, the hash cache might contain stale data after a rename. The HDFS NameNode is coded to prevent renamed data from crossing a key boundary, which prevents data corruption. However, other access checks based on the HDFS file name may return incorrect information if the cached name is stale. Understand this risk before using this feature.
Enabling CTE on HDFS
To enable CTE on your HDFS:
- Restart the CTE agent on the node.
For Red Hat 6.x, type:
/etc/init.d/secfs restart
For Red Hat 7.x or 8.x, type:
/etc/vormetric/secfs restart
- Restart the Hadoop services in the cluster.
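On mixed clusters, the version-dependent restart choice can be scripted. A minimal sketch, assuming the release string format typically found in /etc/redhat-release; the two restart paths are the ones listed in the steps above, and the function only prints the command it would use:

```shell
#!/bin/sh
# Sketch: pick the secfs restart command from a Red Hat release string.
restart_cmd() {
    case "$1" in
        *"release 6."*)                 echo "/etc/init.d/secfs restart" ;;
        *"release 7."*|*"release 8."*)  echo "/etc/vormetric/secfs restart" ;;
        *) echo "unrecognized release: $1" >&2; return 1 ;;
    esac
}

# Typically you would pass "$(cat /etc/redhat-release)"; a sample string
# is used here for illustration.
restart_cmd "Red Hat Enterprise Linux Server release 7.9 (Maipo)"
```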
You can now create GuardPoints to protect your entire HDFS.