Download and Compile Code Using Github
Clone the repository to your local machine.
git clone https://github.com/ThalesGroup/CipherTrust_Application_Protection.git
The database directory has all the code for Databricks.
Assuming that your CipherTrust Manager is already configured, each of the classes will have a main method which can be used to test locally. Make sure your CipherTrust Manager environment is configured correctly.
Modify the
udfConfig.properties
file values such asCMUSERID
,CMPWD
, and other necessary settings. For more information onudfConfig.properties
, see Configuration File and Environment Variables.
If you have already installed CM then you need to update the CADP_for_JAVA.properties
file with all the required settings like IP or NAE Port. The file is located in the resource directory in the java project for eclipse.
Generate the Jar File to Upload to the Databricks
To compile and generate the target jar file to be uploaded to the Databricks compute platform select the project and choose Run As > maven install to generate the target.
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ Thales-Databricks-UDF ---
[INFO] Installing C:\Users\t0185905\workspace\Thales-Databricks-UDF\target\Thales-Databricks-UDF-6.0-SNAPSHOT.jar to C:\Users\t0185905\.m2\repository\Thales\Thales-Databricks-UDF\6.0-SNAPSHOT\Thales-Databricks-UDF-6.0-SNAPSHOT.jar
[INFO] Installing C:\Users\t0185905\workspace\Thales-Databricks-UDF\pom.xml to C:\Users\t0185905\.m2\repository\Thales\Thales-Databricks-UDF\6.0-SNAPSHOT\Thales-Databricks-UDF-6.0-SNAPSHOT.pom
[INFO] Installing C:\Users\t0185905\workspace\Thales-Databricks-UDF\target\Thales-Databricks-UDF-6.0-SNAPSHOT-jar-with-dependencies.jar to C:\Users\t0185905\.m2\repository\Thales\Thales-Databricks-UDF\6.0-SNAPSHOT\Thales-Databricks-UDF-6.0-SNAPSHOT-jar-with-dependencies.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.482 s
[INFO] Finished at: 2024-10-17T10:28:59-04:00
Configuration File and Environment Variables
The following sample file uses the udfConfig.properties
file in the java project resource directory that must be copied to a file location in databricks.
Sample udfConfig.property
file:
# udfConfig.properties
CMUSER=apiuser
CMPWD=yorupwde!
returnciphertextforuserwithnokeyaccess=yes
CADPIP=yourcadpip
keymetadatalocation=external
keymetadata=1001000
protection_profile=alpha-external
protection_profile_alpha_ext=alpha-external
protection_profile_alpha_int=plain-alpha-internal
protection_profile_nbr_int=plain-nbr-internal
protection_profile_nbr_ext=plain-nbr-ext
showrevealinternalkey=yes
BATCHSIZE=20
CADPUSER=admin
Listed below are the explanations for the configuration file variables required for the Cloud Function.
Key | Value | Description |
---|---|---|
BATCHSIZE | 200 | Nbr of rows to chunk when using batch mode |
CMPWD | Yourpwd! | CM PWD if using CADP |
CMUSER | apiuser | CM USERID if using CADP |
datatype | charint | datatype of column (char or charint) |
mode | encrypt | mode of operation (protect,reveal,protectbulk,revealbulk) CADP |
The UDF_CONFIG_VOLUME_PATH
Cluster Environment Variable is required for the udfConfig.properies
file. It is set in the Advanced Options of the cluster. Ensure to locate it on a Cluster Volume.