Detokenization.properties File
The Detokenization.properties file defines all the parameters required to run the CT-V Bulk Utility to detokenize tokens in a File-to-File operation using a token vault. It also lets you configure the input and output files.
Below is a sample parameters file that can be used as a template:
###############################################################################
# CipherTrust Vaulted Tokenization Bulk Migration Configuration
#
# To run migration use the following command
#
# java com.safenet.token.migration.main config-file-path -d
#
# Note: This is a sample file and needs to be customized to your specific
# environment
###############################################################################
#####################
# Input Configuration
# Input.FilePath
# Input.Type
#####################
#
# Input.FilePath
#
# Full path to the input file
#
Input.FilePath = C:\\Desktop\\migration\\tokenized.csv
#
# Input.Type
#
# Format of the input file
#
# Valid values
# Delimited
# Positional
#
Input.Type = Delimited
###############################
# Delimited Input Configuration
# Input.EscapeCharacter
# Input.QuoteCharacter
# Input.ColumnDelimiter
###############################
#
# Input.EscapeCharacter
#
# Specifies a character that is used to 'escape' special characters that
# alter input processing
#
# Note: this parameter is ignored if Input.Type is set to Positional
#
Input.EscapeCharacter = \\
#
# Input.QuoteCharacter
#
# Specifies a character that is used around character sequences that contain
# delimiter characters and are to be treated as a single column value
#
# Note: this parameter is ignored if Input.Type is set to Positional
#
Input.QuoteCharacter = "
#
# Input.ColumnDelimiter
#
# Specifies a character that separates columns in the input file
#
# Note: this parameter is ignored if Input.Type is set to Positional
#
Input.ColumnDelimiter = ,
################################
# Positional Input Configuration
# Input.Column0.Start
# Input.Column0.End
# ...
# Input.ColumnN.Start
# Input.ColumnN.End
################################
#
# Input.ColumnN.Start
#
# Specifies zero-based position where the value starts. The character in the
# specified position is included in the column value. This value must be
# specified for every column in the input file which has to be processed
# or passed-through and included in the output file.
#
# Note: this parameter is ignored if Input.Type is set to Delimited
#
Input.Column0.Start =
#
# Input.ColumnN.End
#
# Specifies zero-based position where the value ends. The character in the
# specified position is included in the column value. This value must be
# specified for every column in the input file which has to be processed
# or passed-through and included in the output file.
#
# Note: this parameter is ignored if Input.Type is set to Delimited
#
Input.Column0.End =
###########################################
# DeTokenization Configuration
# DeTokenizer.Column0.TokenVault
# DeTokenizer.Column0.TokenFormat
# ...
# DeTokenizer.ColumnN.TokenVault
# DeTokenizer.ColumnN.TokenFormat
############################################
#
# DeTokenizer.ColumnN.TokenVault
#
# Specifies the name of the token vault wherein plain text corresponding to tokens
# of this column is stored. If the column does not need to be DeTokenized, do not
# specify this parameter. If this parameter is specified, all other DeTokenization
# parameters for the same column must also be specified. Token vault
# specified in this parameter must exist before running bulk migration.
#
DeTokenizer.Column5.TokenVault = BTM
DeTokenizer.Column6.TokenVault = BTM_2
#
# DeTokenizer.ColumnN.TokenFormat
#
# Specifies token format that will be used to DeTokenize this column. If the
# column does not need to be DeTokenized, do not specify this parameter. If
# this parameter is specified, all other tokenization parameters for the
# same column must also be specified.
#
# Valid values
# <number>
# 0 for plain Text
# 6 for Masked Token
#
DeTokenizer.Column5.TokenFormat = 0
DeTokenizer.Column6.TokenFormat = 0
######################
# Output Configuration
# Output.FilePath
# Output.Sequence
######################
#
# Output.FilePath
#
# Specifies full path to the output file
#
Output.FilePath = C:\\Desktop\\migration\\output.csv
#
# Intermediate.FilePath
#
# Specifies the file path where the intermediate temporary chunks of outputs
# are stored.
#
# Note: If no intermediate file path is set, then the path specified in
# Output.FilePath is used as the intermediate file path.
#
Intermediate.FilePath =
#
# Output.Sequence
#
# Specifies sequence of the input columns in which they are written to the
# output file. Each column in the input file that has to appear in the
# output file has to have its column index specified in the output sequence.
# For each column in the input file, the sequence number can be either positive
# or negative. Positive sequence number indicates that the decrypted and/or
# tokenized value is written to the output file, if the column was decrypted
# and/or tokenized. Negative sequence number indicates that the original value
# from the input file is written to the output file. For columns that are not
# decrypted and not tokenized (pass-through columns) specifying positive or
# negative number has the same effect.
# Column indexes are separated by , character.
#
Output.Sequence = 0,-1,-2,-3,-4,5,6
###############################
# Multi-threading Configuration
# Threads.BatchSize
# Threads.FormatterThreads
# Threads.DetokenThreads
# Threads.PollTimeout
###############################
#
# Threads.BatchSize
#
# Specifies number of rows per batch.
#
Threads.BatchSize = 10000
#
# Threads.FormatterThreads
#
# Specifies number of threads that will format columns from the input file
# as required.
#
Threads.FormatterThreads = 5
#
# Threads.DetokenThreads
#
# Specifies number of threads that will perform detokenization of columns
# as required.
#
Threads.DetokenThreads = 2
#
# Threads.PollTimeout
#
# Specifies the amount of time (in milliseconds) processing threads will
# wait for a batch on the data queue before timing out, checking for
# administrative commands on the management queue, and then checking for
# another batch on the data queue.
# Default value of this parameter is 100.
# Do not modify this parameter unless instructed by customer support.
#
Threads.PollTimeout = 100
#
# Logger.LogLevel
#
# Specifies the level of detail displayed
#
# Valid values
# Normal
# Verbose
#
Logger.LogLevel = Normal
#
# Obfuscate password
#
# Specifies if the provided passwords are obfuscated or not
#
# Valid values
# true
# false
#
PasswordObfuscation = false
#
# Credential obfuscation
#
# If true, the utility accepts KeyManager and database credentials obfuscated with the obfuscator utility
#
# Valid values
# true
# false
# Note: Default value is set to false.
#
CredentialObfuscation = false
#
# TokenSeparator
#
# Specifies if the output values are space separated or not.
#
# Note: This parameter is ignored if Input.Type is set to Delimited.
# Valid values
# true
# false
# Note: Default value is set to true.
#
TokenSeparator = true
#
# StreamInputData
#
# Specifies if the input data is streamed or not.
#
# Valid values
# true
# false
# Note: Default value is set to false.
#
StreamInputData = false
# Note: If StreamInputData is set to true, the TokenSeparator parameter is ignored.
#
# CodePageUsed
#
# Specifies the code page in use.
# Used with the EBCDIC character set. For example, use "ibm500" for EBCDIC International.
# See https://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
#
CodePageUsed =
# Note: If no value is specified, the ASCII character set is used by default.
#
# FailureThreshold
#
# Specifies the number of errors after which the Bulk Utility aborts the
# detokenization operation.
# Valid values
# -1 = Detokenization continues irrespective of number of errors during the
# operation. This is the default value.
# 0 = Bulk utility aborts the operation on occurrence of any error.
# Any positive value = Indicates the failure threshold, after which the Bulk
# Utility aborts the operation.
#
# Note: If no value or a negative value is specified, Bulk Utility will continue
# irrespective of number of errors.
#
FailureThreshold = -1
###############################################################################
# END
###############################################################################
Properties Settings
The Input.Type parameter refers to the way in which columns of input data are separated and identified for processing. Input.Type must be set to either Delimited or Positional. The individual pieces of data in the input file can be separated by delimiters (for example, commas) or laid out in a positional format. Processing does not tolerate formatting mistakes; there is no recovery or line-exception-reject processing.
Note
For streamed data, Input.Type must be set to Positional, and each new row starts after the position specified in the Input.ColumnN.End parameter.
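The positional rules above can be illustrated with a short Python sketch. It is not the Bulk Utility's implementation: the function names, the sample layout, and the assumption that the record length equals the largest End position plus one are all hypothetical and only mirror the zero-based, inclusive Start/End semantics described in the sample file.

```python
def slice_positional(line, columns):
    """Extract column values from a fixed-width line.

    `columns` is a list of (start, end) pairs that mirror
    Input.ColumnN.Start / Input.ColumnN.End: zero-based, and the
    character at each position is included in the value.
    """
    return [line[start:end + 1] for start, end in columns]


def split_stream(data, columns):
    """Split streamed data into rows (assumption: a new row begins
    right after the largest End position in the layout)."""
    record_len = max(end for _, end in columns) + 1
    return [data[i:i + record_len] for i in range(0, len(data), record_len)]


# Hypothetical layout: a 4-character ID followed by a 6-character token.
layout = [(0, 3), (4, 9)]
record = "AB12" + "TOK001"

print(slice_positional(record, layout))   # ['AB12', 'TOK001']
print(split_stream(record + "CD34TOK002", layout))
```

Because End is inclusive, the Python slice uses `end + 1`; an off-by-one here would silently truncate every value.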
If you use a delimiter-separated-value file, then you can use escape characters to handle instances of special characters that would otherwise impact processing. The utility enables you to uniquely identify each column of data so that, if the column needs decryption, you can specify the key name, decryption algorithm (AES/CBC/PKCS5Padding), and decryption encoding (Base16, Base64).
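To show how the delimiter, quote, and escape characters interact, the following sketch parses one delimited row with Python's standard `csv` module, using the values from the sample file (`,`, `"`, and `\`). This is an illustration only; the Bulk Utility's own parser may treat edge cases differently, and `parse_delimited` and the sample row are hypothetical.

```python
import csv
import io


def parse_delimited(text, delimiter=",", quotechar='"', escapechar="\\"):
    """Parse delimited text into rows of column values.

    The defaults mirror Input.ColumnDelimiter, Input.QuoteCharacter and
    Input.EscapeCharacter from the sample properties file.
    """
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        quotechar=quotechar,
        escapechar=escapechar,
    )
    return [row for row in reader]


# The second column contains a quoted delimiter, the third an escaped one.
sample = '1001,"Doe, Jane",tok\\,123\n'
for row in parse_delimited(sample):
    print(row)   # ['1001', 'Doe, Jane', 'tok,123']
```

Both the quoted comma and the escaped comma stay inside their column values, so the row still has three columns.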
The utility similarly enables you to specify, for each column:
TokenVault - The name of the token vault that will be used for tokenization or detokenization.
CustomDataColumnIndex - An index number assigned to a column in the input file. This index number has to appear in the output file, in the output sequence, if the column is to be used for output. The Output.Sequence makes use of the CustomDataColumnIndex.
TokenFormat - The token formats available for use are listed in the migration.properties and detokenization.properties files for tokenization and detokenization respectively.
LuhnCheck - This is the value that is passed to the TokenService insert method. Depending on the format of the value (e.g., FIRST_SIX_LAST_FOUR_FAIL_LUHN_TOKEN or EMAIL_ADDRESS_TOKEN), you can use this to define whether the token should pass the Luhn check, fail the Luhn check, or if it doesn't matter. When the Luhn check matters, this input value (true or false) determines whether the Luhn check logic specified by the format is enforced or ignored.
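The way Output.Sequence selects between processed and original values can be sketched as follows. This is a minimal illustration of the rule stated in the sample file (a positive index writes the processed value, a negative index writes the original input value); the function name and the sample row data are hypothetical, and the sign is read from the text so that an entry such as `-0` would still count as negative.

```python
def apply_output_sequence(sequence, original, processed):
    """Build one output row from an Output.Sequence string.

    `original` holds the input column values and `processed` holds the
    same columns after detokenization (pass-through columns are equal
    in both lists).
    """
    output = []
    for entry in sequence.split(","):
        entry = entry.strip()
        use_original = entry.startswith("-")
        index = int(entry.lstrip("-"))
        source = original if use_original else processed
        output.append(source[index])
    return output


# Hypothetical row: columns 5 and 6 hold tokens that were detokenized.
original = ["id1", "a", "b", "c", "d", "tok5", "tok6"]
processed = ["id1", "a", "b", "c", "d", "plain5", "plain6"]

print(apply_output_sequence("0,-1,-2,-3,-4,5,6", original, processed))
# ['id1', 'a', 'b', 'c', 'd', 'plain5', 'plain6']
```

Changing the last two entries to `-5,-6` would write the tokens themselves instead of the detokenized values.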
The following Multi-threading Configuration values can be adjusted to achieve performance improvements:
Threads.BatchSize - Use this to set the number of rows in your batch. This controls how the input file is split into batches. Depending on your system, you may need to adjust this value to achieve optimal performance. Recommended value is 10,000.
Threads.CryptoThreads - Use this to set the number of threads that will perform decryption of columns. This can be adjusted to change the ratio of Crypto to Tokenization threads, and to decrease the processing time per batch, to achieve optimal performance.
Threads.TokenThreads - Use this to set the number of threads that will perform tokenization of columns. As with the crypto thread number, this can be adjusted for performance optimization.
Threads.PollTimeout - The amount of time (in milliseconds) processing threads will wait for a batch on the data queue before timing out, looking for more work in the queue, and checking for another batch. Leave this set at the default of 100 milliseconds. Do not change this parameter unless instructed to do so by customer support.
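The interplay of batch size, worker threads, and poll timeout can be pictured with the generic producer/consumer sketch below. It is not the Bulk Utility's code: the function names are hypothetical, upper-casing a row is a placeholder for the real per-row work, and a simple stop flag stands in for the management queue mentioned in the sample file.

```python
import queue
import threading


def make_batches(rows, batch_size):
    """Split rows into consecutive batches of at most batch_size rows."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def worker(data_queue, results, stop_event, poll_timeout_ms):
    """Poll the data queue; on timeout, re-check the stop flag and retry."""
    while not stop_event.is_set():
        try:
            batch = data_queue.get(timeout=poll_timeout_ms / 1000.0)
        except queue.Empty:
            continue
        results.append([row.upper() for row in batch])  # placeholder work
        data_queue.task_done()


def run(rows, batch_size=3, threads=2, poll_timeout_ms=100):
    data_queue = queue.Queue()
    results = []
    stop_event = threading.Event()
    workers = [
        threading.Thread(
            target=worker,
            args=(data_queue, results, stop_event, poll_timeout_ms),
        )
        for _ in range(threads)
    ]
    for thread in workers:
        thread.start()
    for batch in make_batches(rows, batch_size):
        data_queue.put(batch)
    data_queue.join()      # wait until every batch has been processed
    stop_event.set()       # workers notice this after their next timeout
    for thread in workers:
        thread.join()
    return results


print(make_batches(list(range(7)), 3))   # [[0, 1, 2], [3, 4, 5], [6]]
print(len(run(list("abcdefg"))))         # 3
```

In this sketch a shorter poll timeout makes idle workers react to the stop flag sooner at the cost of more wake-ups. The order in which batches finish is not deterministic when more than one thread is used.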
Predefined Token Formats
The following predefined token formats are available for use in the Bulk Utility:
RANDOM_TOKEN
RANDOM_ALPHANUMERIC_TOKEN
SEQUENTIAL_TOKEN (can be used with sequential vault only)
LAST_FOUR_TOKEN
FIRST_SIX_TOKEN
FIRST_TWO_LAST_FOUR_TOKEN
FIRST_SIX_LAST_FOUR_TOKEN
FIXED_NINETEEN_TOKEN
FIXED_TWENTY_LAST_FOUR_TOKEN
EMAIL_ADDRESS_TOKEN
DATE_MMDDYYYY_TOKEN
DATE_DDMMYYYY_TOKEN
DATE_YYYYMMDD_TOKEN
FIRST_SIX_LAST_FOUR_FAIL_LUHN_TOKEN
FIXED_FIRST_TWO_LAST_FOUR_FAIL_LUHN_TOKEN
SHA2_256_BASE16_TOKEN
SHA2_384_BASE16_TOKEN
SHA2_512_BASE16_TOKEN
SHA2_256_BASE64_TOKEN
SHA2_384_BASE64_TOKEN
SHA2_512_BASE64_TOKEN
ALPHANUMERIC_TOKEN
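Several of the formats above, such as FIRST_SIX_LAST_FOUR_FAIL_LUHN_TOKEN, refer to the Luhn check. For reference, here is a minimal sketch of the standard Luhn checksum in Python. It shows the general algorithm only, not how the product generates or validates tokens, and `luhn_valid` is a hypothetical helper name.

```python
def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return bool(digits) and total % 10 == 0


print(luhn_valid("79927398713"))   # True
print(luhn_valid("79927398714"))   # False
```

A format that must fail the Luhn check would produce values for which this function returns False.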