Tokenization for Delimited Type Input File Using Migration.properties File

This sample tokenizes a delimited data file using the CT-V Bulk Utility.

Creating the Input Data File

Below is the data that will be used to populate the customerTable.csv file:

Setting Parameters for Migration.properties File

Below are the parameters set for a delimited-format input data file:

#####################
# Input Configuration
# Input.FilePath
# Input.Type
#####################
#
Input.FilePath = C:\\Desktop\\migration\\customerTable.csv
#
#
Input.Type = Delimited
###############################
# Delimited Input Configuration
# Input.EscapeCharacter
# Input.QuoteCharacter
# Input.ColumnDelimiter
###############################
#
Input.EscapeCharacter = \\
#
Input.QuoteCharacter = "
#
Input.ColumnDelimiter = ,
###########################################
# Tokenization Configuration
# Tokenizer.Column0.TokenVault
# Tokenizer.Column0.CustomDataColumnIndex
# Tokenizer.Column0.TokenFormat
# Tokenizer.Column0.LuhnCheck
# ...
# Tokenizer.ColumnN.TokenVault
# Tokenizer.ColumnN.CustomDataColumnIndex
# Tokenizer.ColumnN.TokenFormat
# Tokenizer.ColumnN.LuhnCheck
############################################
#
Tokenizer.Column5.TokenVault = BTM
Tokenizer.Column6.TokenVault = BTM_2
#
Tokenizer.Column5.CustomDataColumnIndex = -1
Tokenizer.Column6.CustomDataColumnIndex = -1
#
Tokenizer.Column5.TokenFormat = LAST_FOUR_TOKEN
Tokenizer.Column6.TokenFormat = EMAIL_ADDRESS_TOKEN
#
Tokenizer.Column5.LuhnCheck = true
Tokenizer.Column6.LuhnCheck = false
#
######################
# Output Configuration
# Output.FilePath
# Output.Sequence
######################
#
Output.FilePath = C:\\Desktop\\migration\\tokenized.csv
# Specifies the file path where the intermediate temporary chunks of
# outputs are stored.
#
# Note: If no intermediate file path is set, then the path specified in
# Output.FilePath is used as the intermediate file path.
#
Intermediate.FilePath = C:\\Desktop\\migration\\tokenized.csv
#
# Set a positive value for each column to be tokenized. For example, columns 5 and 6
# are positive below, so only these two columns are tokenized.
Output.Sequence = 0,-1,-2,-3,-4,5,6
# TokenSeparator
#
# Specifies if the tokens are space separated or not.
# Note: This parameter is ignored if Input.Type is set to Delimited.
#
# Valid values
# true
# false
#
# Note: The default value is set to true.
#
TokenSeparator = true
#
# StreamInputData
#
# Specifies whether the input data is streamed or not.
#
# Valid values
# true
# false
# Note: Default value is set to false.
# 
StreamInputData = false
# Note: Applies only to positional-type input files. If StreamInputData is set to true, the
# TokenSeparator parameter is ignored.
#
# CodePageUsed
#
# Specifies the code page in use.
# Used with the EBCDIC character set; for example, use "ibm500" for EBCDIC International.
# https://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
# 
CodePageUsed =
# Note: If no value is specified, the ASCII character set is used by default.
#
# FailureThreshold
#
# Specifies the number of errors after which the Bulk Utility aborts the
# tokenization operation.
# Valid values
# -1 = Tokenization continues irrespective of number of errors during the
# operation. This is the default value.
# 0 = Bulk utility aborts the operation on occurrence of any error.
# Any positive value = Indicates the failure threshold, after which the Bulk
# Utility aborts the operation.
#
# Note: If no value or a negative value is specified, Bulk Utility will continue
# irrespective of number of errors.
# 
FailureThreshold = -1
###############################################################################
# END
###############################################################################
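
The relationship between Output.Sequence and the Tokenizer.ColumnN.* entries above can be cross-checked before a run. The following is a minimal sketch, not part of the Bulk Utility: the helper names `parse_properties` and `check_tokenized_columns` are hypothetical, the parser handles only plain `key = value` lines (it does not implement full Java properties escaping), and it assumes, based on the comment in the listing, that a positive Output.Sequence entry marks a column to be tokenized.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_tokenized_columns(props):
    """Return (tokenized column indexes, list of missing Tokenizer settings).

    Assumption: a positive entry in Output.Sequence means the column is tokenized.
    """
    sequence = [int(v) for v in props["Output.Sequence"].split(",")]
    tokenized = [n for n in sequence if n > 0]
    problems = []
    for n in tokenized:
        for suffix in ("TokenVault", "TokenFormat"):
            key = f"Tokenizer.Column{n}.{suffix}"
            if not props.get(key):
                problems.append(f"missing {key}")
    return tokenized, problems


# Excerpt of the settings from the listing above.
SAMPLE = """\
Tokenizer.Column5.TokenVault = BTM
Tokenizer.Column6.TokenVault = BTM_2
Tokenizer.Column5.TokenFormat = LAST_FOUR_TOKEN
Tokenizer.Column6.TokenFormat = EMAIL_ADDRESS_TOKEN
Output.Sequence = 0,-1,-2,-3,-4,5,6
"""

tokenized, problems = check_tokenized_columns(parse_properties(SAMPLE))
print(tokenized, problems)  # [5, 6] []
```

A check like this catches a column that is marked positive in Output.Sequence but has no token vault or token format configured.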

Running CT-V Bulk Utility

Enter the following command to tokenize with CT-V Bulk Utility in a Windows environment:

java -cp SafeNetTokenService-8.12.3.000.jar com.safenet.token.migration.main migration.properties -t DSU=NAE_User1 DSP=qwerty12345 DBU=DB_User1 DBP=abcd1234
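
When the command is launched from a script, the credentials need not be typed inline. The following is a minimal sketch under stated assumptions: the function `build_command` and the environment variable names `CTV_DSU`, `CTV_DSP`, `CTV_DBU`, and `CTV_DBP` are hypothetical and are not defined by the Bulk Utility; only the jar name, class name, and argument layout come from the command above.

```python
import os


def build_command(jar, properties_file, env=None):
    """Assemble the tokenization command as an argument list.

    Credentials are read from hypothetical environment variables
    instead of being hard-coded in the script.
    """
    env = os.environ if env is None else env
    return [
        "java", "-cp", jar,
        "com.safenet.token.migration.main",
        properties_file,
        "-t",
        f"DSU={env['CTV_DSU']}",
        f"DSP={env['CTV_DSP']}",
        f"DBU={env['CTV_DBU']}",
        f"DBP={env['CTV_DBP']}",
    ]


# Demonstration with placeholder values; a real script would rely on os.environ.
demo_env = {
    "CTV_DSU": "NAE_User1",
    "CTV_DSP": "example-password",
    "CTV_DBU": "DB_User1",
    "CTV_DBP": "example-password",
}
command = build_command("SafeNetTokenService-8.12.3.000.jar", "migration.properties", demo_env)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory that contains the jar and the properties file.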

Reviewing the Output File

The output file, tokenized.csv, is saved at the path specified by Output.FilePath in the migration.properties file.

Here is the data from the output file:

Only columns 5 and 6 are tokenized, as specified by Output.Sequence in the migration.properties file above. To write the output columns in a different order from the input file, change the Output.Sequence parameter in the migration.properties file.

For example, if you set Output.Sequence = 0,-2,-1,-3,-4,5,6 in the migration.properties file, columns 1 and 2 swap positions in the output file, as shown below:
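
The effect of Output.Sequence on column order can also be illustrated with a small sketch. This is not the utility's actual output or implementation: the row values are placeholders, `mark` stands in for real tokenization, and the sketch assumes that the absolute value of each entry is the input column index, that the order of entries is the output order, and that a positive entry means the column is tokenized.

```python
def apply_output_sequence(row, sequence, tokenize):
    """Build one output row from an input row and an Output.Sequence list."""
    out = []
    for entry in sequence:
        value = row[abs(entry)]  # absolute value selects the input column
        out.append(tokenize(value) if entry > 0 else value)
    return out


def mark(value):
    """Placeholder for tokenization: only labels the value as tokenized."""
    return f"TOK({value})"


row = ["c0", "c1", "c2", "c3", "c4", "c5", "c6"]  # placeholder input row

print(apply_output_sequence(row, [0, -1, -2, -3, -4, 5, 6], mark))
# ['c0', 'c1', 'c2', 'c3', 'c4', 'TOK(c5)', 'TOK(c6)']

print(apply_output_sequence(row, [0, -2, -1, -3, -4, 5, 6], mark))
# ['c0', 'c2', 'c1', 'c3', 'c4', 'TOK(c5)', 'TOK(c6)']
```

In the second call, input columns 1 and 2 change places while columns 5 and 6 are still the only ones passed through the tokenization step.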
