Tokenization for Delimited Type Input File Using Migration.properties File
This sample tokenizes a delimited data file using the CT-V Bulk Utility.
Creating the Input Data File
Below is the data that will be used to populate the customerTable.csv file:
Setting Parameters for Migration.properties File
Below are the parameters set for a delimited-format input data file:
#####################
# Input Configuration
# Input.FilePath
# Input.Type
#####################
#
Input.FilePath = C:\\Desktop\\migration\\customerTable.csv
#
#
Input.Type = Delimited
###############################
# Delimited Input Configuration
# Input.EscapeCharacter
# Input.QuoteCharacter
# Input.ColumnDelimiter
###############################
#
Input.EscapeCharacter = \\
#
Input.QuoteCharacter = "
#
Input.ColumnDelimiter = ,
###########################################
# Tokenization Configuration
# Tokenizer.Column0.TokenVault
# Tokenizer.Column0.CustomDataColumnIndex
# Tokenizer.Column0.TokenFormat
# Tokenizer.Column0.LuhnCheck
# ...
# Tokenizer.ColumnN.TokenVault
# Tokenizer.ColumnN.CustomDataColumnIndex
# Tokenizer.ColumnN.TokenFormat
# Tokenizer.ColumnN.LuhnCheck
############################################
#
Tokenizer.Column5.TokenVault = BTM
Tokenizer.Column6.TokenVault = BTM_2
#
Tokenizer.Column5.CustomDataColumnIndex = -1
Tokenizer.Column6.CustomDataColumnIndex = -1
#
Tokenizer.Column5.TokenFormat = LAST_FOUR_TOKEN
Tokenizer.Column6.TokenFormat = EMAIL_ADDRESS_TOKEN
#
Tokenizer.Column5.LuhnCheck = true
Tokenizer.Column6.LuhnCheck = false
#
######################
# Output Configuration
# Output.FilePath
# Output.Sequence
######################
#
Output.FilePath = C:\\Desktop\\migration\\tokenized.csv
# Specifies the file path where the intermediate temporary chunks of
# outputs are stored.
#
# Note: If no intermediate file path is set, then the path specified in
# Output.FilePath is used as the intermediate file path.
#
Intermediate.FilePath = C:\\Desktop\\migration\\tokenized.csv
#
# Set a positive value for each column to be tokenized. For example, columns 5 and 6
# are positive below, so only these two columns are tokenized.
Output.Sequence = 0,-1,-2,-3,-4,5,6
# TokenSeparator
#
# Specifies if the tokens are space separated or not.
# Note: This parameter is ignored if Input.Type is set to Delimited.
#
# Valid values
# true
# false
#
# Note: The default value is set to true.
#
TokenSeparator = true
#
# StreamInputData
#
# Specifies whether the input data is streamed or not.
#
# Valid values
# true
# false
# Note: Default value is set to false.
#
StreamInputData = false
Note: StreamInputData applies only to positional-type input files. If StreamInputData is set to true, the
TokenSeparator parameter is ignored.
#
# CodePageUsed
#
# Specifies the code page in use.
# Used with EBCDIC character sets; for example, use "ibm500" for EBCDIC International.
# https://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
#
CodePageUsed =
Note: If no value is specified, the ASCII character set is used by default.
#
# FailureThreshold
#
# Specifies the number of errors after which the Bulk Utility aborts the
# tokenization operation.
# Valid values
# -1 = Tokenization continues irrespective of number of errors during the
# operation. This is the default value.
# 0 = Bulk Utility aborts the operation on occurrence of any error.
# Any positive value = Indicates the failure threshold, after which the Bulk
# Utility aborts the operation.
#
# Note: If no value or a negative value is specified, Bulk Utility will continue
# irrespective of number of errors.
#
FailureThreshold = -1
###############################################################################
# END
###############################################################################
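To see how the three delimited-input settings work together, the following Python sketch reads them from a properties fragment and uses them to split one record. It is an illustration only. `load_properties` and `split_record` are hypothetical helpers, the reader assumes Java-style properties escaping (a doubled backslash stands for one literal backslash), the sample record is invented, and Python's `csv` module may not handle quoting and escaping exactly as the Bulk Utility does.

```python
import csv
import io

# Fragment of the delimited input configuration shown above.
PROPERTIES_TEXT = r'''
# Delimited Input Configuration
Input.Type = Delimited
Input.EscapeCharacter = \\
Input.QuoteCharacter = "
Input.ColumnDelimiter = ,
'''


def load_properties(text):
    """Minimal reader for 'key = value' lines; lines starting with '#' are comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        # Assumption: a doubled backslash in the file means one literal backslash.
        props[key.strip()] = value.strip().replace("\\\\", "\\")
    return props


def split_record(record, props):
    """Split one delimited record using the configured characters."""
    reader = csv.reader(
        io.StringIO(record),
        delimiter=props["Input.ColumnDelimiter"],
        quotechar=props["Input.QuoteCharacter"],
        escapechar=props["Input.EscapeCharacter"],
    )
    return next(reader)


props = load_properties(PROPERTIES_TEXT)

# Invented record: the quoted field contains the column delimiter.
record = '1,"Doe, Jane",jane@example.com'
fields = split_record(record, props)
print(fields)  # ['1', 'Doe, Jane', 'jane@example.com']
```

The quote character keeps the embedded comma inside one column, which is why the record yields three fields rather than four.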
Running CT-V Bulk Utility
Enter the following command to tokenize with CT-V Bulk Utility in a Windows environment:
java -cp SafeNetTokenService-8.12.4.000.jar com.safenet.token.migration.main migration.properties -t DSU=NAE_User1 DSP=qwerty12345 DBU=DB_User1 DBP=abcd1234
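When the utility is launched from a script, it can help to build the arguments as a list instead of typing one long string, which sidesteps shell quoting problems and pasted typographic dashes. The Python sketch below only assembles and prints the command shown above. `build_command` is a hypothetical helper, and the jar name, main class, and DSU/DSP/DBU/DBP values are copied from this sample, so substitute your own.

```python
def build_command(jar, properties_file, dsu, dsp, dbu, dbp):
    """Return the Bulk Utility invocation from this sample as an argument list."""
    return [
        "java", "-cp", jar,
        "com.safenet.token.migration.main",
        properties_file,
        "-t",  # plain ASCII hyphen
        f"DSU={dsu}", f"DSP={dsp}", f"DBU={dbu}", f"DBP={dbp}",
    ]


command = build_command(
    "SafeNetTokenService-8.12.4.000.jar", "migration.properties",
    "NAE_User1", "qwerty12345", "DB_User1", "abcd1234",
)

# Joined with spaces for display only; to run it, pass the list itself
# to subprocess.run(command, check=True).
command_line = " ".join(command)
print(command_line)
```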
Reviewing the Output File
The output data file is saved at the path specified in the migration.properties file, under the name tokenized.csv.
Here is the data from the output file:
Only columns 5 and 6 have been tokenized, as specified by the Output.Sequence parameter in the migration.properties file above. To produce the output columns in a different order from the input file, change the Output.Sequence parameter in the migration.properties file.
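The effect of Output.Sequence can be sketched in a few lines of Python. This is one reading of the sample, not the utility's implementation. It assumes that the absolute value of each entry is an input column index, that the order of the entries is the order of the output columns, and that only strictly positive entries are tokenized (column 0 is passed through in this sample). `parse_sequence`, `apply_sequence`, the `fake_token` stand-in, and the sample row are all hypothetical.

```python
def parse_sequence(value):
    """Turn '0,-1,5' into a list of (column_index, tokenize) pairs."""
    pairs = []
    for part in value.split(","):
        number = int(part.strip())
        # Assumption: the sign marks tokenization, the magnitude is the column index.
        pairs.append((abs(number), number > 0))
    return pairs


def apply_sequence(row, sequence, tokenize):
    """Reorder one row and tokenize the columns marked with positive entries."""
    return [
        tokenize(row[index]) if flag else row[index]
        for index, flag in parse_sequence(sequence)
    ]


def fake_token(value):
    # Stand-in for real tokenization; it only marks the value.
    return f"TOK({value})"


row = ["c0", "c1", "c2", "c3", "c4", "c5", "c6"]

# Sequence from the sample: input order kept, columns 5 and 6 tokenized.
original_order = apply_sequence(row, "0,-1,-2,-3,-4,5,6", fake_token)
print(original_order)

# Columns 1 and 2 swapped in the output.
swapped_order = apply_sequence(row, "0,-2,-1,-3,-4,5,6", fake_token)
print(swapped_order)
```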
For example, if you change the output sequence in the migration.properties file to Output.Sequence = 0,-2,-1,-3,-4,5,6, the output file will be as shown below: