Bulk Migration Parameters
This article describes the input, output, tokenization, and detokenization parameters of the migration.properties and detokenization.properties files.
Input configuration
The Input Configuration section defines the input data file and the columns of that file that need to be tokenized (migration.properties file) or detokenized (detokenization.properties file). For example, the name and path of the input data file are passed to the bulk utility through the Input.FilePath and Input.Type parameters in the Input Configuration section of the migration.properties/detokenization.properties file.
The following is a sample of the Input Configuration section of the migration.properties and detokenization.properties files:
#####################
# Input Configuration
# Input.FilePath
# Input.Type
#####################
#
# Input.FilePath
#
# Full path to the input file
#
Input.FilePath = C://Desktop//migration//customerTable.csv
#
# Input.Type
#
# Format of the input file
#
# Valid values
# Delimited
# Positional
#
Input.Type = Delimited
###############################
# Delimited Input Configuration
# Input.EscapeCharacter
# Input.QuoteCharacter
# Input.ColumnDelimiter
###############################
# Input.EscapeCharacter
#
# Specifies a character that is used to 'escape' special characters that
# alter input processing
#
# Note: this parameter is ignored if Input.Type is set to Positional
#
Input.EscapeCharacter = \\
#
# Input.QuoteCharacter
#
# Specifies a character that is used around character sequences that contain
# delimiter characters and are to be treated as a single column value
#
# Note: this parameter is ignored if Input.Type is set to Positional
#
Input.QuoteCharacter = "
#
# Input.ColumnDelimiter
#
# Specifies a character that separates columns in the input file
#
# Note: this parameter is ignored if Input.Type is set to Positional
#
Input.ColumnDelimiter = ,
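To see how the three delimited-input parameters interact, the following sketch splits one hypothetical input record using Python's csv module with the same delimiter, quote, and escape settings as the sample above. This only models the idea; the utility's actual parser is internal to the product.

```python
import csv
import io

# One hypothetical delimited record: the quoted second column contains
# the delimiter character, so Input.QuoteCharacter keeps it intact.
line = '1001,"Doe, John",4111111111111111\n'

reader = csv.reader(
    io.StringIO(line),
    delimiter=',',    # Input.ColumnDelimiter
    quotechar='"',    # Input.QuoteCharacter
    escapechar='\\',  # Input.EscapeCharacter
)
columns = next(reader)
print(columns)  # ['1001', 'Doe, John', '4111111111111111']
```

Note how "Doe, John" survives as a single column value because it is quoted, even though it contains the column delimiter.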
################################
# Positional Input Configuration
# Input.Column0.Start
# Input.Column0.End
# ...
# Input.ColumnN.Start
# Input.ColumnN.End
################################
#
# Input.ColumnN.Start
#
# Specifies zero-based position where the value starts. The character in the
# specified position is included in the column value. This value must be
# specified for every column in the input file which has to be processed
# or passed-through and included in the output file.
#
# Note: this parameter is ignored if Input.Type is set to Delimited
#
Input.Column0.Start =
#
# Input.ColumnN.End
#
# Specifies zero-based position where the value ends. The character in the
# specified position is included in the column value. This value must be
# specified for every column in the input file which has to be processed
# or passed-through and included in the output file.
#
# Note: this parameter is ignored if Input.Type is set to Delimited
#
Input.Column0.End =
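Because both Start and End are zero-based and inclusive, a column of width W spans positions Start through Start+W-1. The sketch below illustrates this with hypothetical positions (0-3 and 5-9) on a made-up fixed-width record; note the +1 needed to make a Python slice include the End position.

```python
# Hypothetical fixed-width record with two columns.
record = "1001 ALICE"

# Example values for Input.Column0.Start/End and Input.Column1.Start/End.
col0_start, col0_end = 0, 3
col1_start, col1_end = 5, 9

# Start and End are both inclusive, so the slice runs to End + 1.
col0 = record[col0_start:col0_end + 1]
col1 = record[col1_start:col1_end + 1]
print(col0, col1)  # 1001 ALICE
```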
Tokenization configurations
The TokenSpec Configuration and AlgoSpec Configuration sections of the migration.properties file contain all the information required to tokenize the input data, for example, the token format used to tokenize a column and whether the generated token will pass or fail the Luhn check.
The following sample shows the configuration of AlgoSpec and TokenSpec:
###########################################
# AlgoSpec Configuration
# Version
# Unicode
############################################
#
# AlgoSpec.Version
# Specifies the version for AlgoSpec
#
# Valid values
# 0
# 1
# Note: 0 is used for backward compatibility with SafeNet Vaultless
# Tokenization version 8.5.0. Use the default value 1 for higher
# security of token generation.
AlgoSpec.Version = 1
# AlgoSpec.Unicode
# Specifies the Unicode Block to be used for bulk tokenization
# Note: Refer to Tokenization and Detokenization of Unicode Blocks using AlgoSpec.
AlgoSpec.Unicode =
###########################################
# TokenSpec Configuration
# Tokenizer.Column0.GroupId
# Tokenizer.Column0.TokenFormat
# Tokenizer.Column0.ClearTextSensitive
# Tokenizer.Column0.LuhnCheck
# Tokenizer.Column0.NonIdempotentTokens
# ...
# Tokenizer.ColumnN.GroupId
# Tokenizer.ColumnN.TokenFormat
# Tokenizer.ColumnN.ClearTextSensitive
# Tokenizer.ColumnN.LuhnCheck
# Tokenizer.ColumnN.NonIdempotentTokens
############################################
#
# Tokenizer.ColumnN.GroupId
#
# Specifies the GroupId for this column. Different tokens will be generated for
# the same plaintext having different group IDs.
# Default value : 0
Tokenizer.Column0.GroupId = 0
#
# Tokenizer.ColumnN.TokenFormat
#
# Specifies token format that will be used to tokenize this column.
# If not specified, then no tokenization will be performed for this column.
# Note: Refer to “TokenSpec” on page 28 for the applicable token formats.
#
Tokenizer.Column0.TokenFormat = TOKEN_ALL
#
# Tokenizer.ColumnN.ClearTextSensitive
#
# Specifies whether the token is clearTextSensitive or not.
# It is valid only if token format is other than TOKEN_ALL.
# Default value : false
# Recommended value : true
#
Tokenizer.Column0.ClearTextSensitive = true
#
# Tokenizer.ColumnN.LuhnCheck
#
# Specifies whether the generated token will pass or fail the Luhn check.
# If no value is provided, the Luhn check for the token is ignored.
# If a value is provided, make sure the input is Luhn compliant.
#
# Valid values
# true
# false
#
Tokenizer.Column0.LuhnCheck =
#
# Tokenizer.ColumnN.NonIdempotentTokens
#
# Specifies whether non-idempotent tokens are to be generated.
# It is applicable only when LuhnCheck is set to false.
# Valid values
# true
# false
Tokenizer.Column0.NonIdempotentTokens =
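When Tokenizer.ColumnN.LuhnCheck is set, the input itself must be Luhn compliant. The Luhn check referred to here is the standard checksum used for card numbers; the sketch below implements that standard algorithm (it is not the utility's internal code) so you can verify input values before a bulk run.

```python
def passes_luhn(number: str) -> bool:
    """Standard Luhn checksum: double every second digit from the
    right, subtract 9 from any doubled digit above 9, and require
    the total to be divisible by 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(passes_luhn("4111111111111111"))  # True  (a well-known Luhn-valid test PAN)
print(passes_luhn("4111111111111112"))  # False (last digit altered)
```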
Decryption configuration
To tokenize encrypted text, configure the Decryption Configuration section in the migration.properties file as shown here:
###############################
# Decryption Configuration
# Decryptor.Column0.Key
# Decryptor.Column0.Algorithm
# Decryptor.Column0.Encoding
# ...
# Decryptor.ColumnN.Key
# Decryptor.ColumnN.Algorithm
# Decryptor.ColumnN.Encoding
###############################
#
# Decryptor.ColumnN.Key
#
# Specifies key name for a column to be decrypted. If a column does not need to
# be decrypted, do not specify this parameter. If this parameter is specified,
# all other decryption parameters for the same column must also be specified.
#
Decryptor.Column0.Key = token_key
#
# Decryptor.ColumnN.Algorithm
#
# Specifies decryption algorithm for a column to be decrypted. If a column
# does not need to be decrypted, do not specify this parameter. If this
# parameter is specified, all other decryption parameters for the same column
# must also be specified.
#
# Valid values
# AES/CBC/PKCS5Padding
#
Decryptor.Column0.Algorithm = AES/CBC/PKCS5Padding
#
# Decryptor.ColumnN.Encoding
#
# Specifies encoding for a column to be decrypted. If a column does not need to
# be decrypted, do not specify this parameter. If this parameter is specified,
# all other decryption parameters for the same column must also be specified.
#
# Valid values
# Base16
# Base64
#
Decryptor.Column0.Encoding = Base16
Note: If the above-mentioned parameters are not used, they must be left blank.
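The Encoding parameter tells the utility how the encrypted column value is textually encoded in the input file before it is decrypted. The sketch below shows the same ciphertext bytes in both supported encodings, using Python's standard base64 module; the bytes are arbitrary illustration values, and the AES decryption itself is performed by the utility.

```python
import base64

# Arbitrary example bytes standing in for one AES ciphertext block.
ciphertext = bytes(range(16))

# The textual forms that Encoding = Base16 and Encoding = Base64 expect.
b16_text = base64.b16encode(ciphertext).decode()
b64_text = base64.b64encode(ciphertext).decode()
print(b16_text)  # 000102030405060708090A0B0C0D0E0F
print(b64_text)  # AAECAwQFBgcICQoLDA0ODw==

# Both forms decode back to the identical ciphertext bytes.
assert base64.b16decode(b16_text) == base64.b64decode(b64_text) == ciphertext
```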
Detokenization configuration
The TokenSpec Configuration and AlgoSpec Configuration sections of the detokenization.properties file contain all the information required for bulk detokenization of tokenized input data, for example, the token format used to detokenize a column. The following is a sample of the AlgoSpec Configuration and TokenSpec Configuration sections of the detokenization.properties file:
###########################################
# AlgoSpec Configuration
# Version
# Unicode
############################################
#
# AlgoSpec.Version
# Specifies the version for AlgoSpec
#
# Valid values
# 0
# 1
# Note: 0 is used for backward compatibility with SafeNet Vaultless
# Tokenization version 8.5.0. The default value is set to 1
AlgoSpec.Version = 1
# AlgoSpec.Unicode
# Specifies the Unicode Block to be used for bulk detokenization
# Specify the Unicode block that was used during bulk tokenization.
# Note: Refer to Tokenization and Detokenization of Unicode Blocks using AlgoSpec.
AlgoSpec.Unicode =
###########################################
# TokenSpec Configuration
# DeTokenizer.Column0.GroupId
# DeTokenizer.Column0.TokenFormat
# DeTokenizer.Column0.ClearTextSensitive
# DeTokenizer.Column0.LuhnCheck
# DeTokenizer.Column0.NonIdempotentTokens
# ...
# DeTokenizer.ColumnN.GroupId
# DeTokenizer.ColumnN.TokenFormat
# DeTokenizer.ColumnN.ClearTextSensitive
# DeTokenizer.ColumnN.LuhnCheck
# DeTokenizer.ColumnN.NonIdempotentTokens
############################################
Note: Specify the same TokenSpec parameters for bulk detokenization as used during bulk tokenization of the plaintext.
# DeTokenizer.ColumnN.GroupId
#
# Specifies the GroupId for this column.
# Default value : 0
#
DeTokenizer.Column0.GroupId = 0
#
# DeTokenizer.ColumnN.TokenFormat
#
# Specifies the token format that will be used to detokenize this column.
# A value must be provided for this parameter.
# If not specified, no detokenization is performed for this column.
#
DeTokenizer.Column0.TokenFormat = TOKEN_ALL
#
# DeTokenizer.ColumnN.ClearTextSensitive
#
# Specifies whether the token being detokenized is clearTextSensitive or not.
# It is valid only if token format is other than TOKEN_ALL.
# Default value : false
# Recommended value: true
#
DeTokenizer.Column0.ClearTextSensitive = true
#
# DeTokenizer.ColumnN.LuhnCheck
#
# Specifies whether the token being detokenized will pass or fail the Luhn check.
# If no value is provided, the Luhn check for the token is ignored.
# If a value is provided, make sure the input is Luhn compliant.
#
# Valid values
# true
# false
#
DeTokenizer.Column0.LuhnCheck =
#
# DeTokenizer.ColumnN.NonIdempotentTokens
#
# Specifies whether the token being detokenized is a non-idempotent token.
# It is applicable only when LuhnCheck is set to false.
#
DeTokenizer.Column0.NonIdempotentTokens =
Output configuration
The Output Configuration section of the migration.properties/detokenization.properties files contains all the information required to create the output data file. For example, the output data file name and path, or the destination database details, are passed to the bulk utility as parameters in the Output Configuration section of the properties file. The configuration section must contain the correct information, in the correct order.
The Output.Sequence parameter in the migration.properties/detokenization.properties file specifies the sequence in which input columns and tokenized columns are written to the output file. If a column of the input file is to appear in the output file, its column index must be specified in the output sequence. A sequence number can be either positive or negative: a positive sequence number indicates that the decrypted and/or tokenized value of the column is written to the output file (if the column was decrypted and/or tokenized), while a negative sequence number indicates that the original value from the input file is written. For columns that are neither decrypted nor tokenized, positive and negative sequence numbers have the same effect.
The following is a sample of the Output Configuration section of the migration.properties and detokenization.properties files:
######################
# Output Configuration
# Output.FilePath
# Output.Sequence
######################
#
# Output.FilePath
#
# Specifies full path to the output file
#
Output.FilePath = C://Desktop//migration//tokenized.csv
#
# Intermediate.FilePath
#
# Specifies the intermediate file path where the intermediate temporary chunks of
# outputs are stored.
#
# Note: If no intermediate file path is set, then the path specified in
# Output.FilePath is used as the intermediate file path.
#
Intermediate.FilePath =
#
# Output.Sequence
#
# Specifies sequence of the input columns in which they are written to the
# output file. Each column in the input file that has to appear in the
# output file has to have its column index specified in the output sequence.
# For each column in the input file, the sequence number can be either positive
# or negative. Positive sequence number indicates that the decrypted and/or
# tokenized value is written to the output file, if the column was decrypted
# and/or tokenized. Negative sequence number indicates that the original value
# from the input file is written to the output file. For columns that are not
# decrypted and not tokenized (pass-through columns) specifying positive or
# negative number has the same effect.
# Column indexes are separated by , character.
#
Output.Sequence = 0,-1,-2,-3,-4,5,6
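The sign convention above can be sketched as follows. Assume a row in which columns 0, 5, and 6 were tokenized; the sample sequence 0,-1,-2,-3,-4,5,6 then emits the tokenized values for columns 0, 5, and 6 and the original values for columns 1 through 4. All column values and tokens below are hypothetical illustrations, not output of the utility.

```python
# Hypothetical input row and the processed (tokenized) values for the
# columns that were actually tokenized.
original = ["4111111111111111", "Doe", "John", "NY", "10001", "555-0100", "a@b.com"]
processed = {0: "5454657321908721", 5: "831-9022", 6: "x@q.net"}

def apply_sequence(sequence, original, processed):
    """Positive index -> processed value if the column was processed,
    otherwise the original value; negative index -> always the
    original value from the input file."""
    out = []
    for s in sequence:
        idx = abs(s)
        if s >= 0 and idx in processed:
            out.append(processed[idx])
        else:
            out.append(original[idx])
    return out

row = apply_sequence([0, -1, -2, -3, -4, 5, 6], original, processed)
print(row)
# ['5454657321908721', 'Doe', 'John', 'NY', '10001', '831-9022', 'x@q.net']
```

For pass-through columns (1 through 4 here) there is no processed value, so a positive or negative index would yield the same original value, matching the note in the sample above.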