Configuring Tokenization Parameters
This section describes the Tokenization Configuration parameters.
Tokenization Configuration Overview
The tokenization configuration is a section in the migration.properties and masking.properties files. It contains all the information required to tokenize the input data: for example, the name of the token vault that will receive the tokens for a column, the token format used to tokenize a column, and whether the generated token will pass or fail the Luhn check.
A hidden parameter, PreserveSpace, has been introduced for File-to-File migration. If this parameter is enabled, spaces in the plaintext value are preserved during tokenization. This parameter is not applicable to any flavor of bulk masking.
Note
To preserve spaces in the generated tokens during File-to-File migration, set PreserveSpace=true in the migration.properties file.
If the PreserveSpace parameter is enabled, spaces are preserved in the following scenarios:
If the plaintext value contains spaces at leading or trailing positions, the spaces are preserved in the generated tokens.
For example:
Plaintext Value: ----356789------
Tokens: ----184326------

If the plaintext value contains only spaces, the spaces are preserved in the generated tokens.
For example:
Plaintext Value: ----------------
Tokens: ----------------

Note
In the above examples, - represents a space.
If the plaintext value is empty and the PreserveSpace parameter is enabled, the "Insert values cannot be empty" message is not displayed on the console.
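For example, to preserve spaces during a File-to-File migration, the parameter can be set in the migration.properties file as follows:

```properties
# Preserve leading/trailing spaces in plaintext values during
# File-to-File migration (not applicable to bulk masking)
PreserveSpace=true
```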
Below is a sample of the Tokenization Configuration section of the migration.properties file:
###########################################
# Tokenization Configuration
# Tokenizer.Column0.TokenVault
# Tokenizer.Column0.CustomDataColumnIndex
# Tokenizer.Column0.TokenFormat
# Tokenizer.Column0.LuhnCheck
# ...
# Tokenizer.ColumnN.TokenVault
# Tokenizer.ColumnN.CustomDataColumnIndex
# Tokenizer.ColumnN.TokenFormat
# Tokenizer.ColumnN.LuhnCheck
############################################
#
# Tokenizer.ColumnN.TokenVault
#
# Specifies the name of the token vault that will receive the tokens for
# this column. If the column does not need to be tokenized, do not specify
# this parameter. If this parameter is specified, all other tokenization
# parameters for the same column must also be specified. Token vault
# specified in this parameter must exist before running bulk migration.
#
Tokenizer.Column5.TokenVault = BTM
Tokenizer.Column6.TokenVault = BTM_2
#
# Tokenizer.ColumnN.CustomDataColumnIndex
#
# Specifies zero-based index of the column in the input file that contains
# custom data values for this column. If the column does not need to be
# tokenized, do not specify this parameter. If this parameter is specified,
# all other tokenization parameters for the same column must be specified.
#
Tokenizer.Column5.CustomDataColumnIndex = -1
Tokenizer.Column6.CustomDataColumnIndex = -1
#
# Tokenizer.ColumnN.TokenPropertyColumnIndex
#
# Specifies zero-based index of the column in the input file that contains
# Token Property values for this column. If the column does not need to be
# tokenized, do not specify this parameter. If this parameter is specified,
# all other tokenization parameters for the same column must be specified.
#
Tokenizer.Column5.TokenPropertyColumnIndex = 2
#
# Tokenizer.ColumnN.CustomTokenPropertyColumnIndex
#
# Specifies zero-based index of the column in the input file that contains
# Custom Token Property values for this column. If the column does not need to be
# tokenized, do not specify this parameter. If this parameter is specified,
# all other tokenization parameters for the same column must be specified.
#
Tokenizer.Column5.CustomTokenPropertyColumnIndex = 3
#
# Tokenizer.ColumnN.TokenFormat
#
# Specifies token format that will be used to tokenize this column. If the
# column does not need to be tokenized, do not specify this parameter. If
# this parameter is specified, all other tokenization parameters for the
# same column must also be specified.
# Token format may be specified using format name for built-in formats,
# or format identifier for built-in and custom formats. It is recommended
# to use format names for built-in formats, as documented in the CipherTrust Vaulted Tokenization
# User Guide. For custom formats, use format identifier returned by
# TokenService.createNewFormat API.
#
# Valid values
# <string>
# <number>
#
# <string> - is the name of a built-in token format, such as LAST_FOUR_TOKEN
# For complete list of supported token formats, refer to CipherTrust Vaulted Tokenization
# User Guide.
#
# <number> - is the format identifier (between 101 and 999) returned by
# TokenService.createNewFormat API.
#
Tokenizer.Column5.TokenFormat = LAST_FOUR_TOKEN
Tokenizer.Column6.TokenFormat = EMAIL_ADDRESS_TOKEN
#
# Tokenizer.ColumnN.LuhnCheck
#
# Specifies whether the generated token will pass or fail luhn check. If the
# column does not need to be tokenized, don't specify this parameter. If
# this parameter is specified, all other tokenization parameters for the
# same column must be specified.
#
# Valid values
# true
# false
#
Tokenizer.Column5.LuhnCheck = true
Tokenizer.Column6.LuhnCheck = false
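Putting these parameters together, a minimal configuration for a single column might look like the following sketch. The vault name and token format are illustrative; as noted above, the token vault must exist before running bulk migration, and all tokenization parameters for a column must be specified together:

```properties
# Minimal tokenization configuration for one column (illustrative values)
Tokenizer.Column0.TokenVault = BTM
Tokenizer.Column0.CustomDataColumnIndex = -1
Tokenizer.Column0.TokenFormat = LAST_FOUR_TOKEN
Tokenizer.Column0.LuhnCheck = true
```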
Below is a sample of the Tokenization Configuration section of the masking.properties file (for File-to-File tokenization):
###########################################
# Masking Configuration for File-to-File
# Mask.Column0.TokenFormat
# Mask.Column0.StartToken
# Mask.Column0.LuhnCheck
# Mask.Column0.InputDataLength
# ...
# Mask.ColumnN.TokenFormat
# Mask.ColumnN.StartToken
# Mask.ColumnN.LuhnCheck
# Mask.ColumnN.InputDataLength
# [Mandatory]
############################################
#
# Mask.ColumnN.TokenFormat
#
# Specifies token format that will be used to mask this column. If the
# column does not need to be masked, do not specify this parameter. If
# this parameter is specified, all other masking parameters for the
# same column must also be specified.
# Token format may be specified using format name for built-in formats,
# or format identifier for built-in formats. It is recommended
# to use format names for built-in formats, as documented in the CipherTrust Vaulted Tokenization
# User Guide.
#
# Valid values
# <string>
# <number>
#
# <string> - is the name of a built-in token format, such as LAST_FOUR_TOKEN
# For complete list of supported token formats, refer to CipherTrust Vaulted Tokenization
# User Guide.
#
# <number> - is the format identifier of a built-in token format
#
Mask.Column5.TokenFormat = RANDOM_TOKEN
Mask.Column6.TokenFormat = EMAIL_ADDRESS_TOKEN
#
# Mask.ColumnN.LuhnCheck
#
# Specifies whether the generated token will pass or fail luhn check. If the
# column does not need to be masked, don't specify this parameter.
# Valid values
# true
# false
Mask.Column5.LuhnCheck = false
Mask.Column6.LuhnCheck = false
#
# Mask.ColumnN.StartToken
#
# Specifies the start token when the TokenFormat is SEQUENTIAL_TOKEN.
Mask.Column0.StartToken =
#
# Mask.ColumnN.InputDataLength
#
# Specifies the input data length required for sequential tokens,
# i.e., when TokenFormat is SEQUENTIAL_TOKEN.
# For the SEQUENTIAL_TOKEN format, either Mask.ColumnN.StartToken or
# Mask.ColumnN.InputDataLength should be specified. In case both are
# specified, Mask.ColumnN.StartToken will be selected.
Mask.Column0.InputDataLength =
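As a sketch of the sequential-token case, the configuration below seeds Column0 with an explicit start token. The start token value is hypothetical; InputDataLength is left blank because StartToken takes precedence when both are specified:

```properties
# Illustrative SEQUENTIAL_TOKEN masking configuration for one column.
# StartToken value is hypothetical; InputDataLength is intentionally
# blank since StartToken wins when both are present.
Mask.Column0.TokenFormat = SEQUENTIAL_TOKEN
Mask.Column0.LuhnCheck = false
Mask.Column0.StartToken = 1000000000000000
Mask.Column0.InputDataLength =
```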
Below is a sample of the Tokenization Configuration section of the masking.properties file (for DB-to-DB tokenization):
#Masking Configuration for DB-to-DB
#
#SourceTable.<TableName1>=[column1,format,luhnCheck,startToken,inputDataLength]:
#[column2,format,luhnCheck,startToken,inputDataLength]......
#[columnN,format,luhnCheck,startToken,inputDataLength]
# ..........
# ..........
#SourceTable.<TableNameN>=[column1,format,luhnCheck,startToken,inputDataLength]:
#[column2,format,luhnCheck,startToken,inputDataLength]......
#[columnN,format,luhnCheck,startToken,inputDataLength]
#
#For SEQUENTIAL_TOKEN format either startToken or inputDataLength should be
# specified. In case both are specified, startToken will be used.
# inputDataLength specifies the input data length required for the
# sequential tokens.
#
############################################
Note: If any table specified here has a parent-child foreign key relationship, then the data of all
associated tables (apart from the tables mentioned in this parameter) will be copied to the
destination database tables.
SourceTable.<customer>=[CC_NUMBER,LAST_FOUR_TOKEN,true]:[EMAIL_ID,EMAIL_ADDRESS_TOKEN,false]
# NonMaskingTables
# [optional]
# Specifies the list of tables which are not to be masked. The table names are
# comma separated.
NonMaskingTables=Table4
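For a SEQUENTIAL_TOKEN column in the DB-to-DB format, the startToken (or inputDataLength) field is supplied inside the bracketed column definition. The table name, column name, and start token below are illustrative, not part of the product samples; inputDataLength is omitted because startToken takes precedence when both are specified:

```properties
# Illustrative: ACCOUNT_NO masked with SEQUENTIAL_TOKEN seeded by a
# hypothetical startToken value; luhnCheck is false.
SourceTable.<orders>=[ACCOUNT_NO,SEQUENTIAL_TOKEN,false,1000000000]
```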
Decryption Configuration Overview
For tokenization of encrypted text, the Decryption Configuration section in the migration.properties file must be configured as shown below:
###############################
# Decryption Configuration
# Decryptor.Column0.Key
# Decryptor.Column0.Algorithm
# Decryptor.Column0.Encoding
# ...
# Decryptor.ColumnN.Key
# Decryptor.ColumnN.Algorithm
# Decryptor.ColumnN.Encoding
###############################
#
# Decryptor.ColumnN.Key
#
# Specifies key name for a column to be decrypted. If a column does not need to
# be decrypted, do not specify this parameter. If this parameter is specified,
# all other decryption parameters for the same column must also be specified.
#
Decryptor.Column0.Key = token_key
#
# Decryptor.ColumnN.Algorithm
#
# Specifies decryption algorithm for a column to be decrypted. If a column
# does not need to be decrypted, do not specify this parameter. If this
# parameter is specified, all other decryption parameters for the same column
# must also be specified.
#
# Valid values
# AES/CBC/PKCS5Padding
#
Decryptor.Column0.Algorithm = AES/CBC/PKCS5Padding
#
# Decryptor.ColumnN.Encoding
#
# Specifies encoding for a column to be decrypted. If a column does not need to
# be decrypted, do not specify this parameter. If this parameter is specified,
# all other decryption parameters for the same column must also be specified.
#
# Valid values
# Base16
# Base64
#
Decryptor.Column0.Encoding = Base16
Note
If the above-mentioned parameters are not used, they must be left blank.
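As a hedged illustration of the Base64 encoding option, a second encrypted column could be configured analogously. The key name token_key_2 is hypothetical, and all three decryption parameters must be specified together, as described above:

```properties
# Hypothetical second column: same AES/CBC algorithm, Base64-encoded ciphertext
Decryptor.Column1.Key = token_key_2
Decryptor.Column1.Algorithm = AES/CBC/PKCS5Padding
Decryptor.Column1.Encoding = Base64
```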