Bulk Utility
CT-V provides a command line bulk token utility that tokenizes and detokenizes very large data sets at high speed. This utility is controlled through variables exposed in the configuration files (migration.properties, detokenization.properties, and masking.properties), which are provided in the Tokenization/lib/ext directory.
The CT-V Bulk Utility allows users to perform bulk tokenization and detokenization of plaintext. The following tasks can be performed through the Bulk Utility:
Tokenization of plaintext from File-to-File:
using token vault
without using token vault
Tokenization of plaintext from database to database without using token vault.
Detokenization of tokens from File-to-File using token vault.
Note
Tokens generated without using the token vault (i.e., using the masking.properties configuration file) cannot be detokenized.
The following table outlines the operations of CT-V Bulk Utility:
Operation | Operation Type | Properties File | Token Vault | Sequential Token Generation |
---|---|---|---|---|
Tokenization | File-to-File | migration.properties | Required | Can be set during token vault creation using KeySecure Classic or utilities. |
Tokenization | File-to-File | masking.properties | Not required | Can be set through masking.properties file. |
Tokenization | Database-to-Database | masking.properties | Not required | Can be set through masking.properties file. |
Detokenization | File-to-File | detokenization.properties | Required | Can be set during token vault creation using KeySecure Classic or utilities. |
The user specifies, at the command prompt, the operation to be performed by the utility (tokenization or detokenization) and the properties file to be used. For tokenization, the migration.properties or masking.properties file is used; for detokenization, the detokenization.properties file is used.
Note
Bulk masking using the masking.properties file is not supported for the Informix database. The Bulk Utility with masking does not support custom formats for -ftf and -dtd operations.
CT-V Bulk Utility tokenizes large quantities of data in any of the following forms:
Data tokenization from clear text.
Data tokenization from already encrypted text. If the input data is already in encrypted form, the utility can decrypt this data and then tokenize it.
Data tokenization of clear text in a database table.
Data detokenization from delimited type input data.
Data detokenization from positional type input data.
CT-V Bulk Utility uses a multi-threaded infrastructure to provide high-performance data transfer across a data pipeline. Users can modify certain parameters (Threads.BatchSize, Threads.CryptoThreads, Threads.TokenThreads, and Threads.PollTimeout) in the migration.properties, masking.properties, and detokenization.properties files to improve performance in different bulk tokenization and detokenization scenarios. The utility provides live performance monitoring data, as well as results (totals and performance data) for completed migration tasks, to help inform and optimize performance.
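As a sketch, the thread-tuning keys named above might appear in migration.properties as follows. The values shown are illustrative only, not recommended defaults; tune them for your workload and hardware:

```properties
# Illustrative values only -- tune for your workload and hardware.
# Number of records processed per batch:
Threads.BatchSize=10000
# Worker threads for cryptographic operations:
Threads.CryptoThreads=4
# Worker threads for token generation:
Threads.TokenThreads=4
# Poll timeout for the internal work queues:
Threads.PollTimeout=100
```

Larger batch sizes generally improve throughput at the cost of memory, while the thread counts are typically sized to the available CPU cores.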
The utility works with data in flat files for File-to-File operations. The input data file must be correctly populated and formatted, as it supplies the data to be tokenized or detokenized. The utility includes a data file reader that reads large flat files and supplies the data to the reader thread. The utility can also read data directly from the database for Database-to-Database operations.
Note
At this time, the utility cannot identify and skip individual data elements that contain errors; a file containing an error is rejected as a whole. Make sure all of the data adheres to the descriptions set up in the properties file.
The user supplies the plain or tokenized data through an input data file and sets the parameters in the migration.properties, masking.properties, or detokenization.properties file (and, if required, the SfntDbp.properties and SafeNetToken.properties files). These properties files allow the user to control the tokenization and detokenization of data by setting parameters such as the format of the input data file, the location of the input data file, the token format, the number of records to be tokenized at a time, the location of the output file, and the sequence in which columns appear in the output file.
CT-V Bulk Utility tokenizes/detokenizes the data and saves it to the output file/destination database, as per the parameters set in the properties file. For tokenization/detokenization using the token vault, the output is also stored in the database.
Note
Multisite is not supported in CT-V Bulk Utility.
Supported Platforms
CT-V Bulk Utility is Java based and should therefore run on any platform with a supported Java runtime; however, it has been tested and verified on the following platforms:
Windows 2008 R2 Enterprise Server 64-bit
Windows 2012 Enterprise Server 32-bit
Linux (RedHat6)
Supported Databases
The following databases are supported by the CT-V Bulk Utility:
Oracle: 11g, 12c, 18c, 19c, 21c
SQL Server: 2008, 2012, 2014, 2016, 2017, 2019
MySQL: 5.6, 5.7, 8.0 (for MySQL 8.0, the Java runtime environment version must be 8 or above)
Informix: 12.10
Supported Data Types for Bulk Migration (Without Using Token Vault)
The following table shows the data types supported for DB-to-DB masking for MS SQL, MySQL and Oracle:
Data Type | MS SQL | MySQL | Oracle |
---|---|---|---|
CHAR | Yes | Yes | Yes |
VARCHAR | Yes | Yes | Yes |
NCHAR | Yes | Yes | Yes |
NVARCHAR | Yes | Yes | Yes |
INT | Yes | Yes | Yes |
SMALLINT | Yes | Yes | No |
TINYINT | Yes | Yes | No |
MEDIUMINT | No | Yes | No |
REAL | Yes | No | No |
BIGINT | Yes | Yes | No |
DECIMAL | Yes | Yes | Yes |
FLOAT | Yes | Yes | Yes |
NUMERIC | No | No | No |
DOUBLE | No | Yes | No |
DATE | Yes | Yes | Yes |
DATETIME | Yes | Yes | No |
TIMESTAMP | No | Yes | Yes |
Note
The user must create the schema for the destination database, and it must be the same as the schema of the source database.
For the sequential token format, if there is a duplicate value in a batch of the input data, the sequence breaks and a value is skipped in the next batch. For example, suppose the input file contains the data 1, 2, 2, 3, 4, 5, 6, 7 and is run in batches of 4.
The sequential output is generated as 11, 12, 12, 13, 15, 16, 17, 18. The duplicate value 2 results in the tokenized value 15, instead of 14, for the corresponding input value 4.
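The skip described above can be modeled in a few lines of Python. This is an illustrative simulation of the documented behavior, not CT-V code; the function name and starting value are hypothetical. The key point is that the sequence counter advances by the full batch size per batch, while a duplicate within a batch reuses an existing token, leaving a gap:

```python
def sequential_tokens(values, batch_size, start=11):
    """Model of sequential token generation with per-batch duplicate reuse."""
    out = []
    batch_start = start
    for i in range(0, len(values), batch_size):
        batch = values[i:i + batch_size]
        seen = {}                # duplicates within a batch reuse the same token
        next_token = batch_start
        for v in batch:
            if v not in seen:
                seen[v] = next_token
                next_token += 1
            out.append(seen[v])
        # The counter advances by the full batch size, so a duplicate
        # leaves a skipped value at the start of the next batch.
        batch_start += batch_size
    return out

print(sequential_tokens([1, 2, 2, 3, 4, 5, 6, 7], batch_size=4))
# → [11, 12, 12, 13, 15, 16, 17, 18]
```

Running the model on the example input reproduces the documented output, with 14 skipped because the duplicate 2 consumed only one token from the first batch's reserved range.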
Four token formats are supported for the Date data type: RANDOM_TOKEN, FIRST_SIX_LAST_FOUR_TOKEN, LAST_FOUR_TOKEN, and LAST_SIX_TOKEN. For the Oracle database, it is not recommended to use the sequential token format for the Date data type, as the tokens produce a change only in the millisecond field of the Date. Oracle does not directly store milliseconds for the Date data type, so the user will get the same values for the date columns.
If any data type other than those listed in the preceding table is provided, the data is copied to the destination database table as-is.