Administration
CT-V is a Java-based data tokenization application which can be run as a web service or an API. CT-V consists of:
Tokens: Unique values created to take the place of valuable plaintext (such as a credit card number or email address) normally stored in your database. Tokens are the same data type as the plaintext. Credit card numbers can maintain whitespace and dashes used for formatting. Email addresses maintain @ and . characters. Valid dates remain valid dates. Tokens can be of any length allowed by your database.
Token Vault: Table that stores tokens, their corresponding ciphertext, and the MAC values. There can be multiple token vaults in a database. All tokens in a vault must use the same token format.
Key Table: The table that maps token vaults to encryption keys; one key table per database.
Java API: Offers public methods to create, retrieve, and delete tokens, rotate keys, and create token formats.
Java Web Service: Allows you to create, retrieve, and delete tokens, and to create token formats through a web service.
How to Use
Configure the CT-V by creating a user, an encryption key, and a MAC key on the Key Manager. Use the KeySecure Classic or utilities to create the token vault in your database. Run the installation program (TokenizationInstaller-8.12.4.000.jar). These steps are detailed in Installation.
You can then call the CT-V API and/or web service.
To use the API in your application, import the package and create a TokenService object. Your application can then create, retrieve, and delete tokens, and create new token formats. The insert(), get(), deleteToken(), and deleteValue() methods allow for single or batch operations. Using one TokenService object, your application can call these methods in combination or multiple times. You do not need to create a new TokenService object each time you want to use one of the other APIs; you can simply reuse the same TokenService object.
When finished with the CT-V, call closeService() to close the connections with the Key Manager and the database. The complete API is detailed in the javadoc accompanying the software and in Using CipherTrust Vaulted Tokenization Java APIs.
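The lifecycle described above (create one service object, reuse it for several operations, then close it) can be sketched as follows. This is only an illustration: DemoTokenService is a hypothetical in-memory stand-in, not the CT-V TokenService class, and its parameter and return types are assumptions. Refer to the javadoc for the real constructor and method signatures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for the real TokenService; NOT the CT-V API.
// Method names follow the text above; signatures are assumed for illustration.
final class DemoTokenService {
    private final Map<String, String> tokenToValue = new HashMap<>();
    private boolean open = true;

    String insert(String value) {
        ensureOpen();
        String token = UUID.randomUUID().toString(); // placeholder token
        tokenToValue.put(token, value);
        return token;
    }

    String get(String token) {
        ensureOpen();
        return tokenToValue.get(token);
    }

    boolean deleteToken(String token) {
        ensureOpen();
        return tokenToValue.remove(token) != null;
    }

    void closeService() {
        open = false; // the real service would release its connections here
    }

    boolean isOpen() {
        return open;
    }

    private void ensureOpen() {
        if (!open) {
            throw new IllegalStateException("service is closed");
        }
    }
}

final class TokenLifecycleDemo {
    // Runs several operations on ONE service object, then closes it.
    static String roundTrip(DemoTokenService service, String value) {
        try {
            String token = service.insert(value);
            String recovered = service.get(token);
            service.deleteToken(token);
            return recovered;
        } finally {
            service.closeService(); // always close, even if an operation fails
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new DemoTokenService(), "4111-1111-1111-1111"));
    }
}
```

Placing closeService() in a finally block is a design choice of this sketch; it ensures the connections are released even when one of the token operations throws.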
Note
When using the CT-V in a multithreaded application, we recommend that you use no more than 10 threads per single CPU machine.
To call the SOAP web service for Java Developers directly, reference the endpoint URL (http://<Host>:8080/axis2/services/SafeNetTokenizer) in your application (this assumes you are using the default Tomcat port, 8080). Your application can then call the methods to create, retrieve, and delete tokens, and create new token formats.
The complete webservice API is detailed in SOAP Web Service for Java Developers.
To call the SOAP web service over XML, review the WSDL and build your stubs in accordance with those definitions. The WSDL is available at http://<Host>:8080/axis2/services/SafeNetTokenizer?wsdl.
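Before generating stubs, it can be useful to confirm that the WSDL is reachable. The sketch below builds the WSDL URL given above and requests it with the standard HttpURLConnection class. The class name WsdlProbe, the timeout values, and the default host are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class WsdlProbe {
    // Builds the WSDL URL from the pattern given in the text.
    static String wsdlUrl(String host, int port) {
        return "http://" + host + ":" + port + "/axis2/services/SafeNetTokenizer?wsdl";
    }

    // Returns the HTTP status code of a GET request to the given URL.
    static int fetchStatus(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000); // assumed timeout, in milliseconds
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost"; // assumed default
        String url = wsdlUrl(host, 8080); // 8080 is the default Tomcat port
        try {
            System.out.println(url + " -> HTTP " + fetchStatus(url));
        } catch (IOException e) {
            System.out.println(url + " is not reachable: " + e.getMessage());
        }
    }
}
```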
Features
Aside from the standard features of creating, retrieving, and deleting tokens, the CT-V offers the following additional features:
Support for .Net API and WebService: CT-V Java API methods can now be called using equivalent .Net API methods and Web Services deployed with IIS 6/7/8. This provides CT-V with language interoperability across several programming languages, including C#, C++/CLI, JScript .Net, and VB.Net, among others.
Configurable Token Lengths: Tokens can be either the same size as the input data, or any other size. Token size is determined by the token format and cannot be larger than the TOKEN column you configure for the token vault.
Customer-Designed Token Formats: You can create your own token formats using the createNewFormat() method. The createNewFormat() method uses Java regular expressions to simplify the process of creating a new token format. New formats are persistent: they are stored in the SFNT_TOKEN_FORMAT table in the database. For more information, see Creating New Token Formats.
Customer-Designed Masking Formats: You can create your own masking formats using the createMaskingFormat() method. New formats are persistent: they are stored in the SFNT_TOKEN_FORMAT table in the database.
Database Connection Pooling: When the CT-V is initiated from the API or web service, it creates a connection pool that stores connections to the token vault’s database. You can configure the minimum and maximum number of connections allowed in the pool, as well as set the maximum idle time for a connection.
Support for Offline Tokenization with local cryptographic operations: CT-V supports local encryption using symmetric key caching. This means that all token generation operations, including cryptographic operations, can be processed locally. Performing cryptographic operations locally can improve performance, since it reduces round trips to the server.
Note
Persistent key caching feature is deprecated in CT-V.
Database Support: The CT-V, as delivered, supports Oracle, MySQL, and Informix databases over TCP, and SQL Server databases over TCP and SSL. The cloud-based database Amazon Aurora (MySQL) is also supported. The CT-V framework also enables you to configure CT-V to work with other database types; the APIs used to provide this support are documented in CipherTrust Vaulted Tokenization for Other Databases. Contact the Thales Professional Services team for more information.
Support for Unicode: CT-V provides end-to-end support for Unicode (UTF-16). When creating tokens for a Token Vault database, you can pass any input values (that is, any character in any of the world's living languages). CT-V can recognize the Unicode encoding of plaintext characters and select the appropriate Unicode characters (from the same character sets) to use when encoding the token. You also have the option to identify specific sets of Unicode characters (codePointValues) in the tokenization input, and to specify sets of Unicode characters to be used when creating the corresponding tokens. For details, refer to the unicode.properties file and the appropriate unicode sample file for your development environment.
Note
Unicode is not supported for Informix database.
NonIdempotentVaults: Non-idempotent tokens are created for the colon-delimited vaults listed against the NonIdempotentVaults parameter in the SafeNetToken.properties file. For each vault in the list, a new and unique token is generated every time, even if the same plaintext value is supplied again. When this feature is used, the CT-V methods that accept cleartext to return or delete tokens will generate an error: if the name of such a token vault is passed to one of the getToken or deleteValue CT-V methods, the method throws an exception noting that the operation is not supported.
Note
From this release onward, non-idempotent token vaults are not supported for the MySQL database.
Create Tokens while Bypassing Token Vault: Want to use existing production data in your test database without worrying about managing confidential customer information? You can tokenize the sensitive information without storing the encrypted data in the token vault by using the mask() method. mask() leaves the token vault out of the process, so the token cannot be traced back to the original plaintext data. Personally Identifiable Information, such as email addresses and credit card numbers, can be tokenized and used in your test environments, giving you useful, real-world samples that reflect your production data.
GUI for Token Vault Configuration: The KeySecure Classic UI includes a separate page for creating and viewing token vaults. The GUI enables you to set the maximum token size, select the encryption key, and assign the tablespace for the token vault.
Installation Program: Use the TokenizationInstaller-8.12.4.000.jar to perform installation of the API or web service for Java Developers. See Installing CipherTrust Vaulted Tokenization. The .Net API or the web service is installed automatically by an InstallShield Wizard process.
Pre-Defined Token Formats: The CT-V includes a list of predefined token formats. All tokens in a vault must use the same format. Token formats are further explained in Using Token Formats.
Bulk Utility: CT-V provides a command line utility that enables you to tokenize/detokenize a very large data set. Bulk Utility performs broadly two kinds of operations: file-to-file tokenization/detokenization using the token vault; and file-to-file/database-to-database tokenization without using the token vault. For more details, refer to the CT-V Bulk Utility User Guide.
Multi-Site Support: The multi-site feature enables customers to control how CT-V behaves when deployed across multiple data centers. Customers can use the database partitioning scheme to determine how tokens are created, propagated, and retrieved. There is no limit to the number of data centers, although all sites must use the same database login credentials.
For vaults that employ the partitioning scheme, CT-V uses the plaintext to determine which data center will do the initial insert. When propagating the token to vaults on other sites, CT-V propagates it after completing the initial insert. When retrieving tokens, CT-V first searches the local data center. If not found, it searches the remote sites. Multi-Site Support is further explained in Multi-Site Support.
Logging: When a token operation is performed (create, get, delete, etc.), the CT-V records the operation name, the client’s IP address, username, and the date and time to a log file on the client machine. Optionally, the token value itself can be logged. This log can be uploaded to the Key Manager Client Event Log. For details, see Configuring Logging Features.
SysLog Support: CT-V supports logging to a syslog server over the TCP and UDP protocols. It can be initialized by setting the SysLog_IP, SysLog_Port and SysLog_Protocol parameters in the IngrianNAE.properties file. Only one of syslog logging or the default file-based logging of CT-V can be used at a time. If both values are specified, then SysLog_IP will be used.
Note
IPv6 is not supported for Syslog server.
Luhn Checking: The CT-V can apply a Luhn check to numeric tokens in order to ensure that the token is a valid number combination that can be used in client applications. Likewise, the CT-V can apply a negative Luhn check to ensure that tokens cannot be mistaken for cleartext values.
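For reference, the standard Luhn check itself can be sketched as below. This is a generic implementation of the public algorithm, not CT-V code; the class and method names are chosen for illustration, and ignoring spaces and dashes is an assumption of this sketch.

```java
final class LuhnCheck {
    // Returns true when the digits (ignoring spaces and dashes) pass the Luhn check.
    static boolean passesLuhn(String number) {
        int sum = 0;
        int digits = 0;
        boolean doubleIt = false; // every second digit from the right is doubled
        for (int i = number.length() - 1; i >= 0; i--) {
            char ch = number.charAt(i);
            if (ch == ' ' || ch == '-') {
                continue; // formatting characters are skipped
            }
            if (ch < '0' || ch > '9') {
                return false; // not a numeric value
            }
            int d = ch - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
            digits++;
        }
        return digits > 0 && sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(passesLuhn("4111-1111-1111-1111"));
    }
}
```

In these terms, a token that satisfies a Luhn check makes passesLuhn return true, while a token that satisfies a negative Luhn check makes it return false.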
CustomData: The customer-specific data associated with the plaintext value(s). The custom data can be any data that defines or outlines the domain of the plaintext or any other meaningful information of the plain text as per user’s requirement. The custom data cannot be retrieved through APIs after the tokens are generated and hence any meta data for the TokenProperty should not be used as custom data. For Oracle, SQL Server, and Informix, the uniqueness of a token is determined by the combination of plaintext and customdata. For example, if you insert the same plaintext (named P1) with different customdata (named C1 and C2), CT-V will always generate a unique token as shown here:
Plaintext | CustomData | Token |
---|---|---|
P1 | C1 | T1 |
P1 | C2 | T2 |
However, for MySQL, the uniqueness of the token is determined only by the plaintext. Tokenization of the same plaintext with different customdata is not supported.
Smart Check: Enables you to bypass errors in bulk tokenization or detokenization input. Previously, if you used a process that tokenized or detokenized bulk data, an array containing a few invalid input values or one null element could cause the failure of a process acting on thousands of other valid values. Java and .Net API Methods and Web Services now enable your bulk process to skip bad data in an otherwise correct array and to continue processing. New versions of the insert() and get() methods are overloaded, and feature a saveExceptions parameter. When this parameter is set to true, the enhanced method is applied. TmResult provides information regarding the failure or failures. Using TmResult, you can determine whether errors occurred, get the type of error and find any bad data. Bulk tokenization problems due to small data errors are isolated and minimized, and recovery is easier and faster.
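The skip-and-report behavior can be illustrated with a small self-contained sketch. Everything here is hypothetical: Result only loosely imitates the role of TmResult, tokenizeOne is a placeholder that reverses digit strings, and the insert signature is an assumption rather than the CT-V method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BulkSketch {
    // Hypothetical result holder, loosely analogous in role to TmResult.
    static final class Result {
        final List<String> tokens = new ArrayList<>();             // null for failed entries
        final Map<Integer, String> errors = new LinkedHashMap<>(); // input index -> message

        boolean hasErrors() {
            return !errors.isEmpty();
        }
    }

    // Placeholder "tokenizer" (assumption): accepts digit strings and reverses them.
    static String tokenizeOne(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("null or empty input");
        }
        if (!value.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("non-numeric input");
        }
        return new StringBuilder(value).reverse().toString();
    }

    // With saveExceptions=false the first bad element aborts the whole batch;
    // with saveExceptions=true bad elements are recorded and processing continues.
    static Result insert(List<String> values, boolean saveExceptions) {
        Result result = new Result();
        for (int i = 0; i < values.size(); i++) {
            try {
                result.tokens.add(tokenizeOne(values.get(i)));
            } catch (IllegalArgumentException e) {
                if (!saveExceptions) {
                    throw e;
                }
                result.tokens.add(null);
                result.errors.put(i, e.getMessage());
            }
        }
        return result;
    }
}
```

Keeping a null placeholder in the token list preserves the positions of the inputs, so each error index can be matched back to the element that caused it.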
Last Access Date: The LastAcessDate column in the Token Vault table stores the date on which the token was last accessed. This feature is enabled by default; to disable it, add the UpdateLastAccessDate parameter with the value ‘No’ to the SafeNetToken.properties file.
Count of Deleted Token/Tokens: The overloaded method DeleteTokenEx() (single and batch tokens) is applicable for both the API and web services. When DeleteTokenEx() is called, the specified tokens are deleted from the database and the method returns the total number of tokens deleted.
Obfuscate Password: This feature allows you to use an obfuscated password wherever applicable. You can obfuscate the password using the obfuscator utility. To use this feature, the PasswordObfuscation parameter in the SafeNetToken.properties file must be set to ‘true’. By default, this parameter is disabled.
Credential Obfuscation: This feature allows you to use obfuscated Key Manager and database credentials. You can obfuscate the Key Manager and database credentials using the obfuscator utility. To use this feature, the CredentialObfuscation parameter in the SafeNetToken.properties file must be set to ‘true’. By default, this parameter is disabled.
SearchPurge Utility: This utility is used to search and purge the tokens/values. The search/purge can be conducted based on the parameters specified in the Utility.properties file. Refer to SearchPurge Utility for details.
SAP Integration: Thales provides an integration of CT-V with SAP. Refer to CT-V and SAP Integration Guide for details.
Role Based Access: Access to CT-V can be granted on the basis of the user’s role. Some users may have permission to create tokens (tokenize) while other users may have permission to retrieve plaintext (detokenize) from tokens. To perform tokenization, users must have the encrypt permission on the cipher key and the hash permission on the hash key. To perform detokenization, users must have the decrypt permission on the cipher key and the hash permission on the hash key.
TokenProperty: TokenProperty is a user defined property of a token or tokens. It consists of five different properties: Priority, Status, Property 1, Property 2 and CustomTokenProperty.
Priority (3 characters) and Status (4 characters) are named by Thales as per EMVCo LLC guidelines. These properties can define the priority level and status of the token. However, users are free to define and assign values to them as required.
Property 1 (5 characters) and Property 2 (4 characters) are for future use; users can name these and assign values as required.
CustomTokenProperty differs from the first four in that it is used to store a long string of up to 255 characters, including special characters.
Users can search for tokens by any of the five properties; the order of the properties must be specified. The following diagram outlines the TokenProperty and characters allowed in each one.
Note
A customTokenProperty that contains only spaces is treated as null.
CheckTokenHash: CT-V provides the feature to check whether a generated token duplicates any of the input plaintext values or the existing tokens in the token vault. To use this feature, the CheckTokenHash and NumTMThreads parameters in the SafeNetToken.properties file must be set to ‘true’ and ‘1’ respectively.
Tip
For better performance, use batch size of 10,000 or less when using this feature.
Verhoeff Checking: CT-V provides the feature to ensure that the generated tokens are not compliant with Verhoeff’s algorithm. To use this feature, the VerhoeffCheck parameter in the SafeNetToken.properties file must be set to ‘true’. By default, the VerhoeffCheck parameter is set to ‘false’. For example, tokenizing an Aadhaar number with VerhoeffCheck=true ensures that the generated token does not match any Aadhaar number (as Aadhaar numbers are Verhoeff compliant).
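For reference, a generic Verhoeff validation can be sketched as below, using the standard multiplication and permutation tables of the public algorithm. This is not CT-V code; the class and method names are chosen for illustration. A token produced with VerhoeffCheck enabled would be expected to make such a check return false.

```java
final class VerhoeffCheck {
    // Multiplication table of the dihedral group D5.
    private static final int[][] D = {
        {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
        {1, 2, 3, 4, 0, 6, 7, 8, 9, 5},
        {2, 3, 4, 0, 1, 7, 8, 9, 5, 6},
        {3, 4, 0, 1, 2, 8, 9, 5, 6, 7},
        {4, 0, 1, 2, 3, 9, 5, 6, 7, 8},
        {5, 9, 8, 7, 6, 0, 4, 3, 2, 1},
        {6, 5, 9, 8, 7, 1, 0, 4, 3, 2},
        {7, 6, 5, 9, 8, 2, 1, 0, 4, 3},
        {8, 7, 6, 5, 9, 3, 2, 1, 0, 4},
        {9, 8, 7, 6, 5, 4, 3, 2, 1, 0}
    };

    // Position-dependent permutation table (period 8).
    private static final int[][] P = {
        {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
        {1, 5, 7, 6, 2, 8, 3, 0, 9, 4},
        {5, 8, 0, 3, 7, 9, 6, 1, 4, 2},
        {8, 9, 1, 6, 0, 4, 3, 5, 2, 7},
        {9, 4, 5, 3, 1, 2, 6, 8, 7, 0},
        {4, 2, 8, 6, 5, 7, 3, 9, 0, 1},
        {2, 7, 9, 3, 8, 0, 6, 4, 1, 5},
        {7, 0, 4, 6, 9, 1, 3, 2, 5, 8}
    };

    // Returns true when the digit string (including its check digit) is Verhoeff compliant.
    static boolean passesVerhoeff(String number) {
        int len = number.length();
        if (len == 0) {
            return false;
        }
        int c = 0;
        for (int i = 0; i < len; i++) {
            char ch = number.charAt(len - 1 - i); // process digits right to left
            if (ch < '0' || ch > '9') {
                return false;
            }
            c = D[c][P[i % 8][ch - '0']];
        }
        return c == 0;
    }

    public static void main(String[] args) {
        System.out.println(passesVerhoeff("2363"));
    }
}
```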
RestrictedTokenLeadingChars: CT-V provides the feature to ensure that the generated tokens do not start with the characters specified for this parameter. For example, if the value of the RestrictedTokenLeadingChars parameter in the SafeNetToken.properties file is set to 77, the generated token will not start with 77. By default, the value of RestrictedTokenLeadingChars parameter is set to ‘0’. This parameter is not applicable in the following scenarios:
If you want to preserve the leading characters of the plain text in the generated tokens.
If you want to set the lead mask characters for the generated tokens.
Silent Installation: CT-V provides the feature to install the application in silent mode for Java API and Web Services. To use this feature, either configure the SilentInstallation.properties file or specify the parameters on the command line interface. For more information, see Silent Installation.
Note
Silent Installation for Java API and Web Services works with Chef Scripts.
Support for Multiple Databases: CT-V provides the feature to configure and use token vaults in multiple databases in the Java APIs and REST web services. You can use this feature if you have independent token vaults in multiple databases. In the Java API, you can configure a database by providing database properties such as dbHost, dbPort, and dbName while creating the TokenService instance. Refer to TokenService Constructor with Database Properties for details. In REST web services, a file named Databases.json is used to configure the mapping of token vaults and their respective databases. Refer to Configuring Multiple Databases in REST APIs for details. This feature is supported for MySQL database only.
_JdbcUrlOverride: This is an optional parameter to connect CT-V with the database. It overrides the existing JDBC URL for a database connection. To use this parameter, set the _JdbcUrlOverride property in the SafeNetToken.properties file. The JDBC URL syntax for the different databases is as follows:
For MySQL: _JdbcUrlOverride=jdbc:mysql://<DB_HOST>:<PORT>/<database_name>?useUnicode=true&characterEncoding=UTF-8&useSSL=false
For Informix: _JdbcUrlOverride=jdbc:informix-sqli://<DB_HOST>:<PORT>/<database_name>:INFORMIXSERVER=<instanceName>
For SQL Server: _JdbcUrlOverride=jdbc:sqlserver://<DB_HOST>:<PORT>;sendStringParametersAsUnicode=true;selectMethod=direct;responseBuffering=full;databaseName=<database_name>;encrypt=true;trustServerCertificate=true
For Oracle: _JdbcUrlOverride=jdbc:oracle:thin:@<DB_HOST>:<PORT>:<database_name>
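As an illustration, the MySQL and Oracle templates above can be filled in programmatically. The sketch below only formats strings according to those templates; the class name, method names, and sample host values are assumptions, and it does not open a database connection.

```java
final class JdbcUrlSketch {
    // Fills in the MySQL template shown above.
    static String mysqlUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8&useSSL=false";
    }

    // Fills in the Oracle template shown above.
    static String oracleUrl(String host, int port, String database) {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + database;
    }

    // Formats the property line as it would appear in SafeNetToken.properties.
    static String overrideLine(String jdbcUrl) {
        return "_JdbcUrlOverride=" + jdbcUrl;
    }

    public static void main(String[] args) {
        // db.example.com and the database names are placeholder values.
        System.out.println(overrideLine(mysqlUrl("db.example.com", 3306, "tokens")));
        System.out.println(overrideLine(oracleUrl("db.example.com", 1521, "ORCL")));
    }
}
```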
Supported Configuration
To use the CT-V as an API, you must configure:
Product | Version |
---|---|
CipherTrust Manager | 2.2 or higher. |
Client Application Environment | Sun JVM version 8 (minimum 1.8.0_111), 10, 11, or 17. |
Database | Oracle 11g, 12c, 18c, 19c, 21c; SQL Server 2008, 2012, 2014, 2016, 2017, 2019; MySQL 5.6, 5.7, 8.0 (For MySQL 8.0, the Java runtime environment version must be 8 or above.); and Informix 12.10. |
Policy files | The Java encryption policy files for unlimited strength ciphers. |
You can configure CT-V to support databases other than the ones specified above. Contact Thales Professional Services for full details. The document CipherTrust Vaulted Tokenization for Other Databases provides the necessary API documentation. The level of Thales support for CT-V implementations using other databases depends on the specific agreements that you reach with Thales Professional Services.
To use the CT-V as a SOAP web service, you must configure:
Product | Version |
---|---|
CipherTrust Manager | 2.2 or higher. |
Web server | Axis2 1.7.8 and Tomcat (versions 6 to 9 supported). |
Database | Oracle 11g, 12c, 18c, 19c, 21c; SQL Server 2008, 2012, 2014, 2016, 2017, 2019; MySQL 5.6, 5.7, 8.0 (For MySQL 8.0, the Java runtime environment version must be 8 or above.); and Informix 12.10. |
Policy files | The Java encryption policy files for unlimited strength ciphers. |
To use the CT-V as a REST web service, you must configure:
Product | Version |
---|---|
CipherTrust Manager | 2.2 or higher. |
Web server | Tomcat server (versions 6 to 9 supported). |
CXF jar | |
Database | Oracle 11g, 12c, 18c, 19c, 21c; SQL Server 2008, 2012, 2014, 2016, 2017, 2019; MySQL 5.6, 5.7, 8.0 (For MySQL 8.0, the Java runtime environment version must be 8 or above.); and Informix 12.10. |
Policy files | The Java encryption policy files for unlimited strength ciphers. |
Tip
Run the web services and database services on separate platforms and configure them in clusters for redundancy.
CT-V Bulk Utility
CT-V provides a command line bulk token utility that enables high-speed tokenization/detokenization of a very large data set. This utility is controlled through variables exposed in the configuration files (migration.properties, detokenization.properties and masking.properties). These are provided in the Tokenization/lib/ext directory.
CT-V Bulk Utility allows users to perform bulk tokenization/detokenization of plaintext. The following tasks can be performed through Bulk Utility:
Tokenization of plaintext from File-to-File:
using token vault
without using token vault
Tokenization of plaintext from database to database without using token vault.
Detokenization of tokens from File-to-File using token vault.
Note
Tokens generated without using the token vault (i.e. using the masking.properties configuration file) cannot be detokenized.
The following table outlines the operations of CT-V Bulk Utility:
Operation | Operation Type | Properties File | Token Vault | Sequential Token Generation |
---|---|---|---|---|
Tokenization | File-to-File | migration.properties | Required | Can be set during token vault creation using KeySecure Classic or utilities. |
Tokenization | File-to-File | masking.properties | Not required | Can be set through masking.properties file. |
Tokenization | Database-to-Database | masking.properties | Not required | Can be set through masking.properties file. |
Detokenization | File-to-File | detokenization.properties | Required | Can be set during token vault creation using KeySecure Classic or utilities. |
The user specifies, at the command prompt, the operation to be performed by the utility (either tokenization or detokenization) and the properties file to be used. For tokenization, the migration.properties or masking.properties file is used; for detokenization, the detokenization.properties file is used.
Note
Bulk masking using the masking.properties file is not supported for Informix database.
CT-V Bulk Utility tokenizes large quantities of data in any of the following forms:
Data tokenization from clear text.
Data tokenization from already encrypted text. If the input data is already in encrypted form, the utility can decrypt this data and then tokenize it.
Data tokenization for clear text in the database table.
Data detokenization from delimited type input data.
Data detokenization from positional type input data.
CT-V Bulk Utility uses a multi-threaded infrastructure to provide high-performance data transfer across a data pipeline. Users can modify certain parameters (Threads.BatchSize, Threads.CryptoThreads, Threads.TokenThreads and Threads.PollTimeout) in the migration.properties, masking.properties and detokenization.properties files to improve performance in different bulk tokenization and detokenization scenarios. The utility provides live performance monitoring data as well as results (totals and performance data) for completed migration tasks to inform and optimize performance.
The utility works with data in flat files for File-to-File type operations. You must correctly populate and format the input data file that supplies the data to be tokenized/detokenized. The utility includes a data file reader that reads large flat files and supplies the data to the file reader thread. The utility can also read data directly from the database for Database-to-Database type operations.
Note
At this time, the utility has no capacity to identify and skip individual data elements because of errors. Files with an error are rejected. Make sure all of the data adheres to the descriptions set up in the properties file.
The user sends the plain/tokenized data through an input data file and sets the parameters in the migration.properties, masking.properties or detokenization.properties file (and in the SfntDbp.properties and SafeNetToken.properties files, if required). The migration.properties, masking.properties and detokenization.properties files allow the user to control the tokenization and detokenization of data by setting various parameters, such as the format of the input data file, the location of the input data file, the token format, the number of records to be tokenized at a time, the location of the output file, and the sequence in which columns are displayed in the output file.
CT-V Bulk Utility tokenizes/detokenizes the data and saves it to the output file/destination database, as per the parameters set in the properties file. For tokenization/detokenization using the token vault, the output is also stored in the database.
Note
Multisite is not supported in CT-V Bulk Utility.
Supported Platforms
CT-V Bulk Utility is Java based, so it is expected to run on any platform with a supported Java runtime; it has been tested and works well on the following platforms:
Windows 2008 R2 Enterprise Server 64-bit
Windows 2012 Enterprise Server 32-bit
Linux (RedHat6)
Supported Databases
The following section lists the databases supported by CT-V Bulk Utility:
Oracle 11g
Oracle 12c
Oracle 18c
Oracle 19c
Oracle 21c
SQL Server 2008
SQL Server 2012
SQL Server 2014
SQL Server 2016
SQL Server 2017
SQL Server 2019
MySQL 5.6
MySQL 5.7
MySQL 8.0 (For MySQL 8.0, the Java runtime environment version must be 8 or above.)
Informix 12.10
Supported Data Types for Bulk Migration (Without Using Token Vault)
The following table shows the data types supported for DB-to-DB masking for MS SQL, MySQL and Oracle:
Data Type | MS SQL | MySQL | Oracle |
---|---|---|---|
CHAR | Yes | Yes | Yes |
VARCHAR | Yes | Yes | Yes |
NCHAR | Yes | Yes | Yes |
NVARCHAR | Yes | Yes | Yes |
INT | Yes | Yes | Yes |
SMALLINT | Yes | Yes | No |
TINYINT | Yes | Yes | No |
MEDIUMINT | No | Yes | No |
REAL | Yes | No | No |
BIGINT | Yes | Yes | No |
DECIMAL | Yes | Yes | Yes |
FLOAT | Yes | Yes | Yes |
NUMERIC | No | No | No |
DOUBLE | No | Yes | No |
DATE | Yes | Yes | Yes |
DATETIME | Yes | Yes | No |
TIMESTAMP | No | Yes | Yes |
Note
The user must create the schema for the destination database, and it should be the same as that of the source database.
For the sequential token format, if a batch of the input data contains a duplicate value, the sequence breaks and a value is skipped in the next batch. For example, the input file has the data 1, 2, 2, 3, 4, 5, 6, 7 and is run in batches of 4.
The sequential output is generated in the following manner: 11, 12, 12, 13, 15, 16, 17, 18. Because of the duplicate value 2, the input value 4 is tokenized as 15 instead of 14.
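The gap in this example can be reproduced with a simple model. The sketch below assumes that each batch reserves one sequence value per input row and that a duplicate within the batch reuses the token already assigned, leaving the reserved value unused. This reservation model is an assumption chosen because it matches the documented output; it is not a description of how the utility is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SequentialGapSketch {
    // Assigns sequential tokens batch by batch under the assumed reservation model.
    static List<Integer> tokenize(List<Integer> inputs, int batchSize, int firstToken) {
        Map<Integer, Integer> assigned = new HashMap<>(); // input value -> token
        List<Integer> output = new ArrayList<>();
        int batchStart = firstToken;
        for (int from = 0; from < inputs.size(); from += batchSize) {
            int to = Math.min(from + batchSize, inputs.size());
            int next = batchStart;
            for (int i = from; i < to; i++) {
                Integer value = inputs.get(i);
                Integer token = assigned.get(value);
                if (token == null) {
                    token = next++; // new value: take the next sequence number
                    assigned.put(value, token);
                }
                output.add(token); // duplicate: reuse the existing token
            }
            // Assumption: one sequence value is reserved per input row, so any
            // values left unused because of duplicates are skipped.
            batchStart += (to - from);
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 2, 3, 4, 5, 6, 7);
        System.out.println(tokenize(input, 4, 11));
    }
}
```

With the input 1, 2, 2, 3, 4, 5, 6, 7, a batch size of 4, and a first token of 11, this model skips 14 and yields the sequence given in the example.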
Four token formats (RANDOM_TOKEN, FIRST_SIX_LAST_FOUR_TOKEN, LAST_FOUR_TOKEN and LAST_SIX_TOKEN) are supported for Date data type.
For Oracle database, it is not recommended to use the sequential token format for Date data type, as the tokens will produce a change in the millisecond field of Date. The millisecond is not stored in Oracle for Date data type (directly) and the user will get the same values for the date columns.
If any data type apart from the ones mentioned in the preceding table is provided, then the data gets copied to the destination database table.