Appendix
Information types
Infotype Name | Category | Region |
---|---|---|
AES Key | Personal Data | Global |
American Express | Financial | Global |
Artifactory Token | Personal Data | Global |
Australian Bank Account Number | Financial | Oceania |
Australian Business Number | Financial | Oceania |
Australian Company Number | Financial | Oceania |
Australian Driver License Number | Personal Data | Oceania |
Australian Healthcare Identifier - Organisation | Medical | Oceania |
Australian Individual Healthcare Identifier | Medical | Oceania |
Australian Mailing Address | Personal Data | Oceania |
Australian Medicare Card | Medical | Oceania |
Australian Medicare Provider | Medical | Oceania |
Australian Passport Number | Personal Data | Oceania |
Australian Tax File Number | National ID | Oceania |
Australian Telephone Number | Personal Data | Oceania |
Austrian Driver License Number | Personal Data | Europe |
Austrian Mailing Address | Personal Data | Europe |
Austrian Passport Number | Personal Data | Europe |
Austrian Personalausweis | National ID | Europe |
Austrian SSN | National ID | Europe |
Austrian Telephone Number | Personal Data | Europe |
AWS Key ID | Personal Data | Global |
Azure Storage Key | Personal Data | Global |
Basic Auth Secret | Personal Data | Global |
Belgian Driver License Number | Personal Data | Europe |
Belgian eID | National ID | Europe |
Belgian National Number | National ID | Europe |
Belgian Passport Number | Personal Data | Europe |
Belgian Telephone Number | Personal Data | Europe |
Blood Type | Medical | Global |
Brazilian CPF | National ID | Americas |
Brazilian Registro Geral | National ID | Americas |
Bulgarian EGN | National ID | Europe |
Canadian Bank Account Number | Financial | Americas |
Canadian Health Service Number | Medical | Americas |
Canadian Mailing Address | Personal Data | Americas |
Canadian Passport Number | Personal Data | Americas |
Canadian Personal Health Identification Number (PHIN) | Medical | Americas |
Canadian Social Insurance Number | National ID | Americas |
Canadian Telephone Number | Personal Data | Americas |
Canadian Driver License Number | Personal Data | Americas |
Chilean RUN | National ID | Americas |
China Union Pay | Financial | Global |
Cloudant Credentials | Personal Data | Global |
Credentials password | Personal Data | Global |
Credentials username | Personal Data | Global |
Croatian OIB | National ID | Europe |
Cypriot Passport Number | Personal Data | Europe |
Czech Republic RC | National ID | Europe |
Danish CPR | National ID | Europe |
Danish Driver License Number | Personal Data | Europe |
Danish Passport Number | Personal Data | Europe |
Date Of Birth (under 18) | Personal Data | Global |
Date Of Birth | Personal Data | Global |
DB2 Credentials | Personal Data | Global |
DH Key | Personal Data | Global |
Diners Club | Financial | Global |
Discord Token | Personal Data | Global |
Discover | Financial | Global |
Drug Enforcement Agency Number | Medical | Americas |
DSA Public Key | Personal Data | Global |
Dutch Burgerservicenummer | National ID | Europe |
Dutch Driver License Number | Personal Data | Europe |
Dutch NIK | National ID | Europe |
Dutch Passport Number | Personal Data | Europe |
Dutch Telephone Number | Personal Data | Europe |
ECC Public Key | Personal Data | Global |
Email addresses | Personal Data | Global |
Ethnicity (English) | Personal Data | Global |
European EHIC | Medical | Europe |
Finnish HETU | National ID | Europe |
French Carte Vitale | National ID | Europe |
French CNI | National ID | Europe |
French Driver License Number | Personal Data | Europe |
French INSEE | National ID | Europe |
French Mailing Address | Personal Data | Europe |
French Passport Number | Personal Data | Europe |
French Telephone Number | Personal Data | Europe |
Gambian National Identification Number | National ID | Africa |
Gender (English) | Personal Data | Global |
Generic Bank Account Number | Financial | Global |
German Driver License Number | Personal Data | Europe |
German Mailing Address | Personal Data | Europe |
German Passport Number | Personal Data | Europe |
German Personalausweis | National ID | Europe |
German Telephone Number | Personal Data | Europe |
Github Token | Personal Data | Global |
Greek AFM | National ID | Europe |
Greek AMKA | National ID | Europe |
Greek Passport Number | Personal Data | Europe |
Hong Kong ID | National ID | Asia |
Hungarian Personal ID | National ID | Europe |
IBM Cloud IAM Key | Personal Data | Global |
IBM COS HMAC Credentials | Personal Data | Global |
Icelandic Kennitala | National ID | Europe |
Indian Aadhaar Number | National ID | Asia |
Indian Address | Personal Data | Asia |
Indian Bank Account Number | Financial | Asia |
Indian Driving License Number | Personal Data | Asia |
Indian Marital Status | Personal Data | Asia |
Indian MGNREGA Job Card ID | National ID | Asia |
Indian Name | Personal Data | Asia |
Indian PAN (Juridical) Number | National ID | Asia |
Indian Passport Number | Personal Data | Asia |
Indian Phone Number | Personal Data | Asia |
Indian Ration Card Number | National ID | Asia |
Indian Voter ID | National ID | Asia |
International Bank Account Number (IBAN) | Financial | Global |
IP Address | Personal Data | Global |
Iranian National Identification Number | National ID | Asia |
Irish Driver License Number | Personal Data | Europe |
Irish Passport Card Number | Personal Data | Europe |
Irish Passport Number | Personal Data | Europe |
Irish Personal Public Service Number | National ID | Europe |
Irish Telephone Number | Personal Data | Europe |
ISO8583 message with PAN | Financial | Global |
Israeli Bank Account Number | Financial | Asia |
Israeli Identity Number | National ID | Asia |
Italian CARTA D'IDENTITÀ | National ID | Europe |
Italian Codice Fiscale | National ID | Europe |
Italian Driver License Number | Personal Data | Europe |
Italian Mailing Address | Personal Data | Europe |
Italian Passport | Personal Data | Europe |
Italian Telephone Number | Personal Data | Europe |
Japanese Bank Account Number | Financial | Asia |
Japanese Driver License Number | Personal Data | Asia |
Japanese Passport Number | Personal Data | Asia |
Japanese Resident Registration Number | National ID | Asia |
Japanese Social Insurance Number (SIN) | National ID | Asia |
JCB | Financial | Global |
JSON Web Token | Personal Data | Global |
Laser | Financial | Global |
Latvian Personas Kods | National ID | Europe |
License Number | Personal Data | Global |
Login credentials | Personal Data | Global |
Luxembourg Driver License Number | Personal Data | Europe |
Luxembourg ID | National ID | Europe |
Luxembourg Passport Number | Personal Data | Europe |
Luxembourg Phone Number | Personal Data | Europe |
MAC Address | Personal Data | Global |
Macedonian UMCN | National ID | Europe |
Maestro | Financial | Global |
Mailchimp Access Key | Personal Data | Global |
Malaysian NRIC | National ID | Asia |
Maltese eID | National ID | Europe |
Mastercard | Financial | Global |
Medicare Beneficiary Identifier (MBI) | Medical | Americas |
Mexican CURP | National ID | Americas |
Mongo DB Credentials | Personal Data | Global |
MSSQL Database Credentials | Personal Data | Global |
MySQL Database Credentials | Personal Data | Global |
New Zealand Inland Revenue Number | National ID | Oceania |
New Zealand Mailing Address | Personal Data | Oceania |
New Zealand Passport Number | Personal Data | Oceania |
New Zealand Telephone Number | Personal Data | Oceania |
Norwegian Birth Number | National ID | Europe |
Norwegian Driver License Number | Personal Data | Europe |
Norwegian Passport Number | Personal Data | Europe |
NPM token | Personal Data | Global |
Oracle Database Credentials | Personal Data | Global |
Passport Number | Personal Data | Global |
Peoples Republic of China ID | National ID | Asia |
Personal Names (Austrian) | Personal Data | Europe |
Personal Names (Belgian) | Personal Data | Europe |
Personal Names (English) | Personal Data | Global |
Personal Names (French) | Personal Data | Europe |
Personal Names (German) | Personal Data | Europe |
Personal Names (Italian) | Personal Data | Europe |
Personal Names (Netherlands) | Personal Data | Europe |
Personal Names (Polish) | Personal Data | Europe |
Personal Names (Portuguese) | Personal Data | Europe |
PGP Public Key | Personal Data | Global |
Polish Driver License Number | Personal Data | Europe |
Polish Identity Card | National ID | Europe |
Polish Mailing Address | Personal Data | Europe |
Polish Passport Number | Personal Data | Europe |
Polish PESEL | National ID | Europe |
Polish Telephone Number | Personal Data | Europe |
Portuguese Citizen's Card | National ID | Europe |
Portuguese Driver License Number | Personal Data | Europe |
Portuguese Fiscal Number | National ID | Europe |
Portuguese Identity Number | National ID | Europe |
Portuguese Mailing Address | Personal Data | Europe |
Portuguese Passport Number | Personal Data | Europe |
Portuguese Phone Number | Personal Data | Europe |
PostgreSQL Database Credentials | Personal Data | Global |
Private Key | Personal Data | Global |
Private Label Card | Financial | Global |
Profanity (English) | Personal Data | Global |
Redis Credentials | Personal Data | Global |
Religion (English) | Personal Data | Global |
Romanian Identity Card | National ID | Europe |
Romanian Numerical Personal Code | National ID | Europe |
RSA Public Key | Personal Data | Global |
Saudi Arabia National ID | National ID | Asia |
SendGrid API Key | Personal Data | Global |
Serbian UMCN | National ID | Europe |
Singaporean Mailing Address | Personal Data | Asia |
Singaporean NRIC | National ID | Asia |
Singaporean Passport Number | Personal Data | Asia |
Singaporean Telephone Number | Personal Data | Asia |
Singaporean Bank Account Number | Personal Data | Asia |
Singaporean Driver License Number | Personal Data | Asia |
Slack Token | Personal Data | Global |
Slovakian RC | National ID | Europe |
Slovenian EMSO | National ID | Europe |
SoftLayer Credentials | Personal Data | Global |
South African Identity Number | National ID | Africa |
South Korean Corporation Registration Number (법인등록번호) | Financial | Asia |
South Korean Driver License Number | Personal Data | Asia |
South Korean Foreigner Number | National ID | Asia |
South Korean Gwangju Bank (광주은행) Account Number | Financial | Asia |
South Korean Jeju Bank (제주은행) Account Number | Financial | Asia |
South Korean Jeonbuk Bank (전북은행) Account Number | Financial | Asia |
South Korean KB Bank (국민은행) Account Number | Financial | Asia |
South Korean KEB Hana Bank (KEB하나은행) Account Number | Financial | Asia |
South Korean NH Bank (농협은행) Account Number | Financial | Asia |
South Korean Passport | Personal Data | Asia |
South Korean Phone Number | Personal Data | Asia |
South Korean RRN | National ID | Asia |
South Korean Shinhan Bank (신한은행) Account Number | Financial | Asia |
South Korean Taxpayer Identification Number (사업자등록번호) | Financial | Asia |
Spanish DNI | National ID | Europe |
Spanish Driver License Number | Personal Data | Europe |
Spanish NIE | National ID | Europe |
Spanish Passport Number | Personal Data | Europe |
Spanish Social Security Number | National ID | Europe |
Spanish Telephone Number | Personal Data | Europe |
Square Oauth Secret | Personal Data | Global |
Sri Lankan National Identity Card | National ID | Asia |
SSH Private Key | Personal Data | Global |
SSH Public Key | Personal Data | Global |
Stripe Access Key | Personal Data | Global |
Swedish Driver License Number | Personal Data | Europe |
Swedish Nationellt ID-kort | National ID | Europe |
Swedish Passport Number | Personal Data | Europe |
Swedish Personnummer | National ID | Europe |
SWIFT Code | Financial | Global |
Swiss Social Security Number | National ID | Europe |
Taiwanese ID | National ID | Asia |
TDES Key | Personal Data | Global |
Thai Population Identification Code | National ID | Asia |
Troy | Financial | Global |
Turkish Identification Number | National ID | Europe |
Turkish Tax Identification Number | National ID | Europe |
Turkish Telephone Number | Personal Data | Europe |
Twilio API Key | Personal Data | Global |
United Arab Emirates ID | National ID | Asia |
United Kingdom Community Health Index | Medical | Europe |
United Kingdom Driver License Number | Personal Data | Europe |
United Kingdom Electoral Roll Number | Personal Data | Europe |
United Kingdom Health and Care Number | Medical | Europe |
United Kingdom Mailing Address | Personal Data | Europe |
United Kingdom National Health Service Number | Medical | Europe |
United Kingdom NI Number | National ID | Europe |
United Kingdom Passport Number | Personal Data | Europe |
United Kingdom Self Assessment UTR Number | National ID | Europe |
United Kingdom Telephone Number | Personal Data | Europe |
United Kingdom VAT Number | Financial | Europe |
United States Bank Account Number | Financial | Americas |
United States Driver License Number | Personal Data | Americas |
United States Health Insurance Claim Number | Medical | Americas |
United States Health Plan Identifier | Medical | Americas |
United States Individual Taxpayer Identification Number (ITIN) | National ID | Americas |
United States Mailing Address | Personal Data | Americas |
United States National Provider Identifier | Medical | Americas |
United States Passport Card Number | Personal Data | Americas |
United States Passport Number | Personal Data | Americas |
United States Routing Transit Number | Financial | Americas |
United States Social Security Number | National ID | Americas |
United States Telephone Number | Personal Data | Americas |
Visa | Financial | Global |
Yugoslavia UMCN | National ID | Europe |
DDC ML information types
Infotype Name | Category | Region |
---|---|---|
AWS Access Key ID (ML) | Cryptographical | Global |
AWS Secret Access Key (ML) | Cryptographical | Global |
Mailing Address (ML) | Personal Data | Global |
Email addresses (ML) | Personal Data | Global |
GPS (ML) | Personal Data | Global |
Nationality (ML) | Personal Data | Global |
Organization (ML) | Personal Data | Global |
Personal Names (ML) | Personal Data | Global |
Political Party (ML) | Personal Data | Global |
Religion (ML) | Personal Data | Global |
RSA-2048 Private Key (ML) | Cryptographical | Global |
Sexual Orientation (ML) | Personal Data | Global |
URL (ML) | Personal Data | Global |
Category and Subcategories Classification
Documents are classified into the following categories and subcategories. Categories and subcategories are independent of each other, so any subcategory can be associated with any category.
Category
Finance
Healthcare
Legal
HR
Subcategory
RecordsManagement
CertificateOfEmployment
IndemnityAgreement
SalaryStructure
BusinessLicense
OfferLetter
Diversity
OrganizationChart
DressCode
HIPAA
DoNotResuscitateForm
PerformanceImprovementPlan
ArticlesOfOrganization
TimeOff
BusinessBalanceSheet
Immunization
OperativeReport
DataPrivacy
ConflictOfInterest
BusinessInsurance
Will
PerformanceAppraisal
BusinessAccountPayable
Telecommuting
MissionVisionAndStrategy
MissedAppointmentPolicy
MeetingMinutes
Recruitment
ShareHolderAgreement
LiabilityWaiver
BusinessPayroll
Harassment
EmploymentContract
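Because categories and subcategories are independent, a document's classification can be thought of as a pair drawn from the two lists above. The following Python sketch illustrates that idea only. The `Classification` type and the `is_valid` helper are hypothetical names and are not part of DDC, and only a small sample of the subcategories is included for brevity.

```python
from dataclasses import dataclass

# The four categories listed above.
CATEGORIES = {"Finance", "Healthcare", "Legal", "HR"}

# A small sample of the subcategories listed above (not the full list).
SUBCATEGORIES = {"OfferLetter", "HIPAA", "MeetingMinutes", "BusinessPayroll"}


@dataclass(frozen=True)
class Classification:
    """Hypothetical container for a (category, subcategory) pair."""
    category: str
    subcategory: str


def is_valid(c: Classification) -> bool:
    # Each half is checked on its own: any subcategory may be
    # combined with any category.
    return c.category in CATEGORIES and c.subcategory in SUBCATEGORIES
```

Under this model, a pairing such as `Classification("Finance", "MeetingMinutes")` is as valid as `Classification("Healthcare", "HIPAA")`.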
Supported formats for standard agent
Files
Type | Format |
---|---|
Compressed | bzip2, Gzip (all types), TAR, Zip (all types) |
Databases | Access, DBase, SQLite, MSSQL MDF & LDF |
Images | BMP, FAX, GIF, JPG, PDF (embedded), PNG, TIF |
Microsoft Backup Archive | Microsoft Binary / BKF |
Microsoft Office | v5, 6, 95, 97, 2000, XP, 2003 onwards |
Open Source | Star Office / Open Office / Libre Office |
Open Standards | PDF, RTF, HTML, XML, CSV, TXT |
Office files
WORD
Legacy: Legacy filename extensions denote binary Microsoft Word formats that became outdated with the release of Microsoft Office 2007. Although the latest version of Microsoft Word can still open them, they are no longer developed. Legacy filename extensions include:
.doc – Legacy Word document; Microsoft Office refers to them as "Microsoft Word 97 – 2003 Document"
.dot – Legacy Word templates; officially designated "Microsoft Word 97 – 2003 Template"
.wbk – Legacy Word document backup; referred to as "Microsoft Word Backup Document"
OOXML: The Office Open XML (OOXML) format was introduced with Microsoft Office 2007 and has been the default format of Microsoft Word ever since. Related file extensions include:
.docx – Word document
.docm – Word macro-enabled document; same as docx, but may contain macros and scripts
.dotx – Word template
.dotm – Word macro-enabled template; same as dotx, but may contain macros and scripts
.docb – Word binary document introduced in Microsoft Office 2007
EXCEL
Legacy: Legacy filename extensions denote binary Microsoft Excel formats that became outdated with the release of Microsoft Office 2007. Although the latest version of Microsoft Excel can still open them, they are no longer developed. Legacy filename extensions include:
.xls – Legacy Excel worksheets; officially designated "Microsoft Excel 97-2003 Worksheet"
.xlt – Legacy Excel templates; officially designated "Microsoft Excel 97-2003 Template"
.xlm – Legacy Excel macro
OOXML: The Office Open XML (OOXML) format was introduced with Microsoft Office 2007 and has been the default format of Microsoft Excel ever since. Excel-related file extensions of this format include:
.xlsx – Excel workbook
.xlsm – Excel macro-enabled workbook; same as xlsx but may contain macros and scripts
.xltx – Excel template
.xltm – Excel macro-enabled template; same as xltx but may contain macros and scripts
POWERPOINT
Legacy:
.ppt – Legacy PowerPoint presentation
.pot – Legacy PowerPoint template
.pps – Legacy PowerPoint slideshow
OOXML:
.pptx – PowerPoint presentation
.pptm – PowerPoint macro-enabled presentation
.potx – PowerPoint template
.potm – PowerPoint macro-enabled template
.ppam – PowerPoint add-in
.ppsx – PowerPoint slideshow
.ppsm – PowerPoint macro-enabled slideshow
.sldx – PowerPoint slide
.sldm – PowerPoint macro-enabled slide
ACCESS
Legacy:
.ade – Protected Access Data Project (not supported in 2013)
.adp – Access Data Project (not supported in 2013)
.mdb – Access Database (2003 and earlier)
.cdb – Access Database (Pocket Access for Windows CE)
.mda – Access Database, used for add-ins (Access 2, 95, 97), previously used for workgroups (Access 2)
.mdt – Access Add-in Data (2003 and earlier)
.mdf – Access (SQL Server) detached database (2000)
.mde – Protected Access Database, with compiled VBA and macros (2003 and earlier)
.ldb – Access lock files (associated with .mdb)
Available formats since Access 2007:
.accdb – The file extension for the new Office Access 2007 file format. This takes the place of the MDB file extension
.accde – The file extension for Office Access 2007 files that are in "execute only" mode. ACCDE files have all Visual Basic for Applications (VBA) source code hidden. A user of an ACCDE file can only execute VBA code, but not view or modify it. ACCDE takes the place of the MDE file extension
.accdt – The file extension for Access Database Templates
.accdr – The file extension that opens a database in runtime mode. By changing a database's file extension from .accdb to .accdr, you can create a "locked-down" version of your Office Access database. You can change the file extension back to .accdb to restore full functionality
OUTLOOK
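.pst – Outlook data file (Personal Storage Table)
.ost – Outlook offline data file (Offline Storage Table)
.msg – Outlook message file
.dbx – Outlook Express mail folder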
OTHER
.pub – a Microsoft Publisher publication
.xps – an XML-based document format used for printing (on Windows Vista and later) and preserving documents
Databases
Microsoft SQL
Oracle
IBM DB2
PostgreSQL
SAP HANA
MySQL
MongoDB
Big Data
Hadoop
Teradata
Binary Large Objects
Database | Object Type |
---|---|
Oracle | BLOB, CLOB |
Microsoft SQL Server | VARBINARY, Filestream |
PostgreSQL | bytea, text, Large Objects (oid) |
MySQL | BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB |
IBM DB2 | BLOB, CLOB, DBCLOB |
Teradata | BLOB |
MongoDB | GridFS |
SAP HANA | BLOB, NCLOB |
Supported file formats for DDC ML
Unstructured
DOCX
PDF
RTF
TXT
ZIP (containing files in the supported formats)
Semi-structured
HTML
DOCX (Key-value pairs)
PDF (Key-value pairs)
RTF (Key-value pairs)
TXT (Key-value pairs)
ZIP (Key-value pairs)
Unsupported file formats for DDC ML
All structured file formats including CSV, XLSX, and database files (RDBMS).
Semi-structured file formats such as JSON, XML, and YAML.
Unstructured file formats such as PPT, PPTX, HEIC, and MPS.
All image, audio, and video file formats.
All formats not included in the supported list.
Files with bad or inappropriate context.
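As a rough illustration of the supported and unsupported lists above, the following Python sketch pre-filters files by extension before they are considered for a DDC ML scan. It is an assumption-laden sketch, not DDC behavior: the `is_ddc_ml_candidate` helper is a hypothetical name, the check looks at the file extension only (`.html` is assumed to be the only HTML extension), and it does not inspect the contents of ZIP archives or assess file context.

```python
from pathlib import Path

# Extensions derived from the supported lists above (assumed mapping
# from format name to a single lowercase extension).
DDC_ML_SUPPORTED = {".docx", ".pdf", ".rtf", ".txt", ".zip", ".html"}


def is_ddc_ml_candidate(filename: str) -> bool:
    """Return True if the file extension appears in the supported list.

    This is an extension check only; a ZIP archive still needs its
    members checked, and the file content is not examined.
    """
    return Path(filename).suffix.lower() in DDC_ML_SUPPORTED
```

For example, `is_ddc_ml_candidate("contract.PDF")` is true, while `is_ddc_ml_candidate("export.csv")` and `is_ddc_ml_candidate("slides.pptx")` are false.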
Configuration backup
You can back up and restore the DDC configuration by using the Backup/Restore functionality available in CipherTrust Manager. Such a backup will include the following elements:
Data Stores
Locations
Classification Profiles
Infotypes
Report definitions
This backup will not include the information about the scan executions.
Creating/Restoring the configuration backup
To create or restore a backup of your DDC configuration:
Log in to CipherTrust Manager.
Click the Admin Settings link on the dashboard.
Select Backups from the sidebar on the left. This will display the Backups screen.
To create a backup of your DDC configuration, click the Create Backup button.
To restore your DDC configuration from a backup, click the Upload Backup button.
For more details, refer to the CipherTrust Manager documentation.
Configuration backup limitations
The configuration backup references the DDC Active Node. Restoring the backup to a different CipherTrust Manager cluster leaves DDC referencing an invalid node, and therefore without any valid active node.
The configuration backup contains the definitions of the DDC resources (such as the Scan or Data Store definitions). If, after a scan has completed, you restore from a backup that does not contain a certain resource (for example, a Custom Classification Profile) or resource version, the TDP scan execution data points to an invalid resource identifier.
A report that points to the missing resource may display incomplete data (for example, it may be unable to display the resource name) or may fail to generate.
Creating/Restoring backup of scan executions
To back up or restore your Data Discovery and Classification scan execution data, you need to access the DDC data stored in Hadoop. For details, refer to the Thales Data Platform Hadoop backup section in the Thales Data Platform Administrator Guide.
Mounting an NFS share
To mount an NFS share on a Proxy agent, run this command as root:
sudo mount <nfs-server-hostname|nfs-server-ipaddress>:</target/directory/share-name>
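The command above shows only the remote share. The standard `mount` command generally also expects a local mount point unless one is already defined for the share in `/etc/fstab`. The following Python sketch assembles a complete command line without running it. The `build_nfs_mount_command` helper, the host name `nfs.example.com`, and both paths are hypothetical values for illustration and do not come from the DDC documentation.

```python
import shlex
from typing import List


def build_nfs_mount_command(server: str, export_path: str, mount_point: str) -> List[str]:
    """Build (but do not execute) a mount command for an NFS share.

    server      -- NFS server host name or IP address
    export_path -- absolute path of the share on the server
    mount_point -- existing local directory to mount onto (assumption:
                   the source command does not show this argument)
    """
    if not export_path.startswith("/"):
        raise ValueError("export_path must be an absolute path")
    if not mount_point.startswith("/"):
        raise ValueError("mount_point must be an absolute path")
    return ["sudo", "mount", "-t", "nfs", f"{server}:{export_path}", mount_point]


# Hypothetical example values.
command = build_nfs_mount_command("nfs.example.com", "/exports/share1", "/mnt/share1")
command_line = shlex.join(command)
```

Printing `command_line` gives a string that can be reviewed before it is run on the Proxy agent host.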
Issues encountered while browsing target paths
You may encounter the following issues while navigating target paths using the Browse button.
Datastore | Scenario | Issue |
---|---|---|
AWS | Path in the Add Target Path field contains an invalid folder name within a valid bucket name. Example: <valid_bucket>/<invalid_folder> | No error toast message shows up. File browser displays "No paths to display". |
Gmail | Path in the Add Target Path field contains an invalid folder name within a valid user account. Example: <valid_useraccount>/<invalid_folder> | No error toast message shows up. File browser displays "No paths to display". |
Gmail, Google Drive | Add Target Path field is empty or contains a path to a valid folder | User accounts are listed correctly, but the targets within them are not displayed. |
NFS | Add Target Path field contains an invalid path | No error toast message shows up. File browser displays "No paths to display". |
SharePoint Online, SharePoint Server | Add Target Path field contains one of the following: an invalid folder in a valid site collection; an invalid file within a valid site collection; an invalid file within a valid folder in a site collection. Examples: <valid_site_collection>/:site/:file/<valid_folder>/<invalid_file>, <valid_site_collection>/:site/:file/<invalid_folder>, <valid_site_collection>/:site/:file/<invalid_file> | No error toast message shows up. File browser displays "No paths to display". |
SharePoint Online, SharePoint Server | Add Target Path field contains an incorrect site collection | Arbitrary folders and files may be displayed without any error. Selecting these results as targets leads to a failed scan. |
Exchange Online | Add Target Path field is empty | System default groups are listed. If you select one or more default groups, the scan executes successfully but without any matches. System default groups cannot be scanned. Targets must be valid email addresses or email groups. |
API request quota limit
Datastore | Quota Limit (per minute request count) |
---|---|
Gmail | 15000 |
SharePoint Online | 600 |
Exchange Online | 800 |
OneDrive | 1000 |
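Because the quotas above are per-minute request counts, they set a lower bound on how long a given number of API requests can take. The following Python sketch shows that arithmetic. It assumes the quota is the only bottleneck and that requests are spread evenly; the `min_minutes_for_requests` helper is a hypothetical name, and the number of requests a real scan issues is not stated in this document.

```python
import math

# Per-minute request quotas from the table above.
QUOTA_PER_MINUTE = {
    "Gmail": 15000,
    "SharePoint Online": 600,
    "Exchange Online": 800,
    "OneDrive": 1000,
}


def min_minutes_for_requests(datastore: str, request_count: int) -> int:
    """Lower bound, in whole minutes, to issue request_count API requests."""
    if request_count < 0:
        raise ValueError("request_count must be non-negative")
    quota = QUOTA_PER_MINUTE[datastore]
    return math.ceil(request_count / quota)
```

For example, 6,000 requests against SharePoint Online cannot complete in fewer than 10 minutes under this model, whereas the same number of Gmail requests fits within a single minute.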
Using wildcard characters
The asterisk (*) and the question mark (?) are the two most common wildcard characters. You can use them to traverse locations without providing absolute paths.
In DDC, you can use these characters in scan filters to scan selected locations instead of recursively scanning the entire directory structure of a data store.
Wildcard | Meaning | Examples |
---|---|---|
* | Matches zero or more characters | D:/A/* will traverse all locations starting with D:/A; D:/A/*/F/E will traverse all locations starting with D:/A and ending with /F/E; */A/E will traverse all locations ending with /A/E; *A/E* will traverse all locations that contain A/E; D:/A/*/B/C/* will traverse all locations that start with D:/A/ and contain /B/C/ later in the path. |
? | Matches exactly one character | D:/A??? will traverse all locations starting with D:/A followed by any three characters. |
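The semantics in the table above can be approximated by translating a wildcard pattern into a regular expression. The following Python sketch is an illustration under stated assumptions, not DDC's actual matching engine: it assumes that a pattern must match the whole location, that `*` may span `/` separators, and that matching is case-sensitive. The function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a pattern using * and ? into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # zero or more characters, including "/"
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern: str, location: str) -> bool:
    # Assumption: the pattern has to cover the entire location string.
    return wildcard_to_regex(pattern).fullmatch(location) is not None
```

With these assumptions, `wildcard_match("D:/A/*/F/E", "D:/A/X/Y/F/E")` and `wildcard_match("D:/A???", "D:/Abcd")` are true, while `wildcard_match("D:/A???", "D:/Abc")` is false because only two characters follow `D:/A`.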