Solution Architecture
This section describes the main components of the Thales CipherTrust Data Discovery and Classification (DDC) solution. The concepts shown in the architecture diagram are introduced briefly here and explained in detail in later sections of this document.
CipherTrust, DDC Server
At the heart of the Data Discovery and Classification solution is CipherTrust Manager, on which the DDC Server runs. From here, users work with the DDC GUI or the DDC APIs to create classification profiles, add data stores, launch scans, and generate reports.
REST APIs, GUI
The interfaces used to interact with DDC.
Hadoop, Spark, HDFS
DDC uses Hadoop to generate reports from scans and to store the scan results (report data). DDC can query HDFS directly, but it requires Spark to interface with Hadoop's HBase.
DDC Agents
DDC Agents perform the actual scanning jobs and report the results back to the DDC Server for analysis and processing. DDC supports two types of Agent configuration. Local Agents are installed and configured directly on the machine that contains the sensitive data. Proxy Agents are installed and configured on a proxy machine that is used to scan sensitive data on other machines.
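The difference between the two Agent configurations can be illustrated with a minimal Python sketch. It is not part of the DDC product or its API: the names AgentMode and agent_mode_for, and the host names used, are hypothetical and chosen only for illustration. The sketch assumes that machines are identified by host name and compared case-insensitively.

```python
from enum import Enum


class AgentMode(Enum):
    """Hypothetical labels for the two Agent configurations described above."""
    LOCAL = "local"   # Agent is installed on the machine that holds the data
    PROXY = "proxy"   # Agent is installed on a separate machine and scans remotely


def agent_mode_for(agent_host: str, data_host: str) -> AgentMode:
    """Return the configuration implied by where the Agent and the data reside.

    Assumption: hosts are identified by name and compared case-insensitively.
    """
    if agent_host.strip().lower() == data_host.strip().lower():
        return AgentMode.LOCAL
    return AgentMode.PROXY


if __name__ == "__main__":
    # Agent installed on the file server itself
    print(agent_mode_for("fileserver01", "fileserver01").value)
    # Agent installed on a separate machine that scans the file server
    print(agent_mode_for("scanproxy01", "fileserver01").value)
```

In this model, one proxy machine could be paired with many data hosts, which matches the description of a Proxy Agent scanning data on other machines.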
Data Store
A data store is where the data actually resides. It can be a file server, a database, or a Hadoop cluster. For more information, see Managing Data Stores.
Local storage
A type of data store: a file system (Windows or Linux) located on the same machine as the Agent that scans it.
NFS, CIFS
A type of data store: a network share (Windows or Linux) that resides on a different machine from the one where the Agent that scans it is installed.
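The two file-based data store types can be told apart by where the files reside relative to the scanning Agent. The following Python sketch illustrates that rule under stated assumptions. FileStore, store_kind, and the share_protocol field are hypothetical names that do not come from DDC, and database and Hadoop data stores are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileStore:
    """Hypothetical description of a file-based data store."""
    host: str                             # machine where the files reside
    share_protocol: Optional[str] = None  # "NFS" or "CIFS" for a network share


def store_kind(store: FileStore, agent_host: str) -> str:
    """Classify a file-based store relative to the Agent that scans it."""
    same_machine = store.host.lower() == agent_host.lower()
    if same_machine and store.share_protocol is None:
        # Files sit on the machine where the Agent is installed
        return "local storage"
    if not same_machine and store.share_protocol in ("NFS", "CIFS"):
        # Files sit on another machine and are reached over a share
        return f"network share ({store.share_protocol})"
    # Any other combination is outside the scope of this sketch
    raise ValueError("combination not covered by this sketch")


if __name__ == "__main__":
    print(store_kind(FileStore("fileserver01"), "fileserver01"))
    print(store_kind(FileStore("nas01", "NFS"), "scanproxy01"))
```

The sketch raises an error for combinations the definitions above do not describe, such as a share located on the Agent's own machine, rather than guessing a category.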