DDC Overview
This section provides an overview of the Thales CipherTrust Data Discovery and Classification (DDC) solution.
Workflow
The following diagram provides a high-level flow of the DDC solution.
Solution Architecture
The concepts used in this diagram are briefly discussed in this section and explained at length in the later sections of this document.
Architecture Components
The following table lists the architecture components and their roles:
Component | Description |
---|---|
CipherTrust Manager, DDC Server | At the heart of the Data Discovery and Classification solution is CipherTrust Manager, which runs the DDC Server. From here, users interact with the DDC GUI or use the DDC APIs to create classification profiles, add data stores, launch scans, and generate reports. |
TDP (On-prem): Hadoop, Spark, HDFS | TDP (On-prem) is configured to work with Hadoop data clusters. DDC uses Hadoop to generate reports from scans and to store their results (report data). DDC can query HDFS directly, but it requires Spark to interface with Hadoop's HBase. To install TDP (On-prem), see the Thales Data Platform Deployment Guide. |
TDPaaS | TDPaaS is a serverless, cloud-based service used for storing scan and report data. It is a SaaS component that offers an alternative to the Hadoop services of on-prem TDP. To configure DDC to use TDPaaS, see Configuring TDPaaS. |
DDC Agents | DDC Agents perform the actual scanning jobs and report the results back to the DDC Server for analysis and processing. DDC supports two types of Agent configurations: • Local Agents: Installed and configured directly on the machine that contains sensitive data. • Proxy Agents: Installed and configured on a proxy machine that is used to scan sensitive data on other machines. |
Data Stores | A data store is where the data actually resides. It can be a file server, a database, or a Hadoop cluster. For more information, see Discovering Sensitive Information. |
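
The relationship between data stores and the two Agent configurations in the table above can be sketched as a small model. This is an illustrative sketch only, not the DDC API: the names `AgentKind`, `Agent`, `DataStore`, and `pick_agent`, the host names, and the rule of preferring a Local Agent over a Proxy Agent are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class AgentKind(Enum):
    LOCAL = "local"  # installed on the machine that contains the sensitive data
    PROXY = "proxy"  # installed on a proxy machine, scans data on other machines


@dataclass(frozen=True)
class Agent:
    host: str  # hypothetical host name where the agent is installed
    kind: AgentKind


@dataclass(frozen=True)
class DataStore:
    name: str
    host: str        # hypothetical host name where the data resides
    store_type: str  # e.g. "file server", "database", "Hadoop cluster"


def pick_agent(store: DataStore, agents: List[Agent]) -> Optional[Agent]:
    """Return an agent able to scan the store, or None if there is none.

    Assumption for this sketch: a Local Agent on the store's own host is
    preferred; otherwise any Proxy Agent on a different host is used.
    """
    for agent in agents:
        if agent.kind is AgentKind.LOCAL and agent.host == store.host:
            return agent
    for agent in agents:
        if agent.kind is AgentKind.PROXY and agent.host != store.host:
            return agent
    return None


if __name__ == "__main__":
    agents = [Agent("proxy-01", AgentKind.PROXY), Agent("fs-01", AgentKind.LOCAL)]
    for store in (
        DataStore("shares", "fs-01", "file server"),
        DataStore("orders", "db-01", "database"),
    ):
        chosen = pick_agent(store, agents)
        print(store.name, "->", chosen.host if chosen else "no agent available")
```

In this model, a data store on a machine with its own agent is scanned locally, while a data store on a machine without an agent is scanned through the proxy machine. In an actual deployment, agents are assigned to data stores through the DDC GUI or DDC APIs.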