Cloud Data Stores
DDC supports the following types of cloud storage as data stores:
AWS S3 - Amazon Simple Storage Service (S3) is the object storage service of AWS (Amazon Web Services), an on-demand cloud computing platform and API.
Azure Blobs - Microsoft Azure Blobs (used to store unstructured text and binary data).
Azure Table - lets programs store structured text in partitioned collections of entities that are accessed by partition key and primary key.
Office 365 Sharepoint Online - Sharepoint Online is a document management and storage system delivered as part of Microsoft Online Services suite.
Office 365 Exchange Online - Exchange Online is Exchange Server delivered as a cloud service hosted by Microsoft.
Office 365 OneDrive for Business - OneDrive for Business is a managed cloud storage for business users that replaces SharePoint Workspace.
G-Suite (G-Mail and G-Drive)
Salesforce - Marketing Cloud offers marketing automation and analytics software for email, mobile, social and online marketing.
Note
Before adding any Cloud data store, make sure that you have the required user credentials handy.
Adding Cloud Data Stores
Use the Add Data Store wizard to add a Cloud data store. Adding a Cloud data store involves the following steps:
1. Select Store Type
In the Select Store Type screen of the wizard, select Cloud in the Select Data Store Category.
From the Select Database Type drop-down list select:
AWS S3
Azure Blobs
Azure Table
Office 365: Sharepoint Online
Office 365: Exchange Online
Office 365: OneDrive for Business
G-Suite
Salesforce
Click Next to go on to the Configure Connection screen.
2. Configure Connection
In the Configure Connection screen of the wizard, provide the following configuration details for your data store:
AWS S3
Provide the user security credentials, which consist of an Access Key ID and a Secret Access Key.
Access Key ID: Enter the Access Key ID that you obtained from your storage account administrator. For example:
AKIAABCDEFGHIEXAMPLE
Secret Access Key: Enter the Secret Access Key as obtained from your storage account administrator. For example:
aBcDeFGHiJKLM/A1NOPQR/wxYzdcbAEXAMPLEKEYd
Select the Show Secret Access Key checkbox if you want to view the secret access key.
To set up an Amazon S3 Bucket, folder, or file as a target for a scan, use the following format:
Whole Bucket - <BucketName>
Specific folder in a Bucket - <BucketName/folder>
Specific file in a Bucket - <BucketName[/folder]/file.txt>
Note
Each Amazon S3 Bucket included in a scan consumes one Amazon S3 Bucket license. Make sure to use credentials that have access to all Amazon S3 Buckets that are selected for a scan to avoid consuming licenses for inaccessible Buckets.
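The target format and the licensing note above can be illustrated with a minimal Python sketch. It is not part of DDC; the names parse_s3_target and buckets_in_scan are hypothetical, and the sketch assumes that the first path segment of a target is always the bucket name.

```python
from typing import Iterable, NamedTuple, Optional, Set


class S3Target(NamedTuple):
    bucket: str
    path: Optional[str]  # folder or file inside the bucket; None means the whole bucket


def parse_s3_target(target: str) -> S3Target:
    """Split '<BucketName>[/folder][/file]' into a bucket name and an optional path."""
    cleaned = target.strip().strip("/")
    if not cleaned:
        raise ValueError("a target must start with a bucket name")
    bucket, _, path = cleaned.partition("/")
    return S3Target(bucket=bucket, path=path or None)


def buckets_in_scan(targets: Iterable[str]) -> Set[str]:
    """Distinct buckets referenced by the targets (one Bucket license each, per the note)."""
    return {parse_s3_target(t).bucket for t in targets}
```

For example, buckets_in_scan(["finance", "finance/reports", "hr/2023/list.txt"]) contains two bucket names, so a scan of those three targets would involve two Buckets.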
For full details of the configuration of the AWS S3 data store, refer to AWS S3.
AZURE BLOBS
In the Configure Connection step, provide the following information:
Account Name: The name of your Azure Storage account.
User: The name of your Azure Storage account.
Active Access Key: Enter key1 or key2, which is your primary or secondary Azure account access key. If you do not know what they are, follow the steps in Obtaining the Azure Account Access Keys.
Tip
You should ask your Azure Storage account administrator which access key is currently active, since only one access key can be active at a time.
AZURE TABLE
Account Name: Enter your Azure account name.
User: Enter your Azure Storage account name.
Password: Your Azure password.
OFFICE 365: SHAREPOINT ONLINE
Domain: Enter your SharePoint Online organization name. For example, if you access SharePoint Online at https://mycompany.sharepoint.com, enter mycompany.
Client ID - Enter the Client ID for the registered SharePoint Add-in. Example:
1234abcd-56ef-78gh-90ij-1234clientid
You can generate the Client ID and Client Secret Key when you register the SharePoint Add-in. You need to note down these values after you generate the SharePoint Add-in.
Client Secret Key - Enter the Client Secret key for the registered SharePoint Add-in. Example:
abcdefghij0123456789klmnopqrst0clientsecret
Tenant ID - Enter the Tenant ID key for the registered SharePoint Add-in. Example:
12345678-abcd-9012-efgh-ijkltenantid
You can obtain the Tenant ID from the tenant administration site (for example, https://mycompany-admin.sharepoint.com/_layouts/15/appinv.aspx) when you use your administrator account to grant permissions to the registered SharePoint Add-in. The Tenant ID is part of the App Identifier value, which has the following format:
i:0i.t|ms.sp.ext|<client ID>@<tenant ID>
For example, in the App Identifier i:0i.t|ms.sp.ext|1234abcd-56ef-78gh-90ij-1234clientid@12345678-abcd-9012-efgh-ijkltenantid, the Tenant ID is 12345678-abcd-9012-efgh-ijkltenantid.
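As an illustration of the App Identifier format described above, the following Python sketch extracts the Tenant ID from an App Identifier string. The function name tenant_id_from_app_identifier is hypothetical, and the sketch assumes the identifier always uses the i:0i.t|ms.sp.ext| prefix shown in this section.

```python
APP_IDENTIFIER_PREFIX = "i:0i.t|ms.sp.ext|"


def tenant_id_from_app_identifier(app_identifier: str) -> str:
    """Return the part after '@' in 'i:0i.t|ms.sp.ext|<client ID>@<tenant ID>'."""
    value = app_identifier.strip()
    if not value.startswith(APP_IDENTIFIER_PREFIX):
        raise ValueError("unexpected App Identifier prefix")
    client_id, separator, tenant_id = value[len(APP_IDENTIFIER_PREFIX):].partition("@")
    if not separator or not client_id or not tenant_id:
        raise ValueError("expected '<client ID>@<tenant ID>' after the prefix")
    return tenant_id
```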
For more information on these configuration parameters, refer to the Sharepoint Online documentation.
OFFICE 365: EXCHANGE ONLINE
Exchange Online Domain: Enter a domain to scan mailboxes that reside on that domain. This is usually the domain component of the email address, or the Windows Domain.
Note
Ensure that the domain name is correct. DDC can't scan any files if the specified domain name doesn't exist.
Client ID: Enter your Exchange Online client ID (application ID).
Client Secret Key: Enter your Exchange Online client secret key. Select the Show Client Secret Key check-box to view the key.
Tenant ID: Enter your Office 365: Exchange Online tenant ID. Your Microsoft 365 tenant ID is a globally unique identifier (GUID) that is different than your organization name or domain.
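Because the tenant ID is a GUID rather than a domain name, a quick format check can catch a domain entered by mistake. The following Python sketch is only an illustration; the function name is_guid is hypothetical, and the check accepts only the canonical hyphenated form. The placeholder IDs used as examples in this document are not real GUIDs and would not pass it.

```python
import uuid


def is_guid(value: str) -> bool:
    """True if value is a GUID in canonical 8-4-4-4-12 hexadecimal form."""
    candidate = value.strip()
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID also accepts braces and unhyphenated input; require the canonical form.
    return str(parsed) == candidate.lower()
```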
For full details of the configuration of the Office365: Exchange Online data store, refer to Office365: Exchange Online.
OFFICE 365: ONEDRIVE FOR BUSINESS
OneDrive for Business Domain - Enter the Microsoft 365 domain. Example:
example.onmicrosoft.com
Warning
An Office365: OneDrive for Business data store will get created successfully even with a wrong domain. This is a known issue.
Client ID - Enter the Client ID. Example:
clientid-1234-5678-abcd-6d05bf28c2bf
You generate the Client ID and Tenant ID in the Azure app registration portal. After you register your application you can view the Client ID and Tenant ID Key. You need to note down these values.
Client Secret Key - Enter the Client Secret key. Example:
client~secret.key-CHvV1B5YQfr~6zDjEyv
It is the Client Secret that you set in the Azure app registration portal in the Certificates & Secrets page. Make sure that you save your Client Secret key in a secure location as you will not be able to retrieve it later.
Tenant ID - Enter the Tenant ID. Example:
tenantid-1234-abcd-5678-02011df316f4
For more information on these configuration parameters, refer to the OneDrive for Business Online documentation.
G-SUITE
Domain: Enter the G-Suite domain that you want to scan. For example, if your G-Suite administrator email is admin@example.com, your G-Suite domain is example.com.
Admin User: The G-Suite administrator account email address. Use the same administrator account used to Enable APIs and Set up Domain-Wide Delegation.
Service Account: Your Service account ID, for example, ddc-service-account@vertical-tuner-322508.iam.gserviceaccount.com.
P12 Key: Upload the P12 key associated with your Service account ID.
For full details of the configuration of the G-Suite data store, refer to G-Suite.
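The relationship between the administrator email and the Domain field can be sketched in Python as follows. This is an illustrative helper only; the name gsuite_domain_from_admin is hypothetical, and the sketch assumes the domain is simply the part of the address after the last @ character.

```python
def gsuite_domain_from_admin(admin_email: str) -> str:
    """Return the domain part of a G-Suite administrator email address."""
    local_part, separator, domain = admin_email.strip().rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError("expected an address of the form user@domain")
    return domain.lower()
```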
SALESFORCE
Account Name: Enter the Salesforce account. Use the syntax that matches the type of your Salesforce site:
Production
Syntax: <email_address>
Example: admin@example.com
Sandbox
Syntax: sandbox:<email_address>
Example: sandbox:admin@example.com
Consumer Key: Enter the Consumer Key obtained in Creating Connected App. For example:
9tzQREbH3MVG_SvurP17mjK2py_jS6lfqit1_ss50PkRmNIZnd7yM92zOBnU3IQPvSyu5PQIV2dsqyQiw0T5
Select the Show Consumer Key check-box to view the key.
Private Key: Use the Browse Private Key button to upload the private key file obtained from Generating Certificate and Private Key. For example, ddc-salesforce.key.
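The Production and Sandbox syntax for the Account Name field can be summarized with a short Python sketch. The helper name salesforce_account_name is hypothetical and is not part of DDC; it only formats the string according to the two syntaxes listed above.

```python
def salesforce_account_name(email_address: str, sandbox: bool = False) -> str:
    """Format the Account Name value: '<email_address>' or 'sandbox:<email_address>'."""
    email = email_address.strip()
    if "@" not in email:
        raise ValueError("expected an email address")
    return "sandbox:" + email if sandbox else email
```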
The Agent Selection section allows you to specify the minimum and maximum number of proxy agents when adding a datastore. Employing a group of agents instead of a single agent to run the scan should improve the scan execution time.
Note
The multiple agent functionality is not supported for the Office 365 Sharepoint Online datastore.
In the Select Number of Agents menu set the number of agents for the datastore:
Minimum: - Set the minimum number of agents to use to scan the datastore. At least that number of proxy agents must be able to connect to the datastore.
Maximum: - Set the maximum number of agents to use to scan the datastore.
Warning
• As there is no limit on the minimum and maximum number of agents that you can set, exercise caution so that you do not impact system performance by using too many resources for a single scan.
• You will not be able to add a datastore if the minimum number of agents cannot be assigned.
• A scan will fail if the assigned agent is unavailable after adding the datastore.
• The minimum number of agents must be less than or equal to the maximum number of agents.
In the Add Label: field, enter a new agent label or remove an existing label. Agent labels represent the agent capabilities.
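The agent count rules from the warning above can be expressed as a small validation sketch in Python. It is an illustration under stated assumptions, not DDC's own logic: the function name agent_selection_problems is hypothetical, and connectable_agents stands for the number of proxy agents that can currently connect to the datastore.

```python
from typing import List


def agent_selection_problems(minimum: int, maximum: int, connectable_agents: int) -> List[str]:
    """Return the reasons why the agent selection would be rejected (empty if none)."""
    problems = []
    if minimum > maximum:
        problems.append("minimum must be less than or equal to maximum")
    if connectable_agents < minimum:
        problems.append("fewer agents can connect than the configured minimum")
    return problems
```

An empty list means that both rules are satisfied. The sketch does not cover the case where an assigned agent becomes unavailable after the datastore is added.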
Click Next to move on to the General Info step of the wizard.
3. General Info
In the General Info screen of the wizard, specify the name, description, branch location, and sensitivity level for your data store. See "Configuring a Data Store - General Information" for details.
Configure the General Info part per the information in General Info.
Click Next to go to the Add Tags & Access Control screen.
4. Add Tags & Access Control
In the Add Tags & Access Control screen of the wizard, grant access rights to your data store and add metadata. See "Configuring a Data Store – Tags and Access Control" for details.
Configure the Tags & Access Control part per the information in Tags & Access Control.
Click Save. The newly created data store appears on the Data Stores page. By default, data stores are displayed in alphabetic order by name. Depending on the number of entries per page, you might need to navigate to other pages to view the newly created data store.
At any time during the configuration you can click Back to go to any of the previous wizard screens to update the configuration.
Recommended Least Privilege User Approach:
Note
To reduce the risk of data loss or privileged account abuse, the Target credentials provided for the intended Target should only be granted read-only access to the exact resources and data that require scanning. Never grant full user access privileges or unrestricted data access to any application if it is not required.
Obtaining the Azure Account Access Keys
If you need to find out what your Azure account access keys are:
Log into your Azure account.
Navigate to All resources > [Storage account].
Click Access keys under Settings.
Note down the key1 (primary) and key2 (secondary).
The primary and secondary access keys are used to make rolling key changes. Only one access key can be active at a time. Ask your Azure Storage account administrator which access key is currently active, and use that key to connect DDC to your Azure Storage account.
Configuring a Salesforce Account
To be able to use Salesforce Targets as a data store you will need to generate a certificate and a private key and create a connected app.
Note
The instructions provided in this section are specific to the Salesforce Lightning interface. If you are using Salesforce Classic, you will notice a different interface, in which case you should refer to the Salesforce Classic interface documentation to complete the prerequisites.
Generating Certificate and Private Key
To generate the digital certificate and private key:
Using the Terminal or Windows Command Prompt, install the OpenSSL package and run the following command:
# Syntax: openssl req -x509 -sha256 -nodes -newkey rsa:2048 -days <number of days> -keyout <*.key private key file> -out <*.crt certificate file>
openssl req -x509 -sha256 -nodes -newkey rsa:2048 -days 365 -keyout ddc-salesforce.key -out ddc-salesforce.crt
where:
days (optional) - Number of days to certify the certificate for. The default is 30 days.
keyout - Output filename to write the private key to. For example, ddc-salesforce.key.
out - Output filename to write the digital certificate to. For example, ddc-salesforce.crt.
When prompted by openssl, provide the following information:
Country Name (2 letter code) [AU]: Your country's two letter country code (ISO 3166-1 alpha-2).
State or Province Name (full name) [Some-State]: State or province name.
Locality Name (e.g., city) []: City name or name of region.
Organization Name (e.g., company) [Internet Widgits Pty Ltd]: Name of organization.
Organizational Unit Name (e.g., section) []: Name of organizational department.
Common Name (e.g. server FQDN or YOUR name) []: Fully qualified domain name of the Master Server.
Email Address []: Email address of organization's contact person.
The openssl command generates the digital certificate (for example, ddc-salesforce.crt) required to create a connected app for DDC, and the private key (for example, ddc-salesforce.key) required to set up and scan a Salesforce data store.
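If you prefer to script this step, the following Python sketch assembles the same openssl command as an argument list. The function name openssl_req_command and the subject values are hypothetical. The optional subject uses the OpenSSL -subj option, which supplies the certificate subject on the command line instead of through the interactive prompts; the sketch assumes the subject values contain no / characters.

```python
import shlex
from typing import Dict, List, Optional


def openssl_req_command(key_file: str, cert_file: str, days: int = 365,
                        subject: Optional[Dict[str, str]] = None) -> List[str]:
    """Build the 'openssl req' argument list for a self-signed certificate and key."""
    command = ["openssl", "req", "-x509", "-sha256", "-nodes",
               "-newkey", "rsa:2048", "-days", str(days),
               "-keyout", key_file, "-out", cert_file]
    if subject:
        # Example result: /C=US/O=Example Corp/CN=ddc.example.com
        command += ["-subj", "".join("/{}={}".format(k, v) for k, v in subject.items())]
    return command


if __name__ == "__main__":
    args = openssl_req_command(
        "ddc-salesforce.key", "ddc-salesforce.crt",
        subject={"C": "US", "O": "Example Corp", "CN": "ddc.example.com"},
    )
    # Print a shell-safe command line; pass args to subprocess.run(args, check=True) to execute it.
    print(" ".join(shlex.quote(a) for a in args))
```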
Creating Connected App
As the administrator, log in to your organization's Salesforce site and go to Setup.
In the Setup > Home tab, enter "App Manager" in the Quick Find box, and select App Manager.
In the Lightning Experience App Manager page, click on New Connected App.
In the Basic Information section, fill in the following fields:
Connected App Name - Enter a descriptive display name for DDC. For example, Data_Discovery_and_Classification.
API Name - Enter a unique identifier to use when referring to the app programmatically. For example, DDC.
Contact Email - Enter an email address that Salesforce can use if they need to contact you about the connected app.
In the API (Enable OAuth Settings) section, select the Enable OAuth Settings checkbox.
In the Callback URL field, enter the URL to redirect to after successful authorization of the connected app. For example, https://example.com/callback-ddc.
Note
The Callback URL is mandatory when setting up a connected app, but is not required for scanning Salesforce Targets as a data store.
Select the Use digital signatures checkbox and click Choose File to upload a digital certificate. For example, ddc-salesforce.crt.
Under Select OAuth Scopes, select and add the following permissions for the "DDC" connected app:
Access the identity URL service (id, profile, email, address, phone)
Manage user data via APIs (api)
Perform requests at any time (refresh_token, offline_access)
Required for probing, scanning and remediating Salesforce Targets.
Click Save and Continue.
In the Manage Connected Apps page, go to API (Enable OAuth Settings) > Consumer Key and click Copy.
The consumer key will be required when you Set Up and Scan a Salesforce Target.
Tip
A consumer key is generated automatically when you create a connected app in Salesforce. Make sure that you store it when you set it, as it is unique and once set, you will not be able to edit or overwrite it.
Click Manage > Edit Policies.
Under OAuth Policies > Permitted Users, select Admin approved users are pre-authorized.
Click Save.
Back in the App Manager page, go to the Profiles section and click Manage Profiles.
In the Application Profile Assignment page, select the profile(s) (e.g. "System Administrator") that you want to allow to access the "DDC" connected app.
Note
The Salesforce account that is specified when you Set Up and Scan a Salesforce Target must be assigned to at least one of the profiles that has:
• Access to the DDC connected app (e.g. "Data_Discovery_and_Classification"), and
• Minimum "Read" permissions for the Salesforce Objects to be scanned.
See Salesforce Help - Object Permissions for more information.
Click Save.
In the Setup > Home tab, enter "Profiles" in the Quick Find box, and select Profiles.
Go to the profile(s) selected in Step 15 (e.g. "System Administrator") and click Edit.
In the Administrative Permissions section, select the following checkboxes:
- API Enabled
- Query All Files
Note
Enabling the Query All Files permission is an optional step that allows the Salesforce account that is specified when you set up and scan a Salesforce data source to scan all files in your organization's Salesforce site, including those owned / managed by other user accounts.
Without the Query All Files permission, DDC will only be able to scan the files that are owned by / shared to the specified Salesforce account.
For more information about Salesforce behavior when "Query All Files" is enabled, refer to https://developer.salesforce.com/docs/atlas.en-us.234.0.object_reference.meta/object_reference/sforce_api_objects_contentversion.htm.
Click Save.
Configuring the SharePoint Add-In
For the SharePoint Add-In to work, you need to set the DisableCustomAppAuthentication setting to false for the tenant. In PowerShell, execute the following commands:
Install-Module -Name Microsoft.Online.SharePoint.PowerShell
$adminUPN="<full email address of a SharePoint administrator account>"
Example: $adminUPN="example@democompany.onmicrosoft.com"
$orgName="<name of your Office 365 organization>"
Example: $orgName="democompany"
$userCredential = Get-Credential -UserName $adminUPN -Message "Type the password."
The -Message value is only the text shown in the credential prompt. Enter the password in the prompt itself, not in the command.
Connect-SPOService -Url https://$orgName-admin.sharepoint.com -Credential $userCredential
Example: Connect-SPOService -Url https://democompany-admin.sharepoint.com -Credential $userCredential
Note
The last command may throw the following error:
Connect-SPOService : Identity Client Runtime Library (IDCRL) did not get a response from the Login server.
At line:1 char:1
+ Connect-SPOService -Url https://trial8349-admin.sharepoint.com -Crede ...
+ CategoryInfo : NotSpecified: ([Connect-SPOService], IdcrlException
+ FullyQualifiedErrorId : Microsoft.SharePoint.Client.IdcrlException,Microsoft.Online.SharePoint.PowerShell.ConnectSPOService
This happens because multi-factor authentication is enabled. In this case, follow these steps:
a. Remove the -Credential $userCredential part from the command and execute it again.
b. Complete the Office 365 authentication when prompted.
After the connection is established, run the following command:
Set-SPOTenant -DisableCustomAppAuthentication $false