Reports
You manage reports through the Reports page, which is accessed by clicking the Reports link in the Data Discovery sidebar on the left.
From the Reports page you can:
View all existing reports. See Viewing Reports.
View versions of a report. See Viewing Versions of a Report.
Create a new report. See Creating Reports.
Generate a report. See Generating Reports.
View details of a selected report. See Report Details.
Remove a report. See Removing Reports.
Export the data objects associated with a report. See Exporting Report's Data Objects.
Export the inaccessible data objects associated with a report. See Exporting Report's Inaccessible Data Objects.
Viewing Reports
The Reports page lists the available reports. Initially, the page shows no reports. Newly configured reports appear on this page as they are created. The page also shows the total number of available reports.
By default, reports are listed in ascending alphabetic order of their names. The list view of the Reports page shows the following details:
Item | Description |
---|---|
Report Name | Name of the report. |
Status | Status of the report. The status could be Not Generated, In Progress, Completed, or Failed. |
Type | Type of the report (Aggregated or Trend). |
Last Run | Time when the report was run. |
Duration | Duration of the report run. |
Re-generation | Whether to automatically trigger report regeneration on subsequent runs of the linked scans. Select to enable automatic report regeneration, clear to disable. |
[cogwheel] | Clicking on the cogwheel button in the header of the last column allows you to select which columns to display. |
The overflow icon in each row allows you to select View, Generate, Export D.O., or Remove for the report.
Use the Search text box to filter reports. Search results display reports that contain specified text in their names.
Reports can be sorted by their name (Name), and the last time that the scan was run (Last Run).
Viewing Versions of a Report
DDC supports regeneration of reports. As a result, multiple versions of reports can be generated. The details view of the Reports page shows available versions of a report.
To view details of a report version:
Click the overflow icon corresponding to the desired report.
Click View in the contextual menu that is displayed. The details of the report are displayed.
At the top of the details view, click the expand icon to expand the versions list. The available versions of the report are displayed.
Click a version link to view its details.
The details view shows the details of the selected report version. To switch to a different version, click its link in the details view under the expand icon.
Report Types
There are two different ways of generating reports:
A report that is based on a scan run on a specific date. Such a report shows the scan findings on the selected date. In the case of multiple executions of the scan for the selected day, the report will include the information for the latest execution on that day.
A report that always shows the information found on the latest scan execution. This way, the results will reflect (or update) the changes in the sensitive data discovered if the underlying data in the Data Store or the Classification Profile is modified. Still, the report will actually get generated when the user chooses to generate it.
Note
To see the results of the latest scan execution, you must generate the report again manually.
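The selection rule described in Report Types can be sketched as follows. This is a minimal illustration, not DDC code: the function name `pick_execution` and the use of plain `datetime` values to represent scan executions are assumptions made for the example.

```python
from datetime import date, datetime
from typing import Optional


def pick_execution(
    executions: list[datetime], selected: Optional[date] = None
) -> Optional[datetime]:
    """Return the scan execution a report would be based on.

    selected=None models a report left on "Latest Execution".
    A concrete date models a report tied to a specific day; if the
    scan ran several times that day, the latest run on that day wins.
    """
    if selected is not None:
        executions = [e for e in executions if e.date() == selected]
    return max(executions, default=None)


# Hypothetical execution times for one scan.
runs = [
    datetime(2024, 3, 1, 9, 0),
    datetime(2024, 3, 1, 17, 30),
    datetime(2024, 3, 2, 8, 15),
]

print(pick_execution(runs, date(2024, 3, 1)))  # latest run on 1 March
print(pick_execution(runs))                    # latest run overall
```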
Creating Reports
To create a report you aggregate data from multiple sources. When a report is generated it contains the results of executed scans.
To create a report, use the New Report wizard described in the following sections. To launch the wizard, click the + Add Report button on the right side of the Reports page.
General Info
Provide the following information in the General Info screen of the New Report wizard:
A unique Name for the report. The name must be between 3 and 64 characters long. This field is mandatory.
An optional Description for the report (up to 250 characters).
Type of the report, Aggregated or Trend.
(Applicable to Aggregated reports) Whether to trigger report regeneration automatically for each new scan added as the latest execution.
Select Enable Report Re-generation to trigger automatic report regeneration. The report will be regenerated automatically whenever any scan linked with the report is executed subsequently (that is, when Latest Execution time is changed).
Click Next to go on to the Select Scan Executions step of the wizard.
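The field constraints listed for the General Info screen can be expressed as a small check. This is only a sketch of the documented limits; the function name `validate_general_info` and the error wording are hypothetical, and uniqueness of the name is not checked because that requires knowledge of existing reports.

```python
def validate_general_info(
    name: str, description: str = "", report_type: str = "Aggregated"
) -> list[str]:
    """Return a list of problems; an empty list means the input is acceptable."""
    errors = []
    # Name: longer than two characters, up to 64 characters.
    if not 3 <= len(name) <= 64:
        errors.append("Name must be 3 to 64 characters long.")
    # Description: optional, up to 250 characters.
    if len(description) > 250:
        errors.append("Description must be 250 characters or fewer.")
    # Type: one of the two documented report types.
    if report_type not in ("Aggregated", "Trend"):
        errors.append("Type must be Aggregated or Trend.")
    return errors


print(validate_general_info("Q1 audit", "Quarterly findings", "Trend"))
print(validate_general_info("Q1"))
```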
Select Scan Executions
The Select Scan Executions screen of the wizard shows available scans with their number and the number of selected scans. Content identified in this step will be merged in a single report. This is called an Aggregated report.
Tip
To include the removed scans in the list leave the Show Removed Scans check box selected.
Use the Search text box to filter available scans. Search results display scans that contain specified text in their names.
Select the Scan Name check boxes corresponding to desired scans.
When you are done, switch on the Selected Only toggle to list only the scans that you selected.
You can create two different types of reports:
A report that is based on a scan run on a specific date. For such a report, click "Latest Execution" and select the date of the scan that you wish to use.
For Aggregated reports that have Enable Report Re-generation selected:
If a previous scan run date is selected for the report, the report will not be automatically regenerated.
If the current date is selected, the report will be automatically regenerated for subsequent scan runs on the current day. However, after the date changes, any new scan runs won't trigger report regeneration.
A report that can change if the underlying data store or protection profile is modified and a scan is run again. For such a report leave "Scan Execution" as "Latest Execution".
See Report Types for more information on these two types of reports.
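The automatic regeneration rules above can be summarized in a short decision function. This is a sketch under stated assumptions: `triggers_regeneration` is a hypothetical name, and `selected_date=None` is used here to stand for a report left on "Latest Execution".

```python
from datetime import date
from typing import Optional


def triggers_regeneration(
    report_type: str,
    regeneration_enabled: bool,
    selected_date: Optional[date],
    scan_run_date: date,
) -> bool:
    """Decide whether a new run of a linked scan regenerates the report."""
    # Automatic regeneration applies only to Aggregated reports
    # with Enable Report Re-generation selected.
    if report_type != "Aggregated" or not regeneration_enabled:
        return False
    # "Latest Execution": every subsequent run of a linked scan counts.
    if selected_date is None:
        return True
    # A specific date: only runs on that same day count. A previously
    # selected date never matches a new run, and a date selected "today"
    # stops matching once the date changes.
    return selected_date == scan_run_date


today = date(2024, 3, 2)
print(triggers_regeneration("Aggregated", True, None, today))
print(triggers_regeneration("Aggregated", True, date(2024, 3, 1), today))
```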
Click Save and a message will appear stating that the report has been created successfully.
To generate the report immediately after creating it, select the Generate Now check box before you click Save.
Generating Reports
After you have configured a report, you can generate it at any time and as many times as needed.
To generate a report:
In the Reports page, search for the report that you want to generate.
Tip
Use the Search text box to filter reports. Search results display reports that contain specified text in their names.
By default, reports are listed in ascending alphabetic order of their names.
Reports can be sorted by their name (Name), and the last time that the scan was run (Last Run).
Click the overflow icon corresponding to the desired report. A shortcut menu appears.
Click Generate.
As soon as the report starts to run, its status becomes Pending. The status of the report changes in the sequence: Not Generated > In Progress > Completed / Failed.
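The status sequence can be modeled as a small state machine, which may help when reasoning about what a given status implies. The names below are illustrative only. The transitions from Completed or Failed back to In Progress are an assumption, based on the statement that a configured report can be generated any number of times.

```python
from enum import Enum


class ReportStatus(Enum):
    NOT_GENERATED = "Not Generated"
    IN_PROGRESS = "In Progress"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Documented sequence: Not Generated > In Progress > Completed / Failed.
ALLOWED = {
    ReportStatus.NOT_GENERATED: {ReportStatus.IN_PROGRESS},
    ReportStatus.IN_PROGRESS: {ReportStatus.COMPLETED, ReportStatus.FAILED},
    # Assumption: generating the report again restarts the run.
    ReportStatus.COMPLETED: {ReportStatus.IN_PROGRESS},
    ReportStatus.FAILED: {ReportStatus.IN_PROGRESS},
}


def can_transition(current: ReportStatus, new: ReportStatus) -> bool:
    """Return True if the report status may change from current to new."""
    return new in ALLOWED[current]


print(can_transition(ReportStatus.NOT_GENERATED, ReportStatus.IN_PROGRESS))
print(can_transition(ReportStatus.NOT_GENERATED, ReportStatus.COMPLETED))
```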
Note
Permissions to access the data stores accessed by the scans included in a scan-based report are checked every time the report is run. If the current user no longer has the correct permission for any of them, an error is displayed.
Report Details
The report details page displays information about the report, such as the report name and the number of scans, data stores, data objects, and unscanned or partially scanned data objects. The page also shows the total data objects scanned, sensitive data objects found, sensitive data matches, and selected infotypes found.
The upper part of the report details page displays general information about the report, such as the name of the report and the number of scans included in the report.
The Print Preview button in the top right corner of the screen allows you to print the report. For more information, see Printing the Report Details.
The findings of the scan (or scans) are reported on these four cards:
TOTAL DATA OBJECTS SCANNED - All data objects that were included in the scan.
Note
When scanning Exchange Online and Exchange Server notes data objects, DDC counts notes and their associated attachments as separate data objects.
The TOTAL DATA OBJECTS SCANNED count is equal to the number of notes plus the number of attachments linked with each note in the corresponding target path. For example, if you scan a target path containing 4 notes, one of which includes an image attachment, the TOTAL DATA OBJECTS SCANNED count will be 5.
Due to a known limitation, in reports for MongoDB, Azure Table, Salesforce, and G-Mail data stores, you will see "N/A" for the TOTAL DATA OBJECTS SCANNED values.
For G-Mail, DDC ignores copies of the emails that were received in the same second, from the same sender, and with the same subject.
In case of large binary objects (BLOB), TOTAL DATA OBJECTS SCANNED reports the parent and child data objects in a given blob.
SENSITIVE DATA OBJECTS FOUND - All data objects that contain sensitive data that were identified by the scan.
SENSITIVE DATA MATCHES - All sensitive pieces of information that were found inside the sensitive data objects.
SELECTED INFOTYPES FOUND - The infotypes found during the scan out of those configured for the scan when it was created (shown as "<found> out of <configured>").
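The counting rule for Exchange notes described under TOTAL DATA OBJECTS SCANNED amounts to simple arithmetic. The sketch below reproduces the example of 4 notes with one attachment; the function name and the list-of-counts input are assumptions made for illustration.

```python
def total_data_objects(attachments_per_note: list[int]) -> int:
    """Count each note and each of its attachments as a separate data object."""
    notes = len(attachments_per_note)
    attachments = sum(attachments_per_note)
    return notes + attachments


# Four notes, one of which has a single image attachment.
print(total_data_objects([0, 0, 0, 1]))  # 4 notes + 1 attachment = 5
```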
The lower part of the report details screen displays the report information in the following three tabs.
Scans Tab
This tab provides the list of scans that contributed information to the report. Additionally, you can see the report data in graphical format, containing the following information:
Infotypes Discovered
Sensitive Data Objects by Content
Sensitive Infotype Distribution
Sensitive Data Objects
Data Stores Tab
This tab provides the list of data stores included in the report, with the information about their risk score, sensitivity level, scan name, last scan time, targets, infotypes, data objects scanned, and sensitive objects found in each data store that was scanned.
Data Objects Tab
This tab provides the list of data objects scanned. Only the top 1000 data objects are searched and sorted by Risk score (the first 25 of these are displayed but you can view more by clicking "Show more"). Use the Search box to search for a specific data object.
The data object search has the following characteristics:
The search is case-sensitive.
Partial search is conducted. For example, searching for "bal" will include "baseball" and "balloon" in the search results.
Search is performed for the Object Name and Path
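The search behavior listed above (case-sensitive, partial match, applied to Object Name and Path) can be illustrated with a short filter. This is a sketch of the described behavior, not the product's implementation; the record keys `name` and `path` and the sample values are hypothetical.

```python
def search_data_objects(objects: list[dict], text: str) -> list[dict]:
    """Keep objects whose name or path contains text (case-sensitive substring)."""
    return [o for o in objects if text in o["name"] or text in o["path"]]


# Hypothetical data objects.
objects = [
    {"name": "baseball.txt", "path": "/share/sports"},
    {"name": "balloon.csv", "path": "/share/party"},
    {"name": "Balance.xlsx", "path": "/share/finance"},
]

for match in search_data_objects(objects, "bal"):
    print(match["name"])
```

With this sample data, "Balance.xlsx" is not returned for "bal" because the match is case-sensitive.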
Note
The first search takes some time to return results (usually between 1 and 3 minutes). During this time you can navigate through the different tabs inside the report; however, if you leave the Reports space, you must repeat the search. The results of a successfully completed search are cached, so repeating the same search takes only a few seconds. Generating the report after the search is completed invalidates the cache, because the cached information becomes outdated.
The table in the report details lists the following findings distributed among columns:
Column Name | Description |
---|---|
Data Object Name | The name of the data object scanned and listed in the report details. For Oracle and IBM DB2 the result will be displayed in uppercase. |
Risk | The number of risks found by the report in the given scanned data object. A risk is the presence of a sensitive item of data. |
Type | The type of the scanned data object listed in the report details, such as "File" or "Folder". |
Path | The path to the object that is listed in the report details. |
Data Store | The name of the data store where the object listed in the report details was found. |
Infotypes | The number of information types found in the data object that is listed in the report. |
Note
Details of a data object scanned partially due to any issue are displayed in the Inaccessible Data Objects tab.
A scanned ZIP file is displayed under the Data Objects tab. However, this tab doesn't indicate whether all the files included in the ZIP file are scanned successfully. To confirm this, review the Inaccessible Data Objects tab.
Inaccessible Data Objects Tab
This tab is visible for Aggregated reports. This tab shows the list of inaccessible, skipped, or partially scanned data objects.
The table in the report detail lists the following findings distributed among columns:
Column Name | Description |
---|---|
Data Object Name | The name of the data object scanned and listed in the report details. |
Data Store | The name of the data store where the object listed in the report details was found. |
Path | The path to the object that is listed in the report details. |
Severity | The severity of why the listed data object is inaccessible, skipped, or partially scanned. The severity could be Intervention (least severe), Notice, Error, Critical (most severe). |
Reason | The reason why the listed data object is inaccessible. |
Date | The date and time when the report is generated. |
Note
Due to some known limitations, you might see the same data objects listed in both the Sensitive and Inaccessible tabs of Aggregated reports.
Examples:
- A partially scanned file (possibly, due to insufficient buffer memory).
- A table with partially scanned rows (due to some password protected content).
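The Severity column uses an ordered scale, from Intervention (least severe) to Critical (most severe). When working with a list of inaccessible data objects outside the GUI, that order can be captured with an `IntEnum` so that rows sort correctly. The row structure and the `severity` key below are assumptions made for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Documented severities, ordered from least to most severe."""
    INTERVENTION = 1
    NOTICE = 2
    ERROR = 3
    CRITICAL = 4


# Hypothetical rows from the Inaccessible Data Objects tab.
rows = [
    {"name": "archive.zip", "severity": "Notice"},
    {"name": "orders", "severity": "Critical"},
    {"name": "scan.pdf", "severity": "Intervention"},
    {"name": "mail.pst", "severity": "Error"},
]

# Most severe first.
ordered = sorted(rows, key=lambda r: Severity[r["severity"].upper()], reverse=True)
for row in ordered:
    print(row["severity"], row["name"])
```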
Additional Metadata Fields in Reports
The following table shows the complete list of metadata fields that can be displayed in reports.
Note
The number of fields displayed on the GUI depends on the level of details selected in the scan configuration, data store, and file types.
For MongoDB and Azure Table, no metadata is displayed in Aggregated reports.
Key | Description |
---|---|
Catalog | Name of the database or catalog. |
Classification Status | MIP classification status for the matched object. |
Client Modified | Client modification timestamp. |
Date | Date for the resource. |
Date Modified | Date when the resource is modified. |
Document Created | Date when the document is created. |
Document Creator | Name of the document creator. |
Document Modified | Date when the document is last modified. |
Document Modifier | User who last modified the document. |
Encoding | Whether the match is found in an EBCDIC-encoded resource. |
File Created | Date when the file is created. |
File Modified | Date when the file is last modified. |
File Owner | Owner of the file. |
Filename | File name for the resource. |
Folder | Folder name for the resource. |
Instance | Name of the database instance. |
Key Columns | Name of the column(s) used as a key in the database scan. If multiple columns are used as the key, the column names will be comma-separated. |
Key Source | Source of the key used in the "Key" and "Column:Key" metadata (for example, whether the key is obtained from a primary key column or a unique key column). Possible values: Primary Key, Integer Unique Column, String Unique Column, Blob*, Integer Non Unique Column, String Non Unique Column. *Blob is the default "Key Source" metadata value for matches detected in BLOB objects, as no key column information is stored for BLOB objects. |
MIP Label Description | Description for the MIP classification label applied to the matched object. |
MIP Label Name | Name of the MIP classification label applied to the matched object. |
MIP Label UID | ID of the MIP classification label applied to the matched object. |
MIP Label Sensitivity | Sensitivity level of the MIP classification label applied to the matched object. |
Object Created | Date when the Google Cloud Storage object is created. |
Object Modified | Date when the Google Cloud Storage object is last modified. |
Permission Execute | List of groups, users, or user classes that have execute permissions to the matched object. |
Permission Full | List of groups, users, or user classes that have full permissions to the matched object. |
Permission Modify | List of groups, users, or user classes that have modify permissions to the matched object. |
Permission Read | List of groups, users, or user classes that have read permissions to the matched object. |
Permission Special | List of groups, users, or user classes that have special permissions to the matched object. |
Permission Write | List of groups, users, or user classes that have write permissions to the matched object. |
Processed Rows | Number of rows that were scanned for the table in a database scan. |
Schema | Name of the database schema. |
Server Modified | Date and time of the last modification by the server. Applicable only to the Dropbox Business Target types. |
Table | Name of the database table in a database. |
Track 1 | Whether a Track 1 data type was detected. |
Track 2 | Whether a Track 2 data type was detected. |
To return to the Reports page, click the All Reports link at the top of the report details page (above the report name).
Information About Scan Filters
You can view the information about any filters that might have been applied in scans. This is achieved in the Scans tab by clicking the arrow icon on the left of the report name to expand it. The section that appears will display the information about the number and types of filters applied. For example, you can expect to see information like this:
1 Scan Filter: Exclude locations greater than file size: 14000 MB
This message indicates that one filter was applied to the scan: "Exclude locations greater than file size", with a threshold of 14000 MB.
Printing the Report Details
Click the Print Preview button in the top right corner of the screen and then Print. The report will be saved in PDF format to the location that you selected. To return to the report, click the < Exit Print View link in the top left corner of the screen.
Note
For the best experience of exporting reports to PDF use Chrome or Firefox.
Although A4 and portrait settings are supported, it is recommended to use A3 and landscape settings as print settings to avoid printing distorted charts.
Risk Score
The risk score reflects the level of risk to the business that would result from the exposure of the sensitive data objects (that is, sensitive information) found by a scan. The lower the risk score, the lower the risk. The risk score depends on the type of sensitive data objects found and the number of such objects. For example, the risk score for a single email address found by a scan is 10. For a document containing thousands of email addresses, the reported risk score is many times higher.
Note
Only a complete removal/deletion of a sensitive data object would reduce the risk score to zero.
Removing Reports
You can remove a report from the Reports page. Because reports have no dependencies (that is, they do not affect other resources), removing a report has no effect on other resources.
Note
Only users with one of the following roles can remove reports: Admin, DDC Admin, DDC Report Admin, or DDC Full Report Admin.
To remove a report follow these steps:
Click the overflow icon corresponding to the desired report.
In the shortcut menu that is displayed, select the Remove option.
A warning message "Remove Report? Are you sure you want to remove this report? This cannot be undone." is displayed.
Confirm the report removal by clicking the Remove button in the warning message dialogue box. To cancel the report removal, click the Cancel button.
Note
After deleting a report, you can create another report with the same name as the one that you deleted.
Reports are not deleted in HDFS, which means that if you have the URL of the removed report, with the report ID, you can still view the report after you removed it.
Exporting Report's Data Objects
You can export all the data objects of a report in newline-delimited JSON (NDJSON) format. You can then view the exported NDJSON file in any editor that supports this format.
There are two ways of exporting those data objects:
Directly from the Reports page, using the Export D.O. option in the report's contextual menu.
From the Data Objects tab of the Report Details page, using the Export Data Objects button.
To export the data objects associated with a report from the Reports page:
Click the overflow icon corresponding to the desired report.
Click Export D.O. in the contextual menu that is displayed.
Choose the target location for the exported file.
To export data objects associated with a report from its Report Details page:
Click the overflow icon corresponding to the desired report.
Click View in the contextual menu that is displayed.
On the report details page, click the Data Objects tab.
Click Export Data Objects to export the data objects.
Choose the target location for the exported file.
Tip
Please check the ELK Reference to see how to use the exported data.
Note
When you export the data objects of a database data store of a report, the exported NDJSON file also contains sensitive columns that were extracted from the scan results. The list of sensitive columns gathered in the exported file could be partial. The number of columns in the list depends on the scan configuration and the number of sensitive data matches found.
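Because NDJSON stores one JSON document per line, an exported file can be processed line by line with the Python standard library. The sketch below makes no assumption about which fields DDC writes; the sample records, the `name` key, and the file name in the usage note are hypothetical.

```python
import json
from typing import Iterable, Iterator


def parse_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical content standing in for an exported file.
sample = ['{"name": "a.txt"}', "", '{"name": "b.txt"}']
records = list(parse_ndjson(sample))
print(len(records))
```

To read an actual export, pass an open file handle instead of the sample list, for example `with open("export.ndjson", encoding="utf-8") as handle:` followed by a loop over `parse_ndjson(handle)`.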
Exporting Report's Inaccessible Data Objects
You can export the inaccessible data objects of a report in newline-delimited JSON (NDJSON) format. You can then view the exported NDJSON file in any editor that supports this format.
To export the inaccessible data objects associated with a report from its Report Details page:
Click the overflow icon corresponding to the desired report.
Click View in the contextual menu that is displayed.
On the report details page, click the Inaccessible Data Objects tab.
Click Export Data Objects to export the data objects.
The inaccessible data objects report is downloaded as a .ndjson file.
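One way to summarize the downloaded .ndjson file is to tally records by the value of a chosen field. The sketch below is generic and does not rely on DDC's actual export schema; the `severity` key and the sample records are hypothetical, so check the field names in your own export before using it.

```python
import json
from collections import Counter


def count_by_key(ndjson_text: str, key: str) -> Counter:
    """Count records per value of key; records without the key go under '<missing>'."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        if line.strip():
            record = json.loads(line)
            counts[record.get(key, "<missing>")] += 1
    return counts


# Hypothetical export content; the real field names may differ.
sample_text = "\n".join([
    '{"name": "archive.zip", "severity": "Notice"}',
    '{"name": "orders", "severity": "Error"}',
    '{"name": "mail.pst", "severity": "Error"}',
    '{"name": "scan.pdf"}',
])

summary = count_by_key(sample_text, "severity")
print(summary.most_common())
```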