sakuraime asked SeeyaXi-msft commented

SQL Server 2019 Data Discovery and Classification

SSMS lets me classify data MANUALLY. Does the discovery of the data also have to be done manually? I can't find any button that runs data discovery; I can only see 'Data Classification'.

Apart from labeling, classification, auditing (data_sensitivity_information), and reports on how much data has been classified manually, what other benefit does this feature actually offer?

Also, does anyone know how to parse the audit event column 'data_sensitivity_information' from the audit file? The result is XML and I would like to expand it into columns.

112862-image.png


sql-server-general

Hi @sakuraime ,

We have not received a response from you. Did the reply help? If it did, please select "Accept Answer". If it didn't, please let us know how things stand. Doing so benefits other community members who have a similar issue. Your contribution is highly appreciated.

MartinCairney-6481 answered sakuraime commented

It depends on what you mean by "Discovery".

The "Discovery" happens behind the scenes: it looks for predefined patterns in the column names to identify columns that may need a Sensitivity Label applied. To see which rules it uses, and to customise them, export the configuration as follows:
112779-classification.png

This exports as JSON. You can edit the file and then use the same menu to import the new rules and run discovery against them.

The "Classification" then groups those columns under a useful nomenclature (which you CAN customise for your own organisation).

How the settings and data are stored depends on which version of SQL Server you have as well as the version of SSMS you use. For SQL Server 2019, the metadata can also be added through T-SQL using the ADD SENSITIVITY CLASSIFICATION command; see the documentation for that command.
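
A minimal sketch of the T-SQL route, assuming a hypothetical table dbo.Customers with an Email column; the label and information type values are illustrative, not required names:

```sql
-- Hypothetical table and column; replace with your own schema.
-- The label and information type strings are example values.
ADD SENSITIVITY CLASSIFICATION TO dbo.Customers.Email
WITH (LABEL = 'Confidential', INFORMATION_TYPE = 'Contact Info');
GO

-- Review the classifications stored in the current database.
SELECT OBJECT_SCHEMA_NAME(sc.major_id) AS [Schema],
       OBJECT_NAME(sc.major_id)        AS [Table],
       COL_NAME(sc.major_id, sc.minor_id) AS [Column],
       sc.label                        AS [Label],
       sc.information_type             AS [InformationType]
FROM sys.sensitivity_classifications AS sc;
GO
```

A classification added this way should be removable again with DROP SENSITIVITY CLASSIFICATION FROM followed by the same column name.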

The benefits arise from some of the aspects you already called out:

AUDITING - you can see who has accessed sensitive data and whether they SHOULD have accessed it. This can help you tighten up your security or identify a data breach.

METADATA - I have also extracted the list of sensitive columns to generate further T-SQL scripts that add either Data Masking or Encryption to all columns with specific Sensitivity Labels, which saves a lot of manual work.
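
As a rough illustration of the METADATA idea (not the exact scripts used in this answer), the query below generates, but does not run, one Dynamic Data Masking statement per classified column. The label name 'Confidential' and the choice of the default() masking function are assumptions for the example:

```sql
-- Builds ALTER statements as text; review them before executing anything.
SELECT 'ALTER TABLE ' + QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.' + QUOTENAME(o.name)
     + ' ALTER COLUMN ' + QUOTENAME(c.name)
     + ' ADD MASKED WITH (FUNCTION = ''default()'');' AS masking_statement
FROM sys.sensitivity_classifications AS sc
JOIN sys.objects AS o
  ON o.object_id = sc.major_id
JOIN sys.columns AS c
  ON c.object_id = sc.major_id
 AND c.column_id = sc.minor_id
WHERE sc.label = N'Confidential'   -- assumed label name; use your own
  AND c.is_masked = 0;             -- skip columns that are already masked
```

Generating the statements as text keeps a review step between reading the classification metadata and changing any table.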

To parse the column you need something like:

 -- Read the audit file and cast the sensitivity column from text to XML
 WITH AuditWithXML as (
     SELECT event_time, action_id, database_name, statement, CAST(data_sensitivity_information as xml) as d
     FROM sys.fn_get_audit_file ('path_to_your_audit_file',default,default)
 )
 -- One output row per sensitivity_attribute element; OUTER APPLY keeps events without any
 SELECT event_time, action_id, database_name, statement,
        h.ep.value('@label','nvarchar(100)') as [Label],
        h.ep.value('@information_type', 'nvarchar(100)') as [InformationType]
 FROM AuditWithXML
        OUTER APPLY d.nodes('/sensitivity_attributes/sensitivity_attribute') as h(ep);
 GO
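
For the auditing use case (who accessed sensitive data), a hedged variation on the query above counts audit events per login and sensitivity label. It assumes the same placeholder audit file path and uses CROSS APPLY so that only events carrying sensitivity attributes are counted:

```sql
WITH AuditWithXML AS (
    SELECT server_principal_name,
           CAST(data_sensitivity_information AS xml) AS d
    FROM sys.fn_get_audit_file('path_to_your_audit_file', DEFAULT, DEFAULT)
),
Labelled AS (
    -- One row per sensitivity attribute touched by each audited event
    SELECT a.server_principal_name,
           h.ep.value('@label', 'nvarchar(100)')            AS [Label],
           h.ep.value('@information_type', 'nvarchar(100)') AS [InformationType]
    FROM AuditWithXML AS a
    CROSS APPLY a.d.nodes('/sensitivity_attributes/sensitivity_attribute') AS h(ep)
)
SELECT server_principal_name, [Label], [InformationType],
       COUNT(*) AS access_count
FROM Labelled
GROUP BY server_principal_name, [Label], [InformationType]
ORDER BY access_count DESC;
GO
```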




"The "Discovery" happens behind the scenes and looks for pre-defined patterns in the column names to identify those columns that may need a Sensitivity Label applied."


What do you mean by "behind the scenes"? How frequently does it run: daily or hourly?
And does it discover only by column name, rather than by the actual data?

SeeyaXi-msft answered SeeyaXi-msft commented

Hi @sakuraime,

Try to think of discovery and classification as a continuous process.
You can apply SQL Data Discovery and Classification as Martin described.
You can also classify columns manually, as an alternative or in addition to the recommendation-based classification:
112880-1.png
For more information, please refer to the MS docs: SQL Data Discovery and Classification.
This article can also help you understand the benefits of this feature.

Best regards,
Seeya





A continuous process? Is it a thread inside SQL Server? Is it possible to stop it?

Imagine the database has many tables and schemas (more than a hundred thousand columns). I suspect this kind of 'continuous process' would slow down the database.

Thanks



Hi @sakuraime ,

What I meant is that these are all capabilities of Data Discovery & Classification, rather than a continuous process. Sorry, I didn't express that clearly.
Discovering and classifying your most sensitive data (business, financial, healthcare, etc.) can play a pivotal role in your organizational information protection stature. It can serve as infrastructure for:
Helping meet data privacy standards.
Monitoring access to databases/columns containing highly sensitive data.

Best regards,
Seeya


So may I confirm that the discovery starts once I trigger 'Classify Data'?
