Data loss prevention reference
Microsoft 365 compliance is now called Microsoft Purview and the solutions within the compliance area have been rebranded. For more information about Microsoft Purview, see the blog announcement and the What is Microsoft Purview? article.
This reference topic is no longer the main resource for Microsoft Purview Data Loss Prevention (DLP) information. The DLP content set is being updated and restructured. The topics covered in this article will be moving to new, updated articles. For more information about DLP, see Learn about data loss prevention.
Data loss prevention capabilities were recently added to Microsoft Teams chat and channel messages for users licensed for Office 365 Advanced Compliance, which is available as a standalone option and is included in Office 365 E5 and Microsoft 365 E5 Compliance. To learn more about licensing requirements, see Microsoft 365 Tenant-Level Services Licensing Guidance.
Create and manage DLP policies
You create and manage DLP policies on the data loss prevention page in the Microsoft Purview compliance portal.
Tuning rules to make them easier or harder to match
After people create and turn on their DLP policies, they sometimes run into these issues:
Too much content that is not sensitive information matches the rules. In other words, there are too many false positives.
Too little content that is sensitive information matches the rules. In other words, the protective actions aren't being enforced on the sensitive information.
To address these issues, you can tune your rules by adjusting the instance count and match accuracy to make it harder or easier for content to match the rules. Each sensitive information type used in a rule has both an instance count and match accuracy.
Instance count means simply how many occurrences of a specific type of sensitive information must be present for content to match the rule. For example, a rule with an instance count of 1 to 9 matches content in which between 1 and 9 unique U.S. or U.K. passport numbers are identified.
The instance count includes only unique matches for sensitive information types and keywords. For example, if an email contains 10 occurrences of the same credit card number, those 10 occurrences count as a single instance of a credit card number.
To use instance count to tune rules, the guidance is straightforward:
To make the rule easier to match, decrease the min count and/or increase the max count. You can also set max to any by deleting the numerical value.
To make the rule harder to match, increase the min count.
Typically, you use less restrictive actions, such as sending user notifications, in a rule with a lower instance count (for example, 1-9). And you use more restrictive actions, such as restricting access to content without allowing user overrides, in a rule with a higher instance count (for example, 10-any).
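The instance count logic described above can be sketched in a few lines of Python. This is only an illustration, not how the service is implemented: the function names, the rule table, and the action labels are hypothetical, and the matched values are assumed to be already normalized so that repeats of the same number compare equal.

```python
from typing import Iterable, List, Optional


def unique_instance_count(matches: Iterable[str]) -> int:
    """Count distinct matched values; repeats of the same value count once."""
    return len(set(matches))


def in_instance_range(count: int, min_count: int, max_count: Optional[int]) -> bool:
    """Return True if count is within [min_count, max_count]; None means 'any'."""
    if count < min_count:
        return False
    return max_count is None or count <= max_count


# Hypothetical two-tier setup: a less restrictive action for 1-9 instances,
# a more restrictive action for 10-any.
RULES = [
    ("notify user", 1, 9),
    ("restrict access", 10, None),
]


def actions_for(matches: Iterable[str]) -> List[str]:
    """Return the actions of every rule whose instance range contains the count."""
    count = unique_instance_count(matches)
    return [name for name, lo, hi in RULES if in_instance_range(count, lo, hi)]
```

In this sketch, ten occurrences of the same credit card number produce a count of 1 and therefore fall under the 1-9 rule, while twelve distinct numbers fall under the 10-any rule.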
As described above, a sensitive information type is defined and detected by using a combination of different types of evidence. Commonly, a sensitive information type is defined by multiple such combinations, called patterns. A pattern that requires less evidence has a lower match accuracy (or confidence level), while a pattern that requires more evidence has a higher match accuracy (or confidence level). To learn more about the actual patterns and confidence levels used by every sensitive information type, see Sensitive information type entity definitions.
For example, the sensitive information type named Credit Card Number is defined by two patterns:
A pattern with 65% confidence that requires:
A number in the format of a credit card number.
A number that passes the checksum.
A pattern with 85% confidence that requires:
A number in the format of a credit card number.
A number that passes the checksum.
A keyword or an expiration date in the right format.
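To make the two patterns concrete, here is a toy detector in Python. It is a sketch under stated assumptions, not the actual Credit Card Number definition: it assumes 16-digit numbers, assumes the checksum is the Luhn algorithm commonly used for card numbers, and uses a made-up keyword list and an MM/YY expiration format. It reports only the strongest single pattern and does not try to model how confidence is combined when several patterns match.

```python
import re

# Assumed for illustration: 16 digits, optionally separated by spaces or hyphens.
CARD_FORMAT = re.compile(r"\b(?:\d[ -]?){15}\d\b")
# Hypothetical supporting evidence: a keyword or an MM/YY expiration date.
KEYWORDS = ("visa", "mastercard", "credit card", "card number")
EXPIRY = re.compile(r"\b(0[1-9]|1[0-2])/\d{2}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def credit_card_confidence(text: str) -> int:
    """Return 85, 65, or 0 depending on how much evidence the text contains."""
    candidates = [m.group() for m in CARD_FORMAT.finditer(text)]
    if not any(luhn_valid(c) for c in candidates):
        return 0  # no correctly formatted number that passes the checksum
    lowered = text.lower()
    has_supporting = any(k in lowered for k in KEYWORDS) or bool(EXPIRY.search(text))
    return 85 if has_supporting else 65
```

A number that has the right format and passes the checksum yields 65. The same number accompanied by a keyword or an expiration date yields 85.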
You can use these confidence levels (or match accuracy) in your rules. Typically, you use less restrictive actions, such as sending user notifications, in a rule with lower match accuracy. And you use more restrictive actions, such as restricting access to content without allowing user overrides, in a rule with higher match accuracy.
It's important to understand that when a specific type of sensitive information, such as a credit card number, is identified in content, only a single confidence level is returned:
If all of the matches are for a single pattern, the confidence level for that pattern is returned.
If there are matches for more than one pattern (that is, there are matches with two different confidence levels), a confidence level higher than any of the single patterns alone is returned. This is the tricky part. For example, for a credit card, if both the 65% and 85% patterns are matched, the confidence level returned for that sensitive information type is greater than 90% because more evidence means more confidence.
So if you want to create two mutually exclusive rules for credit cards, one for the 65% match accuracy and one for the 85% match accuracy, the match accuracy ranges must not overlap. Following the guidance below, the first rule would use 65 for both min and max, and the second rule would use a range from 66 to 100. The first rule picks up only matches of the 65% pattern. The second rule picks up matches with at least one 85% match and can potentially have other lower-confidence matches.
For these reasons, the guidance for creating rules with different match accuracies is:
The lowest confidence level typically uses the same value for min and max (not a range).
The highest confidence level is typically a range from just above the lower confidence level to 100.
Any in-between confidence levels typically range from just above the lower confidence level to just below the higher confidence level.
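As a small illustration of this guidance for the two-pattern credit card example, the following Python sketch checks which rule a returned confidence level falls into. The rule names and the table layout are hypothetical. The ranges are an assumption derived from the guidance above: the same value for min and max on the lowest level, and just above that value up to 100 on the highest.

```python
from typing import List, Tuple

# Hypothetical rules: (rule name, min match accuracy, max match accuracy).
RULES: List[Tuple[str, int, int]] = [
    ("low confidence: notify user", 65, 65),
    ("high confidence: restrict access", 66, 100),
]


def rules_matching(confidence: int) -> List[str]:
    """Return the names of rules whose accuracy range contains confidence."""
    return [name for name, lo, hi in RULES if lo <= confidence <= hi]
```

Because the ranges do not overlap, any single returned confidence level selects at most one rule. A combined level above 90, returned when both patterns match, lands in the second rule only.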
Using a retention label as a condition in a DLP policy
When you use a previously created and published retention label as a condition in a DLP policy, there are some things to be aware of:
The retention label must be created and published before you attempt to use it as a condition in a DLP policy.
Published retention labels can take from one to seven days to sync. For more information, see When retention labels become available to apply for retention labels published in a retention policy, and How long it takes for retention labels to take effect for retention labels that are auto-published.
Using a retention label in a policy is only supported for items in SharePoint and OneDrive.
You might want to use a retention label in a DLP policy if you have items that are under retention and disposition, and you also want to apply other controls to them, for example:
- You published a retention label named tax year 2018, which when applied to tax documents from 2018 that are stored in SharePoint retains them for 10 years then disposes of them. You also don't want those items being shared outside your organization, which you can do with a DLP policy.
You'll get this error if you specify a retention label as a condition in a DLP policy and you also include Exchange and/or Teams as a location: "Protecting labeled content in email and teams messages isn't supported. Either remove the label below or turn off Exchange and Teams as a location." This is because Exchange transport does not evaluate the label metadata during message submission and delivery.
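If you script policy definitions, a pre-check along these lines can catch the conflict before you hit the error. This is a hypothetical helper in Python, not part of any Microsoft tooling. The location names are plain strings chosen for the example.

```python
from typing import Iterable, List, Optional

# Locations where a retention label condition is supported, per the text above.
SUPPORTED_WITH_RETENTION_LABEL = {"SharePoint", "OneDrive"}


def unsupported_locations(
    locations: Iterable[str], retention_label: Optional[str]
) -> List[str]:
    """Return the locations that conflict with a retention label condition."""
    if retention_label is None:
        return []  # no label condition, so no location restriction applies
    return sorted(
        loc for loc in locations if loc not in SUPPORTED_WITH_RETENTION_LABEL
    )
```

An empty result means the combination of label condition and locations is consistent with the restriction described above.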
Using a sensitivity label as a condition in a DLP policy
Learn more about using Sensitivity label as a condition in DLP policies.
How this feature relates to other features
Several features can be applied to content containing sensitive information:
A retention label and a retention policy can both enforce retention actions on this content.
A DLP policy can enforce protection actions on this content. And before enforcing these actions, a DLP policy can require other conditions to be met in addition to the content containing a label.
Note that a DLP policy has a richer detection capability than a label or retention policy applied to sensitive information. A DLP policy can enforce protective actions on content containing sensitive information, and if the sensitive information is removed from the content, those protective actions are undone the next time the content's scanned. But if a retention policy or label is applied to content containing sensitive information, that's a one-time action that won't be undone even if the sensitive information is removed.
By using a label as a condition in a DLP policy, you can enforce both retention and protection actions on content with that label. You can think of content containing a label exactly like content containing sensitive information - both a label and a sensitive information type are properties used to classify content, so that you can enforce actions on that content.
Simple settings vs. advanced settings
When you create a DLP policy, you'll choose between simple or advanced settings:
Simple settings make it easy to create the most common type of DLP policy without using the rule editor to create or modify rules.
Advanced settings use the rule editor to give you complete control over every setting for your DLP policy.
Don't worry: under the covers, simple settings and advanced settings work exactly the same, by enforcing rules composed of conditions and actions. The only difference is that with simple settings, you don't see the rule editor. It's a quick way to create a DLP policy.
By far, the most common DLP scenario is creating a policy to help protect content containing sensitive information from being shared with people outside your organization, and taking an automatic remediating action such as restricting who can access the content, sending end-user or admin notifications, and auditing the event for later investigation. People use DLP to help prevent the inadvertent disclosure of sensitive information.
To simplify achieving this goal, when you create a DLP policy, you can choose Use simple settings. These settings provide everything you need to implement the most common DLP policy, without having to go into the rule editor.
If you need to create more customized DLP policies, you can choose Use advanced settings.
The advanced settings present you with the rule editor, where you have full control over every possible option, including the instance count and match accuracy (confidence level) for each rule.
To jump to a section quickly, click an item in the top navigation of the rule editor to go to that section below.
DLP policy templates
The first step in creating a DLP policy is choosing what information to protect. By starting with a DLP template, you save the work of building a new set of rules from scratch, and figuring out which types of information should be included by default. You can then add to or modify these requirements to fine tune the rule to meet your organization's specific requirements.
A preconfigured DLP policy template can help you detect specific types of sensitive information, such as HIPAA data, PCI-DSS data, Gramm-Leach-Bliley Act data, or even locale-specific personally identifiable information (PII). To make it easy for you to find and protect common types of sensitive information, the policy templates included in Microsoft 365 already contain the most common sensitive information types necessary for you to get started.
Your organization may also have its own specific requirements, in which case you can create a DLP policy from scratch by choosing the Custom policy option. A custom policy is empty and contains no premade rules.
After you create and turn on your DLP policies, you'll want to verify that they're working as you intended and helping you stay compliant. With DLP reports, you can quickly view the number of DLP policy and rule matches over time, and the number of false positives and overrides. For each report, you can filter those matches by location, time frame, and even narrow it down to a specific policy, rule, or action.
With the DLP reports, you can get business insights and:
Focus on specific time periods and understand the reasons for spikes and trends.
Discover business processes that violate your organization's compliance policies.
Understand any business impact of the DLP policies.
In addition, you can use the DLP reports to fine tune your DLP policies as you run them.
How DLP policies work
DLP detects sensitive information by using deep content analysis (not just a simple text scan). This deep content analysis uses keyword matches, dictionary matches, the evaluation of regular expressions, internal functions, and other methods to detect content that matches your DLP policies. Potentially only a small percentage of your data is considered sensitive. A DLP policy can identify, monitor, and automatically protect just that data, without impeding or affecting people who work with the rest of your content.
Policies are synced
After you create a DLP policy in the Microsoft Purview compliance portal, it's stored in a central policy store, and then synced to the various content sources, including:
Exchange Online, and from there to Outlook on the web and Outlook.
OneDrive for Business sites.
SharePoint Online sites.
Office desktop programs (Excel, PowerPoint, and Word).
Microsoft Teams channels and chat messages.
After the policy's synced to the right locations, it starts to evaluate content and enforce actions.
Policy evaluation in OneDrive for Business and SharePoint Online sites
Across all of your SharePoint Online sites and OneDrive for Business sites, documents are constantly changing — they're continually being created, edited, shared, and so on. This means documents can conflict or become compliant with a DLP policy at any time. For example, a person can upload a document that contains no sensitive information to their team site, but later, a different person can edit the same document and add sensitive information to it.
For this reason, DLP policies check documents for policy matches frequently in the background. You can think of this as asynchronous policy evaluation.
How it works
As people add or change documents in their sites, the search engine scans the content, so that you can search for it later. While this is happening, the content's also scanned for sensitive information and to check if it's shared. Any sensitive information that's found is stored securely in the search index, so that only the compliance team can access it, but not typical users. Each DLP policy that you've turned on runs in the background (asynchronously), checking search frequently for any content that matches a policy, and applying actions to protect it from inadvertent leaks.
Finally, documents can conflict with a DLP policy, but they can also become compliant with a DLP policy. For example, if a person adds credit card numbers to a document, it might cause a DLP policy to block access to the document automatically. But if the person later removes the sensitive information, the action (in this case, blocking) is automatically undone the next time the document is evaluated against the policy.
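The apply-then-undo behavior can be pictured with a toy model in Python. It is a deliberately naive sketch: the `Document` class, the `evaluate` function, and the 16-digit regular expression are stand-ins invented for the example, and real detection and enforcement are far richer than this.

```python
import re
from dataclasses import dataclass

# Deliberately naive stand-in for sensitive information detection.
CARD_LIKE = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


@dataclass
class Document:
    text: str
    blocked: bool = False


def evaluate(doc: Document) -> None:
    """Re-evaluate the document; apply or undo the block to match its content."""
    doc.blocked = bool(CARD_LIKE.search(doc.text))


doc = Document("Quarterly notes")
evaluate(doc)                      # nothing sensitive: not blocked
doc.text += " card 4111 1111 1111 1111"
evaluate(doc)                      # sensitive content added: blocked
doc.text = "Quarterly notes"
evaluate(doc)                      # sensitive content removed: block undone
```

The point of the sketch is that the action is derived from the current content on every evaluation, so it does not persist once the content stops matching.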
DLP evaluates any content that can be indexed. For more information on what file types are crawled by default, see Default crawled file name extensions and parsed file types in SharePoint Server.
To prevent documents from being shared before DLP policies have had the opportunity to analyze them, sharing of new files in SharePoint can be blocked until their content has been indexed. For detailed information, see Mark new files as sensitive by default.
Policy evaluation in Exchange Online, Outlook, and Outlook on the web
When you create a DLP policy that includes Exchange Online as a location, the policy's synced from the Microsoft Purview compliance portal to Exchange Online, and then from Exchange Online to Outlook on the web and Outlook.
When a message is being composed in Outlook, the user can see policy tips as the content being created is evaluated against DLP policies. And after a message is sent, it's evaluated against DLP policies as a normal part of mail flow, along with Exchange mail flow rules (also known as transport rules) and DLP policies created in the Exchange admin center. DLP policies scan both the message and any attachments.
Policy evaluation in the Office desktop programs
Excel, PowerPoint, and Word include the same capability to identify sensitive information and apply DLP policies as SharePoint Online and OneDrive for Business. These Office programs sync their DLP policies directly from the central policy store, and then continuously evaluate the content against the DLP policies when people work with documents opened from a site that's included in a DLP policy.
DLP policy evaluation in Office is designed not to affect the performance of the programs or the productivity of people working on content. If a person is working on a large document, or their computer is busy, it might take a few seconds for a policy tip to appear.
Policy evaluation in Microsoft Teams
When you create a DLP policy that includes Microsoft Teams as a location, the policy's synced from the Microsoft Purview compliance portal to user accounts and Microsoft Teams channels and chat messages. Depending on how DLP policies are configured, when someone attempts to share sensitive information in a Microsoft Teams chat or channel message, the message can be blocked or revoked. And, documents that contain sensitive information and that are shared with guests (external users) won't open for those users. To learn more, see Data loss prevention and Microsoft Teams.
By default, Global admins, Security admins, and Compliance admins will have access to create and apply a DLP policy. Other members of your compliance team who will create DLP policies need permissions to the Microsoft Purview compliance portal. By default, your Tenant admin will have access to this location and can give compliance officers and other people access to the Microsoft Purview compliance portal, without giving them all of the permissions of a Tenant admin. To do this, we recommend that you:
Create a group in Microsoft 365 and add compliance officers to it.
Create a role group on the Permissions page of the Microsoft Purview compliance portal.
While creating the role group, use the Choose Roles section to add the following role to the Role Group: DLP Compliance Management.
Use the Choose Members section to add the Microsoft 365 group you created before to the role group.
You can also create a role group with view-only privileges to the DLP policies and DLP reports by granting the View-Only DLP Compliance Management role.
For more information, see Give users access to the Office 365 Compliance Center.
These permissions are required only to create and apply a DLP policy. Policy enforcement does not require access to the content.
Find the DLP cmdlets
To use most of the cmdlets for the Microsoft Purview compliance portal, you need to:
However, DLP reports need to pull data from across Microsoft 365, including Exchange Online. For this reason, the cmdlets for the DLP reports are available in Exchange Online PowerShell, not in Microsoft Purview compliance portal PowerShell. Therefore, to use the cmdlets for the DLP reports, you need to:
Use any of these cmdlets for the DLP reports: