Data Loss Prevention (DLP) in SharePoint 2016 and SharePoint Online
Editor’s note: The following post was written by Office Servers and Services MVP Steve Smith as part of our Technical Tuesday series—on bonus Wednesday!
In this article I am going to introduce you to the latest updates in the compliancy space around data loss prevention (DLP) in the new SharePoint 2016 public Beta 2 release, which is now available to download from http://www.microsoft.com/en-us/download/details.aspx?id=49961 . The information in this article is also part of my SharePoint 2016 clinic, designed to get people up to speed with SharePoint 2016 Beta 2, and will also form part of my upcoming 5 day 2016 Administrators classes in the UK and US. UK classes - http://www.combined-knowledge.com/Courses/Classroom/SharePoint_2016_clinic/index.html , US classes - https://mindsharp.com/course/sharepoint-2016-clinic/
I am also going to discuss DLP from a SharePoint Online perspective as well as On Prem. SharePoint Online is part of Office 365, and its DLP features are rolling out to tenants now. It is important to note, however, that DLP is not new in itself: the DLP features have been part of Exchange Server 2013 and Exchange Online, where they allow you to build message-driven policies for email. Having DLP policies in SharePoint now allows a business to build a DLP structure across both email and data, which is great news for everyone regardless of whether you are On Prem or in Office 365.
I have been working with the document and records management features of SharePoint for many years, and the first clarification I want to make is that DLP is not a replacement for those existing processes. In fact, DLP is very much a complement to your business’s overall strategy for handling compliance and sensitive data within your SharePoint environment. DLP does not replace your document lifecycle management process, but it does allow your business to build a policy model to discover and protect data in a way that was previously not possible out of the box.
So what is DLP then, I hear you ask? In a nutshell, it is a method to discover (find) and restrict sensitive data being put into SharePoint that matches specific criteria defined in industry templates, and thus to avoid breaches in which corporate data leaves the company. Such data could include credit card details or employee national insurance or social security information, and the templates are specific to regional requirements. There are 80 templates, the same ones used by Exchange, and the full list can be found here https://technet.microsoft.com/en-us/library/jj150541(v=exchg.160).aspx . The SharePoint Beta 2 bits do not yet include the full set of templates, but I am sure the full selection will be there by RTM.
Although the examples I am using in this article are built in SharePoint 2016 Beta 2 On Prem, you can follow along in exactly the same way in SharePoint Online. The only difference is that in SharePoint Online you cannot force content crawls, so you may have to wait longer for the search results to show up.
Figure 1 - DLP Policy templates
If you expand the policies shown on the website linked above, you will notice that each policy has defined criteria that use patterns and confidence levels to match data in the document in order to trigger the DLP policy to take action against the document. You will also notice that each template has specific keywords that form part of the detection criteria. The aim here is to flag items that clearly breach the rules of a policy and not to flag items that may include certain keywords but have no legal implication. For example, a sales person has a document in SharePoint that tells a client that they can pay via credit card. The keywords ‘credit card’ in this scenario do not warrant the document being locked down by a DLP policy, but someone storing 50 customers’ credit card details in SharePoint clearly would. As you can see from the credit card template, it has both a keyword verification list and a keyword name list, covering card numbers as well as card types, so in order for these templates to be triggered there must be clear matches against the template criteria.
Figure 2 - DLP template criteria for credit cards
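To make the idea of combining a pattern with supporting keywords more concrete, here is a minimal Python sketch. It is not how SharePoint or Exchange implements classification. The keyword list, the regular expression, the use of a Luhn checksum and the `min_count` parameter are all my own simplifying assumptions for illustration, and the real templates use confidence levels that this sketch ignores.

```python
import re

# Hypothetical keyword list; the real templates ship much longer lists.
KEYWORDS = ("credit card", "card number", "visa", "mastercard", "expiry", "cvv")

# Assumed pattern: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return the candidate card numbers in text that pass the checksum."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


def has_keyword_evidence(text: str) -> bool:
    """Return True if any supporting keyword appears in text."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def should_flag(text: str, min_count: int = 1) -> bool:
    """Flag only when keywords AND enough valid-looking numbers are both present."""
    return has_keyword_evidence(text) and len(find_card_numbers(text)) >= min_count
```

With this sketch, the sales document that only mentions paying by credit card is not flagged because it contains no card numbers, while a document that lists card numbers next to card type names is.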
Before we start creating some DLP policies, I first want to break down the two main options that you have in SharePoint around DLP, which we are going to look at later in this article. The two main elements are:
- DLP Query, for discovering sensitive content that is already stored
- DLP Policy, for restricting sensitive content as it is put into SharePoint
An important point to mention here is that both of these options apply to items stored in SharePoint and to items stored in OneDrive.
As a company you may not actually know how many items in your organisation’s SharePoint data are currently in breach of your own compliancy regulations. Being able to run a DLP query based on specific DLP templates across all your SharePoint data allows you to quickly identify areas that need managed policies and to fix the existing breaches. This discovery process relies 100% on search having crawled the items in SharePoint. In SharePoint 2016 Beta 2 there is a new addition to the eDiscovery site template called a DLP query, which allows a user to run queries based on DLP templates against all of your content or specific content in your SharePoint environment. One important aspect of this, however, is that the person who is running the query in the eDiscovery Center must have read access to all data in SharePoint. This can be achieved either via a Web Application Policy On Prem, or by adding them as a site collection administrator in SharePoint Online or On Prem.
The obvious way to avoid sensitive data being available to others is to put in place a policy that restricts the document itself when it is put into SharePoint. A DLP policy enables the compliancy managers to create these policies and apply them to site collections in your SharePoint environment. A policy can include policy tips, email notifications and blocking of the content once it matches a specific DLP policy template.
That is the terminology dealt with, so let’s now get started by testing the discovery process using the new DLP Query in the eDiscovery Center. In this example I have built my SharePoint farm with 3 servers: one server is the domain controller on Windows Server 2012 R2, one is the SQL server using SQL 2014, and the third is a SharePoint 2016 Beta 2 server in the custom role. I have also fully tested DLP in a MinRole farm with 9 SharePoint servers, so the same method applies regardless of your SharePoint deployment method. I have also created the User Profile Service Application for creating users’ My Sites and personal OneDrives, as well as a Search Service Application for crawling data.
The first thing you want to do is get a document ready to test that the DLP Query is working correctly. For this example I am going to use a generic credit card list, which you can obtain from here http://www.paypalobjects.com/en_US/vhelp/paypalmanager_help/credit_card_numbers.htm . Copy the table into your own Word document and save it. You now need to upload the document into a document library. In my test I am going to upload one copy into a team site and one into my personal OneDrive.
Figure 3 - Document added to team site library
Figure 4 - Document added to OneDrive
Now that you have added your documents with the credit card data into SharePoint, you need to run a crawl of the data so that the new documents are in the SharePoint Index that a user runs a query against. This is done in the Search Service Application for your SharePoint content source. An incremental crawl is fine.
Figure 5 - Running an incremental crawl to update the Index
Part of the crawl process is to analyse the content through the content processing component, and this now includes a new component in SharePoint 2016 called the Classification Operator, which runs alongside other processing components such as the word breakers and the document parser. Once the content is processed, the classification results are stored in the Index, ready for a query to be run against it.
Once the crawl has finished you can proceed to create a new site collection that uses the eDiscovery site template. This is done via Central Administration or PowerShell or, if you are in Office 365, via the SharePoint Admin site in your tenant Admin page. When creating your new site collection, ensure that you select eDiscovery Center, which can be found in the Enterprise tab.
**Note** There is no limit to the number of eDiscovery sites that you can create in your organisation. You simply control access to each one via the site permissions.
Figure 6 - Select the eDiscovery template
Ensure that the users who will be managing the eDiscovery sites and running queries against SharePoint data have read permissions to the site collections that they will be searching. Once your site is created, go ahead and launch the site, logging in as your chosen user.
On the home page for the site you have the option of creating a new discovery case or just a DLP query. Covering eDiscovery cases is beyond the scope of this article, but a good starting point if you want to know more is here https://technet.microsoft.com/en-us/library/fp161516.aspx . For this demo I am going to select ‘Create DLP Query’.
Figure 7 - The eDiscovery home page
In the ‘Search and Export’ page you need to create a ‘New Item’. In the new item page you will now see the DLP templates that we mentioned earlier. For this test we are going to select U.K. Financial Data as that is the data we copied earlier for the credit cards.
Figure 8 - New DLP Query
You will also notice on this template selection page that you can now choose how many instances of the particular sensitive data type must be found in a document before it is returned by the query. For example, if you want to be shown all documents in SharePoint that have 2 or more credit card numbers, you need to change this box to 2. Obviously, the lower the number, the more potential false positives you could capture. It all depends on the level of identification your company needs. For this example I will leave it at 2.
Figure 9 - Choose the query match trigger amount
Clicking Next takes you to the actual DLP query page. On this page you need to define a name for the query, which will be saved for later use, and the source that you want to run the query against. The source can be specific SharePoint site collections or simply all of SharePoint. So give your query a title and then, in the Sources section, click on ‘Modify Query Scope’.
Figure 10 - Modify the search scope
**Note** It is possible to amend the query at any time. If, for example, you wish to change the query string to require 5 or more hits instead of the 2 we previously defined, you can edit the query on this page, for example: SensitiveType="Credit Card Number|5.." Also notice that the query string uses standard query language to include other sensitive types, in this case EU Debit Card and SWIFT Code.
On the Modify Query Scope page select ‘Search Everything in SharePoint’, because at this stage we don’t actually know if we have any breaches in the business, so we would like to first find data anywhere that matches our query type.
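If you find yourself amending these query strings often, a small helper can keep the syntax consistent. The following Python sketch simply assembles text in the form shown in the note above. The function names are hypothetical, joining the clauses with OR is my assumption about how the generated query combines several sensitive types, and the exact type names should be copied from the query that the eDiscovery Center generates for you.

```python
def sensitive_type_clause(type_name: str, min_count: int = 1) -> str:
    """Build one clause such as SensitiveType="Credit Card Number|5.."."""
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    # "N.." is an open-ended range: N or more instances in the document.
    return f'SensitiveType="{type_name}|{min_count}.."'


def dlp_query(type_names: list[str], min_count: int = 1) -> str:
    """Combine one clause per sensitive type (assumption: joined with OR)."""
    return " OR ".join(sensitive_type_clause(name, min_count) for name in type_names)


if __name__ == "__main__":
    # Type names as mentioned in this article; verify them against your own query.
    print(dlp_query(["Credit Card Number", "EU Debit Card", "SWIFT Code"], min_count=5))
```

The resulting string can be pasted into the query box on the DLP query page and adjusted from there.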
Figure 11 - Choose Everything in SharePoint
Click OK, and now you can click on the Search button to run the query.
Figure 12 - Searching for content
Once the query has finished you should see the documents that you uploaded earlier into your sites, in my case my project site and my personal OneDrive.
Figure 13 - Seeing the results
So now I have been able to discover exactly where data is being stored that is clearly in breach of my corporate policy. The next step is to start applying a DLP policy to the site collections I want to control. In order to create a policy we first need to create another new site collection, one that allows us to create and manage policies and assign them to user site collections.
Again we do this in Central Administration or PowerShell On Prem, or via the SharePoint admin page in Office 365 Admin. This time, when selecting a template in the Enterprise tab, you need to select ‘Compliance Policy Center’. You will notice there is also another new template here for In-Place Hold, which is something for another article :)
Just like the eDiscovery site, you can have as many Compliance Policy Centers as you need. Just control access via the site permissions.
Figure 14 - Create a Compliance Policy Center
Once the site collection has been created you can browse to it, and you will see there are two distinct sections: Delete Policies and Data Loss Prevention Policies. Each plays a different role in data management, one deleting content already in SharePoint based on a defined policy and the other using a policy to manage content that is being put into SharePoint. The one that we care about here is the DLP Policy.
Figure 15 - The Compliance Policy Center Home Page
There are two sides to a DLP policy:
- Policy Management
  - Create and define the policy logic
- Policy Assignments for Site Collections
  - Assign a policy to specific Site Collections to enforce the policy.
Let’s first look at Policy Management and create our first policy. Click on DLP Policy Management to open the management list. From here you need to click on ‘new item’.
Figure 16 - New DLP Policy Management Item
Let’s use our UK credit card scenario again, so I will call this policy UKCCFraud for example and select the same U.K. Financial Data template that we used in the previous eDiscovery query.
Change the number of instances that you want to define in order to trigger the policy. I will use 2 again in this instance as I know that works based on my previous query. You also need to define a user who will receive incident reports. This could be a compliancy officer, for example, or an email address that is seen by several high-ranking legal people in your organisation.
There are now two additional options. Neither is mandatory, but both are definitely useful. The first option is to add policy tips. This shows additional information when a user goes to edit an item that is not compliant, stating for example that it is in breach of a policy. This policy tip can be shown in Office, in the item preview or through the Office Online applications. We will look at this in more detail shortly. The second option requires policy tips to be on, and it also blocks normal users from viewing the content in the SharePoint library. Only the content owner and site owners can now see the item that is in breach of the policy, and it will not be viewable by all users until the content in question is edited or changed. For the purpose of this exercise I will be selecting both.
Figure 17 - Defining Policy options
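The way these options interact can be summarised in a few lines of Python. This is only a sketch of the behaviour described above, not SharePoint’s implementation: the `DlpPolicy` class, its field names and the role strings are all hypothetical names that I have made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DlpPolicy:
    """Hypothetical model of the options on the policy page."""
    name: str
    min_count: int                 # instances needed to trigger the policy
    show_policy_tip: bool = False  # first optional setting
    block_access: bool = False     # second optional setting

    def __post_init__(self):
        # As described above, blocking requires policy tips to be switched on.
        if self.block_access and not self.show_policy_tip:
            raise ValueError("blocking content requires policy tips to be enabled")


def can_view(policy: DlpPolicy, match_count: int, viewer_role: str) -> bool:
    """Return True if a viewer with the given role can still see the item.

    viewer_role is one of the made-up labels "content_owner", "site_owner"
    or "member".
    """
    in_breach = match_count >= policy.min_count
    if not (in_breach and policy.block_access):
        return True  # nothing is blocked, so everyone with access sees the item
    return viewer_role in ("content_owner", "site_owner")


if __name__ == "__main__":
    policy = DlpPolicy("UKCCFraud", min_count=2, show_policy_tip=True, block_access=True)
    for role in ("content_owner", "site_owner", "member"):
        print(role, can_view(policy, match_count=3, viewer_role=role))
```

A document with fewer matches than the trigger amount stays visible to everyone, which is why choosing the number of instances carefully matters.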
Just click Save to now store the policy.
Now that the policy has been created, the second part of the configuration is to assign it to any site collections where we need the policy enforced for SharePoint data. In my example that is a site collection called projects. Obviously, choose one of your own to test this against.
As usual I do need to emphasise that you should be testing this first in a test environment, not in production, whilst you get familiar with the technology :)
In the Data Loss Prevention home you now need to go to DLP Policy Assignments for Site Collections. On this page you now need to create a ‘new item’.
Figure 18 - Select Site Collections
On the Site Collection Assignment page you first need to choose a site collection that you want to assign a policy to. Click on ‘First choose a site collection’ and then, in the site collection field, enter the full URL of your chosen site collection. In my example this is https://intranet.combined.com/projects . Then click on the search icon to resolve the site collection URL. Just like the DLP Query, you must have crawled your site collections in order to find them.
Figure 19 - Enter a site collection URL and search for it
Once the search has resolved your site collection, simply tick the box for the locations where you want to apply the policy. If you have selected a root site collection, all site collections below that path will be shown. For example, if you select a root site collection of https://intranet.combined.com then other site collections from the managed paths will also be shown, such as https://intranet.combined.com/sites/alpha and https://intranet.combined.com/sites/beta
Figure 20 - Choosing from multiple site collections if using a root
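If it helps to picture why a root URL returns several entries, the following Python sketch filters a list of site collection URLs by the root you searched for. It is only an illustration of the idea of "everything at or below that path". The function name is hypothetical, the personal site URL is made up, and SharePoint resolves these locations through search rather than through simple string matching.

```python
def site_collections_under(root_url: str, known_urls: list[str]) -> list[str]:
    """Return the root itself plus every known site collection below it."""
    root = root_url.rstrip("/")
    prefix = root + "/"
    return [
        url for url in known_urls
        if url.rstrip("/") == root or url.startswith(prefix)
    ]


if __name__ == "__main__":
    crawled = [
        "https://intranet.combined.com",
        "https://intranet.combined.com/sites/alpha",
        "https://intranet.combined.com/sites/beta",
        "https://my.combined.com/personal/steve",  # made-up personal site URL
    ]
    for url in site_collections_under("https://intranet.combined.com", crawled):
        print(url)
```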
Now that you have selected your site collection you now need to assign it a policy. In the same Site Collection Assignment page click on Manage Assigned Policies. This now allows you to choose from one of your DLP management policies. In our case we created the credit card fraud one earlier so I will select that. Select it and click save to apply the policy.
Figure 21 - Assign a managed policy to a site collection
Finally click on Save to apply the site collection policy. At this point you can rinse and repeat for as many site collections as you wish. You will notice below that I also chose my users’ My Site host, which allows me to select any of my users’ personal sites, which would also include OneDrive data on a per-user basis.
Figure 22 - Assigning DLP Policies to users’ personal sites
Now that you have applied your policies to your site collections, you should be able to see the effect by browsing back to your team site and refreshing the data. In my case I went back to my projects site, and you can clearly see a new icon over the document that failed the policy test and has been marked as blocked due to being in conflict with a policy.
Figure 23 - DLP Policy applying to a document
You can also follow the link to see the policy tip to get more information on the policy breach and then open the item to resolve it.
Remember that the only people who can see these options are the site owners and the document owner. Other users do not even see the document in question.
Figure 24 - DLP Policy Tip
If you select ‘Resolve’ then an additional box appears that allows you either to override the policy, which could have legal ramifications for you personally, or to report the issue to a higher administrator and continue.
So there you have it, a comprehensive look at some of the new features of DLP in SharePoint 2016 Beta 2 and for those of you with the DLP feature available in your Office 365 tenant you can follow the same steps. Combine DLP with SharePoint along with DLP in Exchange and you have a very solid base for managing your corporate DLP strategy. I hope you have enjoyed following this article and I would love to hear from you and how you are getting on.
About the author
Steve (@stevesmithck) is the owner of Combined Knowledge in the UK and Mindsharp in the US, SharePoint education and productivity companies since 2003. He has been a SharePoint MVP for the last 10 years and an MCT for 18 years, and has recently been made a Microsoft Regional Director. Steve is also the founder of the UK SharePoint User Group, now in its 10th year, which is the largest active in-person SharePoint User Group in the world, with meetings around the UK throughout the year that are free to everyone: http://www.suguk.org/