Modify a keyword dictionary

You might need to modify keywords in one of your keyword dictionaries, or modify one of the built-in dictionaries. You can do this in PowerShell, in the Microsoft Purview portal, or in the Microsoft Purview compliance portal.

Modify a keyword dictionary in the portals

Keyword dictionaries can be used as Primary elements or Supporting elements in sensitive information type (SIT) patterns. You can edit a keyword dictionary while creating a SIT or in an existing custom SIT. To learn more about the Microsoft Purview portal, see Microsoft Purview portal. To learn more about the Compliance portal, see Microsoft Purview compliance portal.

For example, to edit an existing keyword dictionary in the Microsoft Purview portal:

  1. Sign in to the Microsoft Purview portal and go to Information Protection > Classifiers > Sensitive info types.

  2. Choose the SIT that uses the keyword dictionary you want to update.

  3. Select Edit.

  4. Choose Next.

  5. Edit the pattern that has the keyword dictionary you want to update.

  6. In the Primary element or Supporting element section, edit the keyword dictionary, entering one keyword per line.

  7. Choose Done.

Modify a keyword dictionary using PowerShell

In this example, you'll remove some terms in PowerShell, save the remaining terms locally where you can modify them in an editor, and then update the existing dictionary in place.

First, retrieve the dictionary object:

$dict = Get-DlpKeywordDictionary -Name "Diseases"

Printing $dict will show the various properties. The keywords themselves are stored in an object on the backend, but $dict.KeywordDictionary contains a string representation of them, which you'll use to modify the dictionary.
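The retrieval step assumes a session that is already connected to Security & Compliance PowerShell, which is where the DLP keyword dictionary cmdlets run. A minimal sketch of connecting and inspecting the dictionary might look like the following. It assumes the ExchangeOnlineManagement module is installed, and admin@contoso.com is a hypothetical account name.

```powershell
# Assumption: the ExchangeOnlineManagement module is already installed.
Import-Module ExchangeOnlineManagement

# admin@contoso.com is a hypothetical account; substitute your own.
Connect-IPPSSession -UserPrincipalName admin@contoso.com

# Retrieve the dictionary and list all of its properties
$dict = Get-DlpKeywordDictionary -Name "Diseases"
$dict | Format-List

# Show only the comma-separated keyword string
$dict.KeywordDictionary
```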

Before you modify the dictionary, you need to turn the string of terms back into an array using the .split(',') method. Then you'll clean up the unwanted spaces between the keywords with the .trim() method, leaving just the keywords to work with.

$terms = $dict.KeywordDictionary.split(',').trim()
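To see what the split-and-trim step does without touching a real dictionary, you can try it on a short stand-in string. The string below is a hypothetical sample, not actual cmdlet output.

```powershell
# Hypothetical stand-in for the string stored in $dict.KeywordDictionary
$keywordString = "aarskog's syndrome, abandonment, abasia"

# Split on commas, then trim whitespace from every element.
# Calling .Trim() on the array applies it to each string in turn.
$terms = $keywordString.Split(',').Trim()

$terms.Count   # number of keywords in the array
$terms[1]      # second keyword, with the leading space removed
```

Calling a method on the whole array like this relies on member enumeration, which is available in PowerShell 3.0 and later.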

Now you'll remove some terms from the dictionary. Because the example dictionary has only a few keywords, you could just as easily skip ahead to exporting the dictionary and editing it in Notepad. Dictionaries generally contain a large amount of text, though, so it's worth learning how to edit them in PowerShell first.

In the last step, you saved the keywords to an array. There are several ways to remove items from an array. As a straightforward approach, you'll create an array of the terms you want to remove, and then keep only the dictionary terms that aren't in that array.

Run the command $terms to show the current list of terms. The output of the command looks like this:

aarskog's syndrome
abandonment
abasia
abderhalden-kaufmann-lignac
abdominalgia
abduction contracture
abetalipoproteinemia
abiotrophy
ablatio
ablation
ablepharia
abocclusion
abolition
aborter
abortion
abortus
aboulomania
abrami's disease

Run this command to specify the terms that you want to remove:

$termsToRemove = @('abandonment','ablatio')

Run this command to actually remove the terms from the list:

$updatedTerms = $terms | Where-Object {$_ -notin $termsToRemove}

Run the command $updatedTerms to show the updated list of terms. The output of the command looks like this (the specified terms have been removed):

aarskog's syndrome
abasia
abderhalden-kaufmann-lignac
abdominalgia
abduction contracture
abetalipoproteinemia
abiotrophy
ablation
ablepharia
abocclusion
abolition
aborter
abortion
abortus
aboulomania
abrami's disease
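One detail worth knowing about the removal step is that -notin compares strings without regard to case, so a term is removed even if its capitalization differs from the entry in the removal list. The sketch below illustrates this on hypothetical sample data and contrasts it with the case-sensitive -cnotin operator.

```powershell
# Hypothetical sample data standing in for $terms
$terms = @("aarskog's syndrome", 'Abandonment', 'abasia', 'ablatio')
$termsToRemove = @('abandonment', 'ablatio')

# -notin ignores case, so 'Abandonment' is removed as well
$updatedTerms = @($terms | Where-Object { $_ -notin $termsToRemove })

# -cnotin is case-sensitive, so 'Abandonment' is kept
$caseSensitive = @($terms | Where-Object { $_ -cnotin $termsToRemove })

$updatedTerms.Count    # 2
$caseSensitive.Count   # 3
```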

Now save the dictionary locally and add a few more terms. You could add the terms right here in PowerShell, but you'll still need to export the file locally to ensure it's saved with Unicode encoding and contains the BOM.
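If you do prefer to add terms in PowerShell before exporting, a minimal sketch could look like this. Both the existing list and the new keywords are hypothetical sample data.

```powershell
# Hypothetical sample data standing in for $updatedTerms
$updatedTerms = @("aarskog's syndrome", 'abasia')

# Hypothetical keywords to add; 'abasia' is a deliberate duplicate
$termsToAdd = @('abscess', 'abasia')

# Append the new terms, then sort and drop duplicates
$updatedTerms = @(($updatedTerms + $termsToAdd) | Sort-Object -Unique)

$updatedTerms
```

Sort-Object -Unique is used here only to keep the list tidy. Whether duplicates matter to the service is not something the steps above state.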

Save the dictionary locally by running the following:

Set-Content -Path "C:\myPath\terms.txt" -Value $updatedTerms
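As an optional variation, Set-Content can write UTF-16 directly through its -Encoding parameter, and you can confirm that the byte order mark is present by reading the first two bytes of the file. The sketch below uses hypothetical sample data and a temp-folder path so that it runs anywhere. It is an illustration, not a replacement for checking the encoding of the file you actually upload.

```powershell
# Hypothetical sample data and a temp-folder path
$updatedTerms = @("aarskog's syndrome", 'abasia')
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'terms.txt'

# -Encoding Unicode writes UTF-16 LE with a byte order mark
Set-Content -Path $path -Value $updatedTerms -Encoding Unicode

# A UTF-16 LE byte order mark is the two bytes 0xFF 0xFE
$bytes = [System.IO.File]::ReadAllBytes($path)
$hasBom = ($bytes[0] -eq 0xFF) -and ($bytes[1] -eq 0xFE)
$hasBom
```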

Open the file, add your other terms, and save it with Unicode (UTF-16) encoding. Then upload the updated terms to update the dictionary in place:

Set-DlpKeywordDictionary -Identity "Diseases" -FileData ([System.IO.File]::ReadAllBytes('C:\myPath\terms.txt'))

The dictionary has now been updated in place. The -Identity parameter takes the name of the dictionary. To also rename the dictionary with the Set- cmdlet, add the -Name parameter with the new dictionary name to the command above.
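For illustration, a rename combined with the same upload might look like the following. It assumes an active Security & Compliance PowerShell session, and "Diseases (updated)" is a hypothetical new name.

```powershell
# Assumes an active Security & Compliance PowerShell session.
# "Diseases (updated)" is a hypothetical new name for the dictionary.
Set-DlpKeywordDictionary -Identity "Diseases" -Name "Diseases (updated)" -FileData ([System.IO.File]::ReadAllBytes('C:\myPath\terms.txt'))
```

After a rename, later commands would need to pass the new name to -Identity.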
