Native document support for Azure AI Language PII detection: controlling redacted categories

glen sale 41 Reputation points
2024-02-14T17:56:26.0933333+00:00

Hello, I was reading through this documentation:
https://learn.microsoft.com/en-us/azure/ai-services/language-service/native-document-support/use-native-documents?tabs=pii

Is there a way to control the list of words that need to be redacted? For example, a customer might ask that "office", or some other list of words, be redacted.

 curl -k -i -X POST <key> -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key:<key>" -d '
{
    "kind": "PiiEntityRecognition",
    "parameters": {
        "modelVersion": "latest"
    },
    "analysisInput":{
        "documents":[
            {
                "id":"1",
                "language": "en",
                "text": "Call our office at 312-555-1234, or send an email to support@contoso.com"
            }
        ]
    }
}
'

Azure AI Language

Accepted answer
  1. navba-MSFT 17,365 Reputation points Microsoft Employee
    2024-02-15T06:03:07.2366667+00:00

    @glen sale Welcome to the Microsoft Q&A forum, and thank you for posting your query here!

    Our standard (not customized) language service features are built on AI models that we call pre-trained or prebuilt models. We regularly update the language service with new model versions to improve model accuracy, support, and quality.

    Use this article to find the entity categories that can be returned by the PII detection feature.

    This feature runs a predictive model to identify, categorize, and redact sensitive information from an input document.

    Personally Identifiable Information (PII) detection supports the latest GA version: 2023-04-01.
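To make the idea of entity categories concrete, here is a minimal sketch of the request body from the question with an optional category filter added. It assumes the `piiCategories` task parameter described in the REST API spec for version 2023-04-01 (the version itself is normally passed as the `api-version` query parameter). `build_pii_request` is a hypothetical helper name used only for illustration, and the endpoint and key are deliberately left out.

```python
import json


def build_pii_request(text, categories=None, language="en", doc_id="1"):
    """Build a PiiEntityRecognition request body (hypothetical helper).

    `categories` is an optional list of PII category names such as
    "PhoneNumber" or "Email". When it is omitted, the parameter is not
    sent and the service falls back to its default categories.
    """
    parameters = {"modelVersion": "latest"}
    if categories:
        # Assumption: the task parameter is named "piiCategories",
        # as in the 2023-04-01 REST spec.
        parameters["piiCategories"] = list(categories)
    return {
        "kind": "PiiEntityRecognition",
        "parameters": parameters,
        "analysisInput": {
            "documents": [{"id": doc_id, "language": language, "text": text}]
        },
    }


body = build_pii_request(
    "Call our office at 312-555-1234, or send an email to support@contoso.com",
    categories=["PhoneNumber", "Email"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the `-d` payload of the curl command in the question. Note that this only narrows which of the prebuilt categories are returned; it does not add new ones.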

    Looking at the Swagger definition in the Language REST API specs, I found the following supported PiiCategories enum values:

    "PiiCategories": {
          "description": "(Optional) describes the PII categories to return",
          "items": {
            "type": "string",
            "x-ms-enum": {
              "name": "PiiCategory",
              "modelAsString": true
            },
            "enum": [
              "ABARoutingNumber",
              "ARNationalIdentityNumber",
              "AUBankAccountNumber",
              "AUDriversLicenseNumber",
              "AUMedicalAccountNumber",
              "AUPassportNumber",
              "AUTaxFileNumber",
              "AUBusinessNumber",
              "AUCompanyNumber",
              "ATIdentityCard",
              "ATTaxIdentificationNumber",
              "ATValueAddedTaxNumber",
              "AzureDocumentDBAuthKey",
              "AzureIAASDatabaseConnectionAndSQLString",
              "AzureIoTConnectionString",
              "AzurePublishSettingPassword",
              "AzureRedisCacheString",
              "AzureSAS",
              "AzureServiceBusString",
              "AzureStorageAccountKey",
              "AzureStorageAccountGeneric",
              "BENationalNumber",
              "BENationalNumberV2",
              "BEValueAddedTaxNumber",
              "BRCPFNumber",
              "BRLegalEntityNumber",
              "BRNationalIDRG",
              "BGUniformCivilNumber",
              "CABankAccountNumber",
              "CADriversLicenseNumber",
              "CAHealthServiceNumber",
              "CAPassportNumber",
              "CAPersonalHealthIdentification",
              "CASocialInsuranceNumber",
              "CLIdentityCardNumber",
              "CNResidentIdentityCardNumber",
              "CreditCardNumber",
              "HRIdentityCardNumber",
              "HRNationalIDNumber",
              "HRPersonalIdentificationNumber",
              "HRPersonalIdentificationOIBNumberV2",
              "CYIdentityCard",
              "CYTaxIdentificationNumber",
              "CZPersonalIdentityNumber",
              "CZPersonalIdentityV2",
              "DKPersonalIdentificationNumber",
              "DKPersonalIdentificationV2",
              "DrugEnforcementAgencyNumber",
              "EEPersonalIdentificationCode",
              "EUDebitCardNumber",
              "EUDriversLicenseNumber",
              "EUGPSCoordinates",
              "EUNationalIdentificationNumber",
              "EUPassportNumber",
              "EUSocialSecurityNumber",
              "EUTaxIdentificationNumber",
              "FIEuropeanHealthNumber",
              "FINationalID",
              "FINationalIDV2",
              "FIPassportNumber",
              "FRDriversLicenseNumber",
              "FRHealthInsuranceNumber",
              "FRNationalID",
              "FRPassportNumber",
              "FRSocialSecurityNumber",
              "FRTaxIdentificationNumber",
              "FRValueAddedTaxNumber",
              "DEDriversLicenseNumber",
              "DEPassportNumber",
              "DEIdentityCardNumber",
              "DETaxIdentificationNumber",
              "DEValueAddedNumber",
              "GRNationalIDCard",
              "GRNationalIDV2",
              "GRTaxIdentificationNumber",
              "HKIdentityCardNumber",
              "HUValueAddedNumber",
              "HUPersonalIdentificationNumber",
              "HUTaxIdentificationNumber",
              "INPermanentAccount",
              "INUniqueIdentificationNumber",
              "IDIdentityCardNumber",
              "InternationalBankingAccountNumber",
              "IEPersonalPublicServiceNumber",
              "IEPersonalPublicServiceNumberV2",
              "ILBankAccountNumber",
              "ILNationalID",
              "ITDriversLicenseNumber",
              "ITFiscalCode",
              "ITValueAddedTaxNumber",
              "JPBankAccountNumber",
              "JPDriversLicenseNumber",
              "JPPassportNumber",
              "JPResidentRegistrationNumber",
              "JPSocialInsuranceNumber",
              "JPMyNumberCorporate",
              "JPMyNumberPersonal",
              "JPResidenceCardNumber",
              "LVPersonalCode",
              "LTPersonalCode",
              "LUNationalIdentificationNumberNatural",
              "LUNationalIdentificationNumberNonNatural",
              "MYIdentityCardNumber",
              "MTIdentityCardNumber",
              "MTTaxIDNumber",
              "NLCitizensServiceNumber",
              "NLCitizensServiceNumberV2",
              "NLTaxIdentificationNumber",
              "NLValueAddedTaxNumber",
              "NZBankAccountNumber",
              "NZDriversLicenseNumber",
              "NZInlandRevenueNumber",
              "NZMinistryOfHealthNumber",
              "NZSocialWelfareNumber",
              "NOIdentityNumber",
              "PHUnifiedMultiPurposeIDNumber",
              "PLIdentityCard",
              "PLNationalID",
              "PLNationalIDV2",
              "PLPassportNumber",
              "PLTaxIdentificationNumber",
              "PLREGONNumber",
              "PTCitizenCardNumber",
              "PTCitizenCardNumberV2",
              "PTTaxIdentificationNumber",
              "ROPersonalNumericalCode",
              "RUPassportNumberDomestic",
              "RUPassportNumberInternational",
              "SANationalID",
              "SGNationalRegistrationIdentityCardNumber",
              "SKPersonalNumber",
              "SITaxIdentificationNumber",
              "SIUniqueMasterCitizenNumber",
              "ZAIdentificationNumber",
              "KRResidentRegistrationNumber",
              "ESDNI",
              "ESSocialSecurityNumber",
              "ESTaxIdentificationNumber",
              "SQLServerConnectionString",
              "SENationalID",
              "SENationalIDV2",
              "SEPassportNumber",
              "SETaxIdentificationNumber",
              "SWIFTCode",
              "CHSocialSecurityNumber",
              "TWNationalID",
              "TWPassportNumber",
              "TWResidentCertificate",
              "THPopulationIdentificationCode",
              "TRNationalIdentificationNumber",
              "UKDriversLicenseNumber",
              "UKElectoralRollNumber",
              "UKNationalHealthNumber",
              "UKNationalInsuranceNumber",
              "UKUniqueTaxpayerNumber",
              "USUKPassportNumber",
              "USBankAccountNumber",
              "USDriversLicenseNumber",
              "USIndividualTaxpayerIdentification",
              "USSocialSecurityNumber",
              "UAPassportNumberDomestic",
              "UAPassportNumberInternational",
              "Organization",
              "Email",
              "URL",
              "Age",
              "PhoneNumber",
              "IPAddress",
              "Date",
              "Person",
              "Address",
              "All",
              "Default"
            ]
          }
    }

    As you can see, "office" is not supported: the enum lists entity categories that the prebuilt model recognizes, not individual words. You can check whether the category you need is among the enum values listed above and use it accordingly. It is not possible to customize this list of entities.

    As a workaround, to redact additional custom words or phrases, you might need to implement a post-processing step in your application after receiving the output from the PII detection API.

    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
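For the post-processing workaround, a minimal sketch in Python might look like the following. It is not part of the Language service or its SDK: `redact_custom_terms` is a hypothetical helper, and the sample string is a hand-written stand-in for the redacted text the service returns, not real API output.

```python
import re


def redact_custom_terms(text, terms, mask_char="*"):
    """Mask every whole-word occurrence of the given terms (hypothetical helper).

    Matching is case-insensitive, and each match is replaced by a run of
    mask_char of the same length, so the overall text length is unchanged.
    """
    # Try longer phrases first so "head office" wins over "office".
    ordered = sorted((t for t in terms if t), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: mask_char * len(m.group(0)), text)


# Hand-written stand-in for the service's redacted output (illustrative only).
redacted_by_service = (
    "Call our office at ************, or send an email to *******************"
)
print(redact_custom_terms(redacted_by_service, ["office"]))
```

Because each masked term keeps its original length, any offsets reported for the entities the service detected still line up with the final text. Whole-word matching means "offices" is left alone unless it is added to the term list explicitly.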

    ** Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.
