Issue while executing REST API for Azure Purview

Manuel Bustamante 42 Reputation points
2022-06-21T12:50:27.207+00:00

Hi,

We have an Azure Purview account and a Synapse workspace deployed in Azure.

I'm using Synapse to call a REST API (screenshot: 213432-image.png).

I managed to get a token using the recently created SPN, but the API call fails with the error "Internal server error".

What I have already checked:

  1. The SPN was recently created (as mentioned in this post https://learn.microsoft.com/en-us/answers/questions/861441/issue-while-executing-rest-api-for-azure-purview.html)
  2. I gave the SPN the following roles within Purview: Collection admin, Data source admin, Data curator, and Insight reader.
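
A further check along the same lines, as a minimal sketch: since a token is issued but the call still fails, it can help to decode the token payload and confirm which audience and application it was issued for. The helper name `jwt_claims` is hypothetical, and the claim names mentioned (`aud`, `appid`) are what Azure AD v1 tokens typically carry; treat them as assumptions to verify against your own token. The signature is not validated here, so this is for inspection only.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    # A JWT has three dot-separated segments: header.payload.signature
    payload_b64 = token.split(".")[1]
    # Segments are base64url-encoded without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

For example, `jwt_claims(access_token).get("aud")` should match the resource the token was requested for (`https://purview.azure.net` in this thread), assuming the token is a standard JWT.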

If it helps, I also used pyapacheatlas (https://github.com/wjohnson/pyapacheatlas/tree/master/samples) in another notebook in the same Synapse workspace to connect to Purview with the same SPN, and I didn't have any problems. I now have to use the REST API because I'm trying to get insights from Purview, which can only be done through the REST API.

Thank you in advance,

Kind regards,

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,402 questions
Microsoft Purview
A Microsoft data governance service that helps manage and govern on-premises, multicloud, and software-as-a-service data. Previously known as Azure Purview.
947 questions

2 answers

  1. Manuel Bustamante 42 Reputation points
    2022-06-22T17:05:20.767+00:00

    Hi @PRADEEPCHEEKATLA-MSFT ,

    Thank you for your answer.

    Both pieces of code use the same SPN.

    The version that uses pyapacheatlas works correctly. It retrieves all data assets that match a search criterion.

    from azure.identity import AzureCliCredential
    from pyapacheatlas.core import PurviewClient
    from pyapacheatlas.auth import ServicePrincipalAuthentication
    import json


    # Authenticate with the Azure CLI credential.
    # Note: this client is replaced by the SPN-based client created below.
    cred = AzureCliCredential()

    client = PurviewClient(
        account_name = "<purview account name>",
        authentication = cred
    )

    # Authenticate with the service principal (SPN).
    auth = ServicePrincipalAuthentication(
        tenant_id = "<tenant id>",
        client_id = "<client id>",
        client_secret = "<client secret>"
    )

    # Create a client to connect to your service.
    client = PurviewClient(
        account_name = "<purview account name>",
        authentication = auth
    )

    # Search the catalog for all entities and print each result.
    search = client.discovery.search_entities('name:*')

    for entity in search:
        print(json.dumps(entity, indent=1))
    

    The version that uses the REST API does not work. This version gets Purview insights about the top files found.

    import os  
    import requests  
    import json  
    import jmespath  
    import pandas as pd  
    from pprint import pprint  
      
    def azuread_auth(tenant_id: str, client_id: str, client_secret: str, resource_url: str):
        """
        Authenticates the service principal against the given resource URL and returns the OAuth access token.
        """
        url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
        # Pass the form fields as a dict so that requests URL-encodes them
        # (a client secret may contain characters such as '&' or '+').
        payload = {
            'grant_type': 'client_credentials',
            'client_id': client_id,
            'client_secret': client_secret,
            'resource': resource_url
        }
        response = requests.post(url, data=payload)
        response.raise_for_status()  # fail early if no token was issued
        return response.json()['access_token']
      
          
      
    # ==========  
      
    # Service Principal with "Purview Data Source Administrator" permissions on Purview  
    tenant_id = "<tenant id>"  
    client_id = "<client id>"  
    client_secret = "<client secret>"  
    resource_url = "https://purview.azure.net"  
    data_catalog_name = "<purview account name>"
      
      
      
    # Retrieve authentication objects  
    azuread_access_token = azuread_auth(tenant_id, client_id, client_secret, resource_url)  
      
    # ==========  
      
    url = f"https://{data_catalog_name}.guardian.purview.azure.com/reports/fileExtensions"  
      
    headers = {  
                'Authorization': f'Bearer {azuread_access_token}',  
                'Content-Type': 'application/json'  
                }  
      
    payload="""{  
                    "Query":{  
                        "StartTime":"2020-01-01T00:00:00.000Z",  
                        "EndTime":"2022-12-31T23:59:00.000Z",  
                        "takeTopCount":30,  
                        "assetTypes":[  
                        ]  
                    }  
                }  
            """  
      
    response = json.loads((requests.request("POST", url, headers=headers, data=payload)).text)  
      
    data = jmespath.search("fileExtensionDetails[].[fileExtension, assets, subscriptions, count]", response)  
    df = pd.DataFrame(data, columns=['fileExtension', 'assets', 'subscriptions', 'count'],dtype=float)  
      
    print(df)  # 'return' is only valid inside a function; print the result instead
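
The REST call above parses the response body as JSON without first checking the HTTP status, so a failure such as "Internal server error" can get hidden behind a JSON parsing exception. A minimal sketch of a more defensive step follows; the helper name `parse_api_response` is hypothetical and is not part of requests or any Purview SDK. It only takes the status code and body text, so it works with any HTTP client.

```python
import json
from typing import Any


def parse_api_response(status_code: int, body: str) -> Any:
    """Return the decoded JSON body, or raise with the status and raw body."""
    if status_code >= 400:
        # Keep the raw body in the message: services often explain the failure there.
        raise RuntimeError(f"HTTP {status_code}: {body[:500]}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"HTTP {status_code} but the body is not JSON: {body[:500]}"
        ) from exc
```

With requests, this could be used as `resp = requests.post(url, headers=headers, data=payload)` followed by `result = parse_api_response(resp.status_code, resp.text)`, so the status code and raw server message show up in the notebook output.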
    

  2. PRADEEPCHEEKATLA-MSFT 77,901 Reputation points Microsoft Employee
    2022-06-30T09:42:17.047+00:00

    Hello @Manuel Bustamante ,

    Here is the suggestion from the Product Team:

    The Insights APIs are not public yet. We plan to make them public in the coming months.

    Because the Insights APIs you are trying to call are not yet documented for public use, I wouldn't recommend using them until that documentation is available.

    Hope this helps. Please let us know if you have any further queries.
