Language service - Entity extraction for files stored in file storage

Harish A 50 Reputation points
2024-04-23T14:27:40.7766667+00:00

Hi

I am trying to use the Language service to extract entities.

I have seen the Python code samples, where the program takes the text content from which entities are extracted.

However, I have a dump of text files from which entities have to be extracted. Using the above code, I cannot keep sending the content of each file one by one to get a response.

My expectation is that I upload these files to file storage and provide the link to the folder of documents to the Python code.

This way, I can rely on the folder to pick up any new documents uploaded to file storage.

How can I achieve this? Can you help me with this?

I read through the following documentation, but did not find the required answer.

https://learn.microsoft.com/en-us/azure/ai-services/language-service/named-entity-recognition/overview

Regards

Harish Alwala

Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

1 answer

  1. Amira Bedhiafi 15,596 Reputation points
    2024-04-23T19:23:10.5866667+00:00

    You can use Azure AI Language along with Azure Functions or Azure Logic Apps to create a serverless pipeline. The idea is that files are processed automatically as soon as they are uploaded to storage.

    Here is the recipe:

    • Set up an Azure Blob Storage container where you upload your text files (consider it the storage system where all your files are maintained).
    • An Azure Function can be triggered whenever a new file is uploaded to that container.
    • Within the Azure Function, use Azure AI Language to perform entity extraction on each file's content. You will need the Azure SDK for the Language service, specifically its Named Entity Recognition (NER) feature.
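    To wire up the blob trigger in the second step (Python v1 programming model), the function's function.json declares the binding along these lines; the container name `documents` and the `AzureWebJobsStorage` connection setting are placeholders you would adjust to your own setup:

```json
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "blob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "documents/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```

    The `"name": "blob"` entry is what binds the uploaded file to the `blob` parameter of the Python function.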

    Try this example of an Azure Function in Python that calls the Azure AI Language service for entity extraction:

    
    import azure.functions as func
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    def main(blob: func.InputStream):
        # Create the Text Analytics client
        key = "your_language_service_key"
        endpoint = "your_language_service_endpoint"
        text_analytics_client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

        # Read the content of the uploaded blob
        document = blob.read().decode('utf-8')

        # Extract entities (recognize_entities is the synchronous batch call)
        result = text_analytics_client.recognize_entities([document])

        for doc in result:
            if not doc.is_error:
                for entity in doc.entities:
                    print(f"Entity: {entity.text}, Category: {entity.category}, Confidence Score: {entity.confidence_score}")
            else:
                print(f"Error: {doc.error}")

    Try and tell us :)