How to summarize sentiment for a long text with Text Analytics sentiment API?

Question

With the Azure Text Analytics sentiment API, we can label the sentiment of a piece of text. However, a long text that exceeds the 5,120-character limit has to be split into chunks. My question is: after getting the sentiment labels for the individual chunks, what's the best way to aggregate them into an overall sentiment label for the whole text?

Currently, I see two approaches: 1) take the average of the chunk-level sentiment; 2) take the average over every sentence. Which is the more appropriate way?

Thank you.

Answer

@Yu Zhu Version 3 of Sentiment Analysis can now detect sentiment at both the document level and the level of its individual sentences.
https://learn.microsoft.com/en-us/azure/cognitive-services/text-analytics/how-tos/text-analytics-how-to-sentiment-analysis?tabs=version-3
https://learn.microsoft.com/en-us/azure/cognitive-services/text-analytics/tutorials/tutorial-power-bi-key-phrases
You don't have to break the text into sentences yourself; the API does it for you. You can also ignore the sentence-level sentiment and use only the document-level one. See the example below:

Input: (you can pass in up to 1000 documents, i.e. comments, in a single API call)

{  
  "documents": [  
    {  
      "language": "en",  
      "id": "1",  
      "text": "This program is very helpful. I received a great response to my inquiry. I was disappointed with the response time however."  
    },  
     {  
      "language": "en",  
      "id": "2",  
      "text": "I really enjoyed the interaction with the agent."  
    }  
  ]  
}  
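
For a long text that exceeds the per-document limit, a request body of this shape could be built from chunks. The following is only a minimal sketch: `split_into_chunks` and `build_request` are hypothetical helper names, not part of any SDK, and it assumes that Python's `len()` is an acceptable approximation of how the service counts characters.

```python
MAX_CHARS = 5120  # per-document character limit mentioned in the question


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Greedily pack whitespace-separated words into chunks of at most max_chars."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit is hard-split.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def build_request(text, language="en"):
    """Build a request body in the shape shown above, one document per chunk."""
    return {
        "documents": [
            {"language": language, "id": str(i), "text": chunk}
            for i, chunk in enumerate(split_into_chunks(text), start=1)
        ]
    }
```

Splitting on whitespace keeps words intact but can still cut a sentence in two at a chunk boundary; splitting at sentence or paragraph boundaries would be gentler on the sentence-level scores if the text allows it.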
  
Output: (get document and sentence scores)  
  
{  
    "documents": [  
        {  
            "id": "1",  
            "sentiment": "mixed",  
            "documentScores": {  
                "positive": 0.66642224788665771,  
                "neutral": 0.0001706598413875,  
                "negative": 0.333407074213028  
            },  
            "sentences": [  
                {  
                    "sentiment": "positive",  
                    "sentenceScores": {  
                        "positive": 0.99993038177490234,  
                        "neutral": 2.86401773337E-05,  
                        "negative": 4.09220410802E-05  
                    },  
                    "offset": 0,  
                    "length": 29  
                },  
                {  
                    "sentiment": "positive",  
                    "sentenceScores": {  
                        "positive": 0.99930989742279053,  
                        "neutral": 0.0004785652272403,  
                        "negative": 0.0002115316892741  
                    },  
                    "offset": 30,  
                    "length": 42  
                },  
                {  
                    "sentiment": "negative",  
                    "sentenceScores": {  
                        "positive": 2.64735208475E-05,  
                        "neutral": 4.7741514209E-06,  
                        "negative": 0.9999687671661377  
                    },  
                    "offset": 73,  
                    "length": 50  
                }  
            ]  
        },  
        {  
            "id": "2",  
            "sentiment": "positive",  
            "documentScores": {  
                "positive": 0.99982720613479614,  
                "neutral": 9.10629023565E-05,  
                "negative": 8.17300679046E-05  
            },  
            "sentences": [  
                {  
                    "sentiment": "positive",  
                    "sentenceScores": {  
                        "positive": 0.99982720613479614,  
                        "neutral": 9.10629023565E-05,  
                        "negative": 8.17300679046E-05  
                    },  
                    "offset": 0,  
                    "length": 48  
                }  
            ]  
        }  
    ],  
    "errors": []  
}
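
If the long text still has to be sent as several chunks, one possible way to combine the results is a length-weighted average of the sentence scores across all chunks. This is a sketch under stated assumptions, not an official recommendation: `aggregate_sentiment` is a hypothetical helper, it expects the `sentences`, `sentenceScores` and `length` fields exactly as they appear in the output above, and weighting by sentence length is one choice among several.

```python
def aggregate_sentiment(chunk_results):
    """Combine per-chunk results into one label and one set of scores.

    chunk_results: list of entries from the response's "documents" array,
    all belonging to the same original text.
    """
    totals = {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    total_length = 0
    for doc in chunk_results:
        for sentence in doc["sentences"]:
            weight = sentence["length"]  # longer sentences count for more
            total_length += weight
            for label in totals:
                totals[label] += weight * sentence["sentenceScores"][label]
    if total_length == 0:
        raise ValueError("no sentences to aggregate")
    scores = {label: value / total_length for label, value in totals.items()}
    # Assumption: the overall label is simply the highest-scoring class.
    return max(scores, key=scores.get), scores
```

Called with the `documents` entries that belong to one text, this returns a label together with the averaged scores. It never returns "mixed"; if that label matters, a rule on top of the averaged scores (for example, when both positive and negative are high) would have to be added.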
