What is language detection in Azure AI Language?

02/12/2024

Language detection is one of the features offered by Azure AI Language, a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. Language detection can identify more than 100 languages in their primary script. For a select number of languages, it also offers script detection, which identifies the script a text is written in according to the ISO 15924 standard.

This documentation contains the following types of articles:

Quickstarts are getting-started instructions to guide you through making requests to the service.
How-to guides contain instructions for using the service in more specific or customized ways.

Language detection features

Language detection: Returns one predominant language for each document you submit, along with its ISO 639-1 code, a human-readable name, a confidence score, and a script name and script code according to the ISO 15924 standard.
Script detection: To distinguish between multiple scripts used to write certain languages, such as Kazakh, language detection returns a script name and script code according to the ISO 15924 standard.
Ambiguous content handling: To help disambiguate language based on the input, you can specify an ISO 3166-1 alpha-2 country/region code. For example, the word "communication" is common to both English and French. Specifying the origin of the text as France can help the language detection model determine the correct language.
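
The features above can be illustrated with a minimal sketch of how an application might build an input document with a country/region hint and read the fields described. The JSON field names used here (countryHint, detectedLanguage, iso6391Name, confidenceScore, script, scriptCode) are assumptions about the REST payload shape and should be checked against the API reference for the version you use. The helper names and the sample response are hypothetical and hand-written, not real service output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DetectedLanguage:
    document_id: str
    name: str                          # human-readable name, e.g. "French"
    iso6391_name: str                  # ISO 639-1 code, e.g. "fr"
    confidence_score: float
    script: Optional[str] = None       # ISO 15924 script name, if returned
    script_code: Optional[str] = None  # ISO 15924 script code, if returned


def make_document(doc_id: str, text: str, country_hint: Optional[str] = None) -> dict:
    """Build one input document; country_hint is an ISO 3166-1 alpha-2 code."""
    doc = {"id": doc_id, "text": text}
    if country_hint is not None:
        doc["countryHint"] = country_hint  # assumed field name
    return doc


def parse_results(response: dict) -> List[DetectedLanguage]:
    """Extract one predominant language per document from a response dict."""
    parsed = []
    for doc in response["results"]["documents"]:
        lang = doc["detectedLanguage"]
        parsed.append(DetectedLanguage(
            document_id=doc["id"],
            name=lang["name"],
            iso6391_name=lang["iso6391Name"],
            confidence_score=lang["confidenceScore"],
            script=lang.get("script"),          # absent when no script is reported
            script_code=lang.get("scriptCode"),
        ))
    return parsed


# Hand-written sample shaped like a response; the values are illustrative only.
sample_response = {
    "kind": "LanguageDetectionResults",
    "results": {
        "documents": [
            {"id": "1", "detectedLanguage": {
                "name": "French", "iso6391Name": "fr", "confidenceScore": 0.95}},
            {"id": "2", "detectedLanguage": {
                "name": "Kazakh", "iso6391Name": "kk", "confidenceScore": 0.9,
                "script": "Cyrillic", "scriptCode": "Cyrl"}},
        ],
        "errors": [],
    },
}

hinted = make_document("1", "communication", country_hint="fr")
languages = parse_results(sample_response)
for item in languages:
    print(item.document_id, item.name, item.iso6391_name, item.script_code)
```

Treating the script fields as optional keeps the parser usable for languages where no script information is returned.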

Typical workflow

To use this feature, you submit data for analysis and handle the API output in your application. Analysis is performed as-is, with no added customization to the model used on your data.

Create an Azure AI Language resource, which grants you access to the features offered by Azure AI Language. Creating the resource generates a key (a secret that works like a password) and an endpoint URL, which you use to authenticate API requests.
Create a request using either the REST API or a client library for C#, Java, JavaScript, or Python. You can also send asynchronous calls with a batch request to combine API requests for multiple features into a single call.
Send the request containing your text data. Your key and endpoint are used for authentication.
Stream or store the response locally.
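
The workflow above can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not a definitive client: the route (/language/:analyze-text), the API version string, the Ocp-Apim-Subscription-Key header, and the body layout are assumptions to verify against the REST API reference, and build_request and send are hypothetical helper names.

```python
import json
import urllib.request

API_VERSION = "2023-04-01"  # assumed version string; check the API reference


def build_request(endpoint: str, key: str, documents: list) -> urllib.request.Request:
    """Create (but do not send) a language detection request."""
    body = {
        "kind": "LanguageDetection",
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {"documents": documents},
    }
    url = f"{endpoint.rstrip('/')}/language/:analyze-text?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # the resource key authenticates the call
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Placeholder endpoint and key; replace them with the values from your resource.
request = build_request(
    "https://example.cognitiveservices.azure.com/",
    "fake-key",
    [{"id": "1", "text": "Bonjour tout le monde"}],
)
# response = send(request)  # requires a real endpoint and key
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call, and lets the application decide whether to stream or store the response.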

Get started with language detection

To use language detection, you submit raw unstructured text for analysis and handle the API output in your application. Analysis is performed as-is, with no additional customization to the model used on your data. There are three ways to use language detection:

Development option	Description
Language Studio	Language Studio is a web-based platform that lets you try language detection with text examples without an Azure account, and with your own data when you sign up. For more information, see the Language Studio website or the Language Studio quickstart.
REST API or Client library (Azure SDK)	Integrate language detection into your applications using the REST API, or the client library available in a variety of languages. For more information, see the language detection quickstart.
Docker container	Use the available Docker container to deploy this feature on-premises. Docker containers enable you to bring the service closer to your data for compliance, security, or other operational reasons.
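
For the client library option, the following is a minimal sketch that assumes the azure-ai-textanalytics package for Python is installed and that its TextAnalyticsClient.detect_language method is used. The helper names detect_with_sdk and summarize are hypothetical. The imports are placed inside the function so the summarizing helper can be exercised without the package or a network connection.

```python
from typing import List, Optional, Tuple


def summarize(results) -> List[Tuple[str, str, float]]:
    """Collect (name, ISO 639-1 code, confidence) for each successful result."""
    rows = []
    for result in results:
        if result.is_error:  # per-document errors are skipped in this sketch
            continue
        lang = result.primary_language
        rows.append((lang.name, lang.iso6391_name, lang.confidence_score))
    return rows


def detect_with_sdk(endpoint: str, key: str, texts: List[str],
                    country_hint: Optional[str] = None):
    """Call the service through the client library (assumes the package is installed)."""
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    results = client.detect_language(documents=texts, country_hint=country_hint)
    return summarize(results)


# Example call; requires a real endpoint and key from your Language resource:
# rows = detect_with_sdk("https://<your-resource>.cognitiveservices.azure.com/",
#                        "<your-key>", ["Bonjour tout le monde"], country_hint="fr")
```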

Responsible AI

An AI system includes not only the technology, but also the people who will use it, the people who will be affected by it, and the environment in which it's deployed. Read the transparency note for language detection to learn about responsible AI use and deployment in your systems.

Next steps

There are two ways to get started using the language detection feature:

Language Studio, which is a web-based platform that enables you to try several Azure AI Language features without needing to write code.
The quickstart article for instructions on making requests to the service using the REST API and client library SDK.