Get started with Document Translation

In this article, you'll learn to use Document Translation with HTTP REST API methods. Document Translation is a cloud-based feature of the Azure Translator service. The Document Translation API enables the translation of whole documents while preserving source document structure and text formatting.

Prerequisites

Note

  1. Generally, when you create a Cognitive Services resource in the Azure portal, you have the option to create a multi-service subscription key or a single-service subscription key. However, Document Translation is currently supported in the Translator (single-service) resource only, and is not included in the Cognitive Services (multi-service) resource.
  2. Document Translation is only available in the S1 Standard Service Plan (Pay-as-you-go) or in the D3 Volume Discount Plan. See Cognitive Services pricing—Translator.

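To get started, you'll need:

  • An active Azure account.
  • A Translator (single-service) resource on the S1 Standard or D3 Volume Discount plan.
  • An Azure blob storage account, where you'll create containers for your source and target files.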

Custom domain name and subscription key

Important

  • All API requests to the Document Translation service require a custom domain endpoint.
  • Don't use the endpoint shown on your Azure portal resource Keys and Endpoint page, or the global Translator endpoint (api.cognitive.microsofttranslator.com), to make HTTP requests to Document Translation.

What is the custom domain endpoint?

The custom domain endpoint is a URL formatted with your resource name, hostname, and Translator subdirectories:

https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0
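As a minimal sketch, the endpoint can be assembled from the resource name in code. The class and method names here (EndpointBuilder, Build) are illustrative only and aren't part of any SDK.

```csharp
using System;

static class EndpointBuilder
{
    // Path shared by the Document Translation v1.0 requests in this article.
    private const string BatchPath = "/translator/text/batch/v1.0";

    // Builds the custom domain endpoint from the Translator resource name.
    public static string Build(string resourceName)
    {
        if (string.IsNullOrWhiteSpace(resourceName))
            throw new ArgumentException("Resource name is required.", nameof(resourceName));

        return "https://" + resourceName.Trim() + ".cognitiveservices.azure.com" + BatchPath;
    }
}
```

For example, EndpointBuilder.Build("my-translator") returns the endpoint for a resource named my-translator.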

Find your custom domain name

The NAME-OF-YOUR-RESOURCE (also called custom domain name) parameter is the value that you entered in the Name field when you created your Translator resource.

Image of the Azure portal, create resource, instant details, name field.

Get your subscription key

Requests to the Translator service require a read-only key for authenticating access.

  1. If you've created a new resource, after it deploys, select Go to resource. If you have an existing Document Translation resource, navigate directly to your resource page.
  2. In the left rail, under Resource Management, select Keys and Endpoint.
  3. Copy your subscription key and paste it into a convenient location, such as Microsoft Notepad.
  4. You'll paste it into the code below to authenticate your request to the Document Translation service.

Image of the get your subscription key field in Azure portal.

Create Azure blob storage containers

You'll need to create containers in your Azure blob storage account for source and target files.

  • Source container. This container is where you upload your files for translation (required).
  • Target container. This container is where your translated files will be stored (required).

Note

Document Translation supports glossaries as blobs in target containers (not separate glossary containers). If you want to include a custom glossary, add it to the target container and include the glossaryUrl with the request. If the translation language pair is not present in the glossary, it will not be applied. See Translate documents using a custom glossary.

Create SAS access tokens for Document Translation

The sourceUrl, targetUrl, and optional glossaryUrl must include a Shared Access Signature (SAS) token, appended as a query string. The token can be assigned to your container or specific blobs. See Create SAS tokens for Document Translation process.

  • Your source container or blob must have designated read and list access.
  • Your target container or blob must have designated write and list access.
  • Your glossary blob must have designated read and list access.
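Before submitting a request, it can help to confirm that each URL advertises the access listed above. The sketch below reads the sp (signed permissions) field from a SAS query string, where r, w, and l stand for read, write, and list, as in the sample URLs in this article. The class and method names are hypothetical, and the check only inspects what the URL claims; it doesn't validate the signature.

```csharp
using System;
using System.Linq;

static class SasPermissionCheck
{
    // Returns the value of the "sp" query parameter, or an empty string if it is absent.
    public static string GetPermissions(string sasUrl)
    {
        string query = new Uri(sasUrl).Query.TrimStart('?');
        foreach (string pair in query.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            string[] parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length == 2 && parts[0] == "sp")
                return Uri.UnescapeDataString(parts[1]);
        }
        return string.Empty;
    }

    // True when every required permission letter appears in the token's "sp" value.
    public static bool HasPermissions(string sasUrl, string required)
    {
        string granted = GetPermissions(sasUrl);
        return required.All(c => granted.IndexOf(c) >= 0);
    }
}
```

A source URL would be checked with HasPermissions(sourceUrl, "rl") and a target URL with HasPermissions(targetUrl, "wl").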

Tip

  • If you're translating multiple files (blobs) in an operation, delegate SAS access at the container level.
  • If you're translating a single file (blob) in an operation, delegate SAS access at the blob level.

Document Translation: HTTP requests

A batch Document Translation request is submitted to your Translator service endpoint via a POST request. If successful, the POST method returns a 202 Accepted response code and the batch request is created by the service.

HTTP headers

The following headers are included with each Document Translation API request:

HTTP header | Description
Ocp-Apim-Subscription-Key | Required: The value is the Azure subscription key for your Translator resource.
Content-Type | Required: Specifies the content type of the payload. The accepted value is application/json, optionally with a charset parameter (application/json; charset=UTF-8).
Content-Length | Required: The length of the request body.

POST request body properties

  • The POST request URL is https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0/batches
  • The POST request body is a JSON object that contains an array named inputs.
  • Each item in the inputs array contains the sourceUrl and targetUrl container addresses for your source and target language pairs.
  • The prefix and suffix fields (optional) are used to filter documents in the container including folders.
  • A value for the glossaries field (optional) is applied when the document is being translated.
  • The targetUrl for each target language must be unique.

Note

If a file with the same name already exists in the destination, it will be overwritten.

Translate all documents in a container

{
    "inputs": [
        {
            "source": {
                "sourceUrl": "https://my.blob.core.windows.net/source-en?sv=2019-12-12&st=2021-03-05T17%3A45%3A25Z&se=2021-03-13T17%3A45%3A00Z&sr=c&sp=rl&sig=SDRPMjE4nfrH3csmKLILkT%2Fv3e0Q6SWpssuuQl1NmfM%3D"
            },
            "targets": [
                {
                    "targetUrl": "https://my.blob.core.windows.net/target-fr?sv=2019-12-12&st=2021-03-05T17%3A49%3A02Z&se=2021-03-13T17%3A49%3A00Z&sr=c&sp=wdl&sig=Sq%2BYdNbhgbq4hLT0o1UUOsTnQJFU590sWYo4BOhhQhs%3D",
                    "language": "fr"
                }
            ]
        }
    ]
}
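The same request body can be assembled from C# objects instead of a hand-escaped string. This is a sketch only: it assumes .NET Core 3.0 or later (for System.Text.Json), and BatchRequestBody and Build are hypothetical names.

```csharp
using System.Text.Json;

static class BatchRequestBody
{
    // Serializes a request that translates every document in the source
    // container into a single target language.
    public static string Build(string sourceUrl, string targetUrl, string language)
    {
        var body = new
        {
            inputs = new[]
            {
                new
                {
                    source = new { sourceUrl },
                    targets = new[]
                    {
                        new { targetUrl, language }
                    }
                }
            }
        };

        // Property names are written as declared, so they match the JSON field names.
        return JsonSerializer.Serialize(body);
    }
}
```

By default the serializer writes characters such as & as \u0026, which is still valid JSON and decodes back to the original SAS URL.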

Translate a specific document in a container

  • Ensure you have specified "storageType": "File".
  • Ensure you have created the source URL and SAS token for the specific blob/document (not for the container).
  • Ensure you have specified the target filename as part of the target URL, though the SAS token is still for the container.
  • The sample request below shows a single document translated into two target languages.

{
    "inputs": [
        {
            "storageType": "File",
            "source": {
                "sourceUrl": "https://my.blob.core.windows.net/source-en/source-english.docx?sv=2019-12-12&st=2021-01-26T18%3A30%3A20Z&se=2021-02-05T18%3A30%3A00Z&sr=c&sp=rl&sig=d7PZKyQsIeE6xb%2B1M4Yb56I%2FEEKoNIF65D%2Fs0IFsYcE%3D"
            },
            "targets": [
                {
                    "targetUrl": "https://my.blob.core.windows.net/target/try/Target-Spanish.docx?sv=2019-12-12&st=2021-01-26T18%3A31%3A11Z&se=2021-02-05T18%3A31%3A00Z&sr=c&sp=wl&sig=AgddSzXLXwHKpGHr7wALt2DGQJHCzNFF%2F3L94JHAWZM%3D",
                    "language": "es"
                },
                {
                    "targetUrl": "https://my.blob.core.windows.net/target/try/Target-German.docx?sv=2019-12-12&st=2021-01-26T18%3A31%3A11Z&se=2021-02-05T18%3A31%3A00Z&sr=c&sp=wl&sig=AgddSzXLXwHKpGHr7wALt2DGQJHCzNFF%2F3L94JHAWZM%3D",
                    "language": "de"
                }
            ]
        }
    ]
}

Translate documents using a custom glossary

{
    "inputs": [
        {
            "source": {
                "sourceUrl": "https://myblob.blob.core.windows.net/source",
                "filter": {
                    "prefix": "myfolder/"
                }
            },
            "targets": [
                {
                    "targetUrl": "https://myblob.blob.core.windows.net/target",
                    "language": "es",
                    "glossaries": [
                        {
                            "glossaryUrl": "https://myblob.blob.core.windows.net/glossary/en-es.xlf",
                            "format": "xliff"
                        }
                    ]
                }
            ]
        }
    ]
}

Use code to submit Document Translation requests

Set up your coding platform

  • Create a new project.
  • Replace Program.cs with the C# code shown below.
  • Set your endpoint, subscription key, and container URL values in Program.cs.
  • To process JSON data, add the Newtonsoft.Json package using the .NET CLI.
  • Run the program from the project directory.

Important

For the code samples below, you'll hard-code your key and endpoint where indicated; remember to remove the key from your code when you're done, and never post it publicly. See Azure Cognitive Services security for ways to securely store and access your credentials.

You may need to update the following fields, depending upon the operation:

  • endpoint
  • subscriptionKey
  • sourceUrl
  • targetUrl
  • glossaryUrl
  • id (job ID)

Locating the id value

  • You'll find the job id in the Operation-Location URL value of the POST method response header. The last segment of the URL is the operation's job id:

Response header | Result URL
Operation-Location | https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0/batches/9dce0aa9-78dc-41ba-8cae-2e2f3c2ff8ec

  • You can also use a GET Jobs request to retrieve a Document Translation job id.
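Extracting the job id from the Operation-Location value can be sketched as follows. The helper name OperationLocation.GetJobId is illustrative, and the code assumes the id is always the final path segment, as in the example URL above.

```csharp
using System;

static class OperationLocation
{
    // Returns the last path segment of the Operation-Location URL, which is the job id.
    public static string GetJobId(string operationLocation)
    {
        Uri uri = new Uri(operationLocation);
        string[] segments = uri.AbsolutePath.TrimEnd('/').Split('/');
        return segments[segments.Length - 1];
    }
}
```

In the POST sample, the header value could be read with response.Headers.TryGetValues("Operation-Location", out var values) and the first value passed to GetJobId.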

Translate documents


    using System;
    using System.Net.Http;
    using System.Threading.Tasks;
    using System.Text;


    class Program
    {

        static readonly string route = "/batches";

        private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";

        private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

        static readonly string json = ("{\"inputs\": [{\"source\": {\"sourceUrl\": \"https://YOUR-SOURCE-URL-WITH-READ-LIST-ACCESS-SAS\",\"storageSource\": \"AzureBlob\",\"language\": \"en\",\"filter\":{\"prefix\": \"Demo_1/\"} }, \"targets\": [{\"targetUrl\": \"https://YOUR-TARGET-URL-WITH-WRITE-LIST-ACCESS-SAS\",\"storageSource\": \"AzureBlob\",\"category\": \"general\",\"language\": \"es\"}]}]}");

        static async Task Main(string[] args)
        {
            using HttpClient client = new HttpClient();
            using HttpRequestMessage request = new HttpRequestMessage();

            // Build the POST request with the subscription key header and JSON body.
            request.Method = HttpMethod.Post;
            request.RequestUri = new Uri(endpoint + route);
            request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
            request.Content = new StringContent(json, Encoding.UTF8, "application/json");

            // Send the request and read the response body without blocking the thread.
            HttpResponseMessage response = await client.SendAsync(request);
            string result = await response.Content.ReadAsStringAsync();

            if (response.IsSuccessStatusCode)
            {
                // A successful request returns 202 Accepted; the job URL is in the Operation-Location header.
                Console.WriteLine($"Status code: {response.StatusCode}");
                Console.WriteLine();
                Console.WriteLine("Response Headers:");
                Console.WriteLine(response.Headers);
            }
            else
            {
                // Print the status code and response body to help diagnose the failure.
                Console.WriteLine($"Error: {(int)response.StatusCode} {response.StatusCode}");
                Console.WriteLine(result);
            }
        }

    }

Get file formats

Retrieve a list of supported file formats. If successful, this method returns a 200 OK response code.


using System;
using System.Net.Http;
using System.Threading.Tasks;


class Program
{


    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";

    static readonly string route = "/documents/formats";

    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Get;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Send the request and read the response body without blocking the thread.
        HttpResponseMessage response = await client.SendAsync(request);
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}

Get job status

Get the current status for a single job and a summary of all jobs in a Document Translation request. If successful, this method returns a 200 OK response code.


using System;
using System.Net.Http;
using System.Threading.Tasks;


class Program
{


    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";

    // Replace {id} with the job id returned in the Operation-Location header.
    static readonly string route = "/batches/{id}";

    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Get;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Send the request and read the response body without blocking the thread.
        HttpResponseMessage response = await client.SendAsync(request);
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}

Get document status

Brief overview

Retrieve the status of a specific document in a Document Translation request. If successful, this method returns a 200 OK response code.


using System;
using System.Net.Http;
using System.Threading.Tasks;


class Program
{


    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";

    // Replace {id} with the job id and {documentId} with the id of the document.
    static readonly string route = "/batches/{id}/documents/{documentId}";

    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Get;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Send the request and read the response body without blocking the thread.
        HttpResponseMessage response = await client.SendAsync(request);
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}

Delete job

Brief overview

Cancel a job that is currently processing or queued. Only documents for which translation hasn't started will be canceled.


using System;
using System.Net.Http;
using System.Threading.Tasks;


class Program
{


    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";

    // Replace {id} with the id of the job to cancel.
    static readonly string route = "/batches/{id}";

    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Delete;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Send the request and read the response body without blocking the thread.
        HttpResponseMessage response = await client.SendAsync(request);
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}

Content limits

The table below lists the limits for data that you send to Document Translation.

Attribute | Limit
Document size | ≤ 40 MB
Total number of files | ≤ 1000
Total content size in a batch | ≤ 250 MB
Number of target languages in a batch | ≤ 10
Size of Translation memory file | ≤ 10 MB

Document Translation can't be used to translate secured documents, such as those that are password-protected or that restrict copying of content.
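A batch can be checked against these limits before it is submitted. The following is a rough sketch under two assumptions: 1 MB is treated as 1024 × 1024 bytes, and the file sizes are already known to the caller. ContentLimits and Check are hypothetical names, and the service remains the authority on what it accepts.

```csharp
using System.Collections.Generic;
using System.Linq;

static class ContentLimits
{
    // Limits taken from the table above; the byte conversion is an assumption.
    const long MaxDocumentBytes = 40L * 1024 * 1024;
    const long MaxBatchBytes = 250L * 1024 * 1024;
    const int MaxFiles = 1000;
    const int MaxTargetLanguages = 10;

    // Returns a list of violations; an empty list means the batch is within the limits.
    public static List<string> Check(IReadOnlyList<long> fileSizes, int targetLanguageCount)
    {
        var problems = new List<string>();

        if (fileSizes.Count > MaxFiles)
            problems.Add("Too many files: " + fileSizes.Count + " > " + MaxFiles);
        if (fileSizes.Any(size => size > MaxDocumentBytes))
            problems.Add("At least one document exceeds 40 MB");
        if (fileSizes.Sum() > MaxBatchBytes)
            problems.Add("Total batch content exceeds 250 MB");
        if (targetLanguageCount > MaxTargetLanguages)
            problems.Add("Too many target languages: " + targetLanguageCount + " > " + MaxTargetLanguages);

        return problems;
    }
}
```

Calling ContentLimits.Check with the sizes of the blobs in the source container and the number of targets in the request gives a quick pre-flight report.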

Troubleshooting

Common HTTP status codes

HTTP status code | Description | Possible reason
200 | OK | The request was successful.
400 | Bad Request | A required parameter is missing, empty, or null. Or, the value passed to either a required or optional parameter is invalid. A common issue is a header that is too long.
401 | Unauthorized | The request is not authorized. Check to make sure your subscription key or token is valid and in the correct region. When managing your subscription in the Azure portal, ensure you're using the Translator single-service resource, not the Cognitive Services multi-service resource.
429 | Too Many Requests | You have exceeded the quota or rate of requests allowed for your subscription.
502 | Bad Gateway | Network or server-side issue. May also indicate invalid headers.
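In the code samples, a failed request could print a hint drawn from this table instead of a bare error. The mapping below is only a sketch: StatusCodeHints is a hypothetical helper, and it covers just the codes mentioned in this article (including the 202 Accepted returned by a successful POST).

```csharp
static class StatusCodeHints
{
    // Maps the status codes listed in this article to a short troubleshooting hint.
    public static string Describe(int statusCode)
    {
        switch (statusCode)
        {
            case 200: return "OK: the request was successful.";
            case 202: return "Accepted: the batch request was created.";
            case 400: return "Bad Request: check for missing, empty, or invalid parameters, or a header that is too long.";
            case 401: return "Unauthorized: check the subscription key, its region, and that the resource is a Translator single-service resource.";
            case 429: return "Too Many Requests: the quota or request rate for the subscription was exceeded.";
            case 502: return "Bad Gateway: network or server-side issue; may also indicate invalid headers.";
            default: return "Unlisted status code: inspect the response body for details.";
        }
    }
}
```

For example, Console.WriteLine(StatusCodeHints.Describe((int)response.StatusCode)) could replace the generic error message in the POST sample.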

Learn more

Next steps