Get started with Document Translation
In this article, you'll learn to use Document Translation with HTTP REST API methods. Document Translation is a cloud-based feature of the Azure Translator service. The Document Translation API enables the translation of whole documents while preserving source document structure and text formatting.
Prerequisites
Note
- Generally, when you create a Cognitive Services resource in the Azure portal, you have the option to create a multi-service subscription key or a single-service subscription key. However, Document Translation is currently supported in the Translator (single-service) resource only, and is not included in the Cognitive Services (multi-service) resource.
- Document Translation is only available in the S1 Standard Service Plan (Pay-as-you-go) or in the D3 Volume Discount Plan. See Cognitive Services pricing—Translator.
To get started, you'll need:
An active Azure account. If you don't have one, you can create a free account.
A single-service Translator resource (not a multi-service Cognitive Services resource).
An Azure blob storage account. You will create containers to store and organize your blob data within your storage account.
Custom domain name and subscription key
Important
- All API requests to the Document Translation service require a custom domain endpoint.
- You won't use the endpoint found on your Azure portal resource Keys and Endpoint page, nor the global translator endpoint (api.cognitive.microsofttranslator.com), to make HTTP requests to Document Translation.
What is the custom domain endpoint?
The custom domain endpoint is a URL formatted with your resource name, hostname, and Translator subdirectories:
https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0
Find your custom domain name
The NAME-OF-YOUR-RESOURCE (also called custom domain name) parameter is the value that you entered in the Name field when you created your Translator resource.
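The endpoint format described above can be assembled from the resource name in code. The following is a minimal sketch, not an official helper: the class name `DocumentTranslationEndpoint`, the method `Build`, and the resource name `my-translator` are hypothetical and used only for illustration.

```csharp
using System;

static class DocumentTranslationEndpoint
{
    // Path portion taken from the endpoint format shown in this article.
    private const string BatchPath = "/translator/text/batch/v1.0";

    // Hypothetical helper: builds the custom domain endpoint from a resource name.
    public static Uri Build(string resourceName)
    {
        if (string.IsNullOrWhiteSpace(resourceName))
            throw new ArgumentException("Resource name is required.", nameof(resourceName));

        return new Uri($"https://{resourceName.Trim()}.cognitiveservices.azure.com{BatchPath}");
    }
}

class Program
{
    static void Main()
    {
        // "my-translator" is a placeholder; use the Name you entered when creating the resource.
        Console.WriteLine(DocumentTranslationEndpoint.Build("my-translator"));
    }
}
```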
Get your subscription key
Requests to the Translator service require a read-only key for authenticating access.
- If you've created a new resource, after it deploys, select Go to resource. If you have an existing Document Translation resource, navigate directly to your resource page.
- In the left rail, under Resource Management, select Keys and Endpoint.
- Copy and paste your subscription key in a convenient location, such as Microsoft Notepad.
- You'll paste it into the code below to authenticate your request to the Document Translation service.
Create Azure blob storage containers
You'll need to create containers in your Azure blob storage account for source and target files.
- Source container. This container is where you upload your files for translation (required).
- Target container. This container is where your translated files will be stored (required).
Note
Document Translation supports glossaries as blobs in target containers (not separate glossary containers). If you want to include a custom glossary, add it to the target container and include the glossaryUrl with the request. If the translation language pair is not present in the glossary, it will not be applied. See Translate documents using a custom glossary.
Create SAS access tokens for Document Translation
The sourceUrl, targetUrl, and optional glossaryUrl must include a Shared Access Signature (SAS) token, appended as a query string. The token can be assigned to your container or specific blobs. See Create SAS tokens for Document Translation process.
- Your source container or blob must have designated read and list access.
- Your target container or blob must have designated write and list access.
- Your glossary blob must have designated read and list access.
Tip
- If you're translating multiple files (blobs) in an operation, delegate SAS access at the container level.
- If you're translating a single file (blob) in an operation, delegate SAS access at the blob level.
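Before submitting a request, it can help to confirm that each SAS URL carries the permissions listed above. The sketch below reads the `sp` (signed permissions) query parameter of a SAS URL and checks for the required letters (`r` read, `w` write, `l` list). The class `SasPermissionCheck`, its methods, and the sample URL are hypothetical; the check only inspects the URL text and doesn't validate the signature or expiry.

```csharp
using System;
using System.Linq;

static class SasPermissionCheck
{
    // Returns the value of the "sp" (signed permissions) query parameter, or "" if absent.
    public static string GetPermissions(string sasUrl)
    {
        string query = new Uri(sasUrl).Query.TrimStart('?');
        foreach (string pair in query.Split('&', StringSplitOptions.RemoveEmptyEntries))
        {
            string[] parts = pair.Split('=', 2);
            if (parts.Length == 2 && parts[0] == "sp")
                return Uri.UnescapeDataString(parts[1]);
        }
        return string.Empty;
    }

    // True when every required permission letter appears in the token's "sp" value.
    public static bool HasPermissions(string sasUrl, string required)
    {
        string granted = GetPermissions(sasUrl);
        return required.All(letter => granted.Contains(letter));
    }
}

class Program
{
    static void Main()
    {
        // Placeholder URL: the account, container, and signature are not real.
        string sourceUrl = "https://example.blob.core.windows.net/source?sv=2019-12-12&sr=c&sp=rl&sig=PLACEHOLDER";

        // Source needs read + list; target needs write + list.
        Console.WriteLine(SasPermissionCheck.HasPermissions(sourceUrl, "rl")); // True
        Console.WriteLine(SasPermissionCheck.HasPermissions(sourceUrl, "wl")); // False
    }
}
```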
Document Translation: HTTP requests
A batch Document Translation request is submitted to your Translator service endpoint via a POST request. If successful, the POST method returns a 202 Accepted response code and the batch request is created by the service.
HTTP headers
The following headers are included with each Document Translation API request:
| HTTP header | Description |
|---|---|
| Ocp-Apim-Subscription-Key | Required: The value is the Azure subscription key for your single-service Translator resource. |
| Content-Type | Required: Specifies the content type of the payload. The accepted value is application/json, optionally with a charset parameter, for example application/json; charset=UTF-8. |
| Content-Length | Required: The length of the request body. |
POST request body properties
- The POST request URL is https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0/batches
- The POST request body is a JSON object named inputs.
- The inputs object contains both sourceUrl and targetUrl container addresses for your source and target language pairs.
- The prefix and suffix fields (optional) are used to filter documents in the container, including folders.
- A value for the glossaries field (optional) is applied when the document is being translated.
- The targetUrl for each target language must be unique.
Note
If a file with the same name already exists in the destination, the job will fail.
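The request body described above can also be generated rather than hand-written, which avoids escaping mistakes in long SAS URLs. The sketch below uses System.Text.Json (built into .NET Core 3.0 and later) instead of the Newtonsoft.Json package mentioned later in this article; that substitution, the class `BatchRequestBody`, and the example URLs are assumptions made for illustration. It also enforces the rule that each target language needs a unique targetUrl.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

static class BatchRequestBody
{
    // Builds the "inputs" JSON for one source container and several targets.
    // targetsByLanguage maps a language code (for example "fr") to that language's target SAS URL.
    public static string Build(string sourceUrl, IDictionary<string, string> targetsByLanguage)
    {
        // The targetUrl for each target language must be unique.
        if (targetsByLanguage.Values.Distinct().Count() != targetsByLanguage.Count)
            throw new ArgumentException("Each target language needs its own targetUrl.");

        var body = new
        {
            inputs = new[]
            {
                new
                {
                    source = new { sourceUrl },
                    targets = targetsByLanguage
                        .Select(t => new { targetUrl = t.Value, language = t.Key })
                        .ToArray()
                }
            }
        };
        return JsonSerializer.Serialize(body);
    }
}

class Program
{
    static void Main()
    {
        // Placeholder URLs; real ones must include SAS tokens.
        var targets = new Dictionary<string, string>
        {
            ["fr"] = "https://example.blob.core.windows.net/target-fr",
            ["de"] = "https://example.blob.core.windows.net/target-de"
        };
        Console.WriteLine(BatchRequestBody.Build("https://example.blob.core.windows.net/source-en", targets));
    }
}
```

The serializer may escape characters such as `&` as `\u0026`; that is still valid JSON and decodes to the same URL.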
Translate all documents in a container
{
    "inputs": [
        {
            "source": {
                "sourceUrl": "https://my.blob.core.windows.net/source-en?sv=2019-12-12&st=2021-03-05T17%3A45%3A25Z&se=2021-03-13T17%3A45%3A00Z&sr=c&sp=rl&sig=SDRPMjE4nfrH3csmKLILkT%2Fv3e0Q6SWpssuuQl1NmfM%3D"
            },
            "targets": [
                {
                    "targetUrl": "https://my.blob.core.windows.net/target-fr?sv=2019-12-12&st=2021-03-05T17%3A49%3A02Z&se=2021-03-13T17%3A49%3A00Z&sr=c&sp=wdl&sig=Sq%2BYdNbhgbq4hLT0o1UUOsTnQJFU590sWYo4BOhhQhs%3D",
                    "language": "fr"
                }
            ]
        }
    ]
}
Translate a specific document in a container
- Ensure you have specified "storageType": "File".
- Ensure you have created the source URL and SAS token for the specific blob/document (not for the container).
- Ensure you have specified the target filename as part of the target URL, though the SAS token is still for the container.
- The following sample request translates a single document into two target languages.
{
    "inputs": [
        {
            "storageType": "File",
            "source": {
                "sourceUrl": "https://my.blob.core.windows.net/source-en/source-english.docx?sv=2019-12-12&st=2021-01-26T18%3A30%3A20Z&se=2021-02-05T18%3A30%3A00Z&sr=c&sp=rl&sig=d7PZKyQsIeE6xb%2B1M4Yb56I%2FEEKoNIF65D%2Fs0IFsYcE%3D"
            },
            "targets": [
                {
                    "targetUrl": "https://my.blob.core.windows.net/target/try/Target-Spanish.docx?sv=2019-12-12&st=2021-01-26T18%3A31%3A11Z&se=2021-02-05T18%3A31%3A00Z&sr=c&sp=wl&sig=AgddSzXLXwHKpGHr7wALt2DGQJHCzNFF%2F3L94JHAWZM%3D",
                    "language": "es"
                },
                {
                    "targetUrl": "https://my.blob.core.windows.net/target/try/Target-German.docx?sv=2019-12-12&st=2021-01-26T18%3A31%3A11Z&se=2021-02-05T18%3A31%3A00Z&sr=c&sp=wl&sig=AgddSzXLXwHKpGHr7wALt2DGQJHCzNFF%2F3L94JHAWZM%3D",
                    "language": "de"
                }
            ]
        }
    ]
}
Translate documents using a custom glossary
{
    "inputs": [
        {
            "source": {
                "sourceUrl": "https://myblob.blob.core.windows.net/source",
                "filter": {
                    "prefix": "myfolder/"
                }
            },
            "targets": [
                {
                    "targetUrl": "https://myblob.blob.core.windows.net/target",
                    "language": "es",
                    "glossaries": [
                        {
                            "glossaryUrl": "https://myblob.blob.core.windows.net/glossary/en-es.xlf",
                            "format": "xliff"
                        }
                    ]
                }
            ]
        }
    ]
}
Use code to submit Document Translation requests
Set up your coding platform
- Create a new project.
- Replace Program.cs with the C# code shown below.
- Set your endpoint, subscription key, and container URL values in Program.cs.
- To process JSON data, add the Newtonsoft.Json package using the .NET CLI.
- Run the program from the project directory.
Important
For the code samples below, you'll hard-code your key and endpoint where indicated; remember to remove the key from your code when you're done, and never post it publicly. See Azure Cognitive Services security for ways to securely store and access your credentials.
You may need to update the following fields, depending upon the operation:
- endpoint
- subscriptionKey
- sourceURL
- targetURL
- glossaryURL
- id (job ID)
Locating the id value
- You'll find the job id in the Operation-Location URL value of the POST method response header. The last parameter of the URL is the operation's job id:
| Response header | Result URL |
|---|---|
| Operation-Location | https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0/batches/9dce0aa9-78dc-41ba-8cae-2e2f3c2ff8ec |
- You can also use a GET Jobs request to retrieve a Document Translation job id.
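Because the job id is the last segment of the Operation-Location URL, it can be extracted programmatically. The sketch below is one way to do that; the class `OperationLocation`, the method `GetJobId`, and the resource name `my-translator` are hypothetical, and the id is the sample value from the table above.

```csharp
using System;

static class OperationLocation
{
    // Returns the last path segment of the Operation-Location URL, which is the job id.
    public static string GetJobId(string operationLocation)
    {
        var uri = new Uri(operationLocation);
        return uri.Segments[^1].Trim('/');
    }
}

class Program
{
    static void Main()
    {
        // Sample header value; "my-translator" stands in for your resource name.
        string header = "https://my-translator.cognitiveservices.azure.com/translator/text/batch/v1.0/batches/9dce0aa9-78dc-41ba-8cae-2e2f3c2ff8ec";
        Console.WriteLine(OperationLocation.GetJobId(header)); // 9dce0aa9-78dc-41ba-8cae-2e2f3c2ff8ec
    }
}
```

With an `HttpResponseMessage`, the header value can be read with `response.Headers.TryGetValues("Operation-Location", out var values)` and passed to the helper.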
Translate documents
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program
{
    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";
    private static readonly string route = "/batches";
    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    // Request body: one source container (filtered by prefix) and one Spanish target.
    private static readonly string json = ("{\"inputs\": [{\"source\": {\"sourceUrl\": \"https://YOUR-SOURCE-URL-WITH-READ-LIST-ACCESS-SAS\",\"storageSource\": \"AzureBlob\",\"language\": \"en\",\"filter\":{\"prefix\": \"Demo_1/\"} }, \"targets\": [{\"targetUrl\": \"https://YOUR-TARGET-URL-WITH-WRITE-LIST-ACCESS-SAS\",\"storageSource\": \"AzureBlob\",\"category\": \"general\",\"language\": \"es\"}]}]}");

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Post;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        request.Content = new StringContent(json, Encoding.UTF8, "application/json");

        HttpResponseMessage response = await client.SendAsync(request);

        if (response.IsSuccessStatusCode)
        {
            // The Operation-Location response header contains the job URL and id.
            Console.WriteLine($"Status code: {response.StatusCode}");
            Console.WriteLine();
            Console.WriteLine("Response Headers:");
            Console.WriteLine(response.Headers);
        }
        else
        {
            // Print the status code and error body to help diagnose the failure.
            string result = await response.Content.ReadAsStringAsync();
            Console.WriteLine($"Error: {response.StatusCode}");
            Console.WriteLine(result);
        }
    }
}
Get file formats
Retrieve a list of supported file formats. If successful, this method returns a 200 OK response code.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";
    private static readonly string route = "/documents/formats";
    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Get;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        HttpResponseMessage response = await client.SendAsync(request);
        // Await the body instead of blocking on .Result.
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}
Get job status
Get the current status for a single job and a summary of all jobs in a Document Translation request. If successful, this method returns a 200 OK response code.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";
    // Replace {id} with your job id.
    private static readonly string route = "/batches/{id}";
    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Get;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        HttpResponseMessage response = await client.SendAsync(request);
        // Await the body instead of blocking on .Result.
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}
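A job usually takes some time to finish, so the status request is typically repeated until the job reaches a final state. The following sketch shows one way to structure that loop. It rests on assumptions not stated in this article: that the response body is JSON with a top-level `status` string, and that `Succeeded`, `Failed`, `Cancelled`, and `ValidationFailed` are the final values; check both against the API reference. The class `JobStatusPolling` and the `fetchStatusJson` delegate are hypothetical; the delegate stands in for the GET request shown above.

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;

static class JobStatusPolling
{
    // Assumed final values of the "status" field; verify against the API reference.
    private static readonly string[] TerminalStatuses =
        { "Succeeded", "Failed", "Cancelled", "ValidationFailed" };

    // Reads the top-level "status" string, or "" when the property is missing.
    public static string ReadStatus(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        return doc.RootElement.TryGetProperty("status", out JsonElement status)
            ? status.GetString() ?? string.Empty
            : string.Empty;
    }

    public static bool IsTerminal(string status) =>
        Array.IndexOf(TerminalStatuses, status) >= 0;

    // fetchStatusJson performs the GET /batches/{id} request and returns the response body.
    public static async Task<string> WaitForCompletionAsync(
        Func<Task<string>> fetchStatusJson, TimeSpan interval, int maxAttempts)
    {
        string status = string.Empty;
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            status = ReadStatus(await fetchStatusJson());
            if (IsTerminal(status)) break;
            await Task.Delay(interval);
        }
        return status; // Last observed status; may be non-final if attempts ran out.
    }
}

class Program
{
    static async Task Main()
    {
        // Simulated responses instead of real HTTP calls, for illustration only.
        var responses = new[] { "{\"status\":\"Running\"}", "{\"status\":\"Succeeded\"}" };
        int call = 0;
        string final = await JobStatusPolling.WaitForCompletionAsync(
            () => Task.FromResult(responses[Math.Min(call++, responses.Length - 1)]),
            TimeSpan.FromMilliseconds(10), maxAttempts: 5);
        Console.WriteLine(final);
    }
}
```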
Get document status
Brief overview
Retrieve the status of a specific document in a Document Translation request. If successful, this method returns a 200 OK response code.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";
    // Replace {id} with your job id and {documentId} with the document's id.
    private static readonly string route = "/batches/{id}/documents/{documentId}";
    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Get;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        HttpResponseMessage response = await client.SendAsync(request);
        // Await the body instead of blocking on .Result.
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}
Delete job
Brief overview
Cancel a job that is currently processing or queued. Only documents for which translation hasn't started will be canceled.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    private static readonly string endpoint = "https://<NAME-OF-YOUR-RESOURCE>.cognitiveservices.azure.com/translator/text/batch/v1.0";
    // Replace {id} with the id of the job to cancel.
    private static readonly string route = "/batches/{id}";
    private static readonly string subscriptionKey = "<YOUR-SUBSCRIPTION-KEY>";

    static async Task Main(string[] args)
    {
        using HttpClient client = new HttpClient();
        using HttpRequestMessage request = new HttpRequestMessage();

        request.Method = HttpMethod.Delete;
        request.RequestUri = new Uri(endpoint + route);
        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        HttpResponseMessage response = await client.SendAsync(request);
        // Await the body instead of blocking on .Result.
        string result = await response.Content.ReadAsStringAsync();

        Console.WriteLine($"Status code: {response.StatusCode}");
        Console.WriteLine($"Response Headers: {response.Headers}");
        Console.WriteLine();
        Console.WriteLine(result);
    }
}
Content limits
The table below lists the limits for data that you send to Document Translation.
| Attribute | Limit |
|---|---|
| Document size | ≤ 40 MB |
| Total number of files | ≤ 1000 |
| Total content size in a batch | ≤ 250 MB |
| Number of target languages in a batch | ≤ 10 |
| Size of Translation memory file | ≤ 10 MB |
Document Translation cannot be used to translate secured documents, such as those that are password-protected or that restrict copying of content.
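The limits in the table can be checked locally before a batch is submitted. The sketch below validates a list of file sizes and a target language count against those limits. The class `ContentLimits` and the sample sizes are hypothetical, and the code assumes 1 MB means 1024 × 1024 bytes, which this article doesn't specify. The translation memory file limit isn't covered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ContentLimits
{
    // Limits from the content limits table; 1 MB is assumed to be 1024 * 1024 bytes.
    private const long Megabyte = 1024L * 1024L;
    private const long MaxDocumentBytes = 40 * Megabyte;
    private const long MaxBatchBytes = 250 * Megabyte;
    private const int MaxFiles = 1000;
    private const int MaxTargetLanguages = 10;

    // Returns a list of violated limits; an empty list means the batch is within limits.
    public static List<string> Validate(IReadOnlyList<long> fileSizesInBytes, int targetLanguageCount)
    {
        var problems = new List<string>();
        if (fileSizesInBytes.Count > MaxFiles)
            problems.Add($"Too many files: {fileSizesInBytes.Count} (limit {MaxFiles}).");
        if (fileSizesInBytes.Any(size => size > MaxDocumentBytes))
            problems.Add("At least one document is larger than 40 MB.");
        if (fileSizesInBytes.Sum() > MaxBatchBytes)
            problems.Add("Total batch content is larger than 250 MB.");
        if (targetLanguageCount > MaxTargetLanguages)
            problems.Add($"Too many target languages: {targetLanguageCount} (limit {MaxTargetLanguages}).");
        return problems;
    }
}

class Program
{
    static void Main()
    {
        // Hypothetical batch: files of 10 MB, 45 MB, and 5 MB, translated into 2 languages.
        long mb = 1024L * 1024L;
        List<string> problems = ContentLimits.Validate(new long[] { 10 * mb, 45 * mb, 5 * mb }, 2);
        foreach (string problem in problems)
            Console.WriteLine(problem); // Reports only the oversized 45 MB document.
    }
}
```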
Troubleshooting
Common HTTP status codes
| HTTP status code | Description | Possible reason |
|---|---|---|
| 200 | OK | The request was successful. |
| 400 | Bad Request | A required parameter is missing, empty, or null. Or, the value passed to either a required or optional parameter is invalid. A common issue is a header that is too long. |
| 401 | Unauthorized | The request is not authorized. Check to make sure your subscription key or token is valid and in the correct region. When managing your subscription on the Azure portal, ensure you're using the Translator single-service resource, not the Cognitive Services multi-service resource. |
| 429 | Too Many Requests | You have exceeded the quota or rate of requests allowed for your subscription. |
| 502 | Bad Gateway | Network or server-side issue. May also indicate invalid headers. |
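Of the status codes above, 429 and 502 can be temporary, so a client may choose to retry them after a delay, while 400 and 401 need a corrected request. The sketch below shows one possible retry wrapper with exponential backoff. The class `RetryPolicy`, the delay schedule, and the attempt count are illustrative choices, not service recommendations. Note that a 502 caused by invalid headers won't be fixed by retrying.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class RetryPolicy
{
    // Treats 429 (Too Many Requests) and 502 (Bad Gateway) as retryable.
    public static bool IsRetryable(HttpStatusCode code) =>
        (int)code == 429 || (int)code == 502;

    // Exponential backoff: 1 s, 2 s, 4 s, ... for attempt 0, 1, 2, ...
    public static TimeSpan Delay(int attempt) =>
        TimeSpan.FromSeconds(Math.Pow(2, attempt));

    // createRequest must return a new HttpRequestMessage each time,
    // because a request message can't be sent twice.
    public static async Task<HttpResponseMessage> SendWithRetryAsync(
        HttpClient client, Func<HttpRequestMessage> createRequest, int maxAttempts = 4)
    {
        for (int attempt = 0; ; attempt++)
        {
            using HttpRequestMessage request = createRequest();
            HttpResponseMessage response = await client.SendAsync(request);

            if (!IsRetryable(response.StatusCode) || attempt + 1 >= maxAttempts)
                return response;

            response.Dispose();
            await Task.Delay(Delay(attempt));
        }
    }
}

class Program
{
    static void Main()
    {
        // Show which of the documented status codes this sketch would retry.
        foreach (int code in new[] { 200, 400, 401, 429, 502 })
            Console.WriteLine($"{code}: retry = {RetryPolicy.IsRetryable((HttpStatusCode)code)}");
    }
}
```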