Using the Azure Translator Document Translation REST API, we are unable to translate any document type other than PDF.
Submitting any other file type (e.g. docx, pptx, xlsx) results in the error below:
{
  "id": "aacd3925-7229-432a-a068-0736803d63f6",
  "createdDateTimeUtc": "2022-04-20T15:50:21.5790466Z",
  "lastActionDateTimeUtc": "2022-04-20T15:50:27.4578543Z",
  "status": "Failed",
  "error": {
    "code": "InvalidRequest",
    "message": "Document failed during checking validity. This may be caused by corruption or unsupported type/extension.File contains corrupted data.",
    "target": "Operation",
    "innerError": {
      "code": "InvalidDocument",
      "message": "Document failed during checking validity. This may be caused by corruption or unsupported type/extension.File contains corrupted data."
    }
  },
  "summary": {
    "total": 1,
    "failed": 1,
    "success": 0,
    "inProgress": 0,
    "notYetStarted": 0,
    "cancelled": 0,
    "totalCharacterCharged": 0
  }
}
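Since the message mentions corrupted data, one thing that can be ruled out locally is whether the bytes stored in the source container are still a valid Office package. The sketch below is only a diagnostic under stated assumptions: docx, pptx and xlsx files are ZIP packages that contain a `[Content_Types].xml` entry, and the helper name `looks_like_ooxml` is hypothetical. It says nothing about how the service itself validates documents.

```python
import io
import zipfile


def looks_like_ooxml(data: bytes) -> bool:
    """Return True if data is a readable ZIP package with a [Content_Types].xml entry."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return False
    try:
        with zipfile.ZipFile(buffer) as package:
            # testzip() returns the name of the first corrupt member, or None.
            if package.testzip() is not None:
                return False
            return "[Content_Types].xml" in package.namelist()
    except zipfile.BadZipFile:
        return False
```

Running this against the bytes downloaded back from the blob (rather than the local original) would show whether the file was altered somewhere in the upload path, for example by a text-mode or re-encoding step.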
We authenticate with the API using a managed identity and translate one file at a time, so our request body is simply the following (file names and target language vary, of course):
{
  "inputs": [
    {
      "storageType": "File",
      "source": {
        "sourceUrl": "https://ethoshub.blob.core.windows.net/source/Test doc.docx"
      },
      "targets": [
        {
          "targetUrl": "https://ethoshub.blob.core.windows.net/target/Test doc-de.docx",
          "language": "de"
        }
      ]
    }
  ]
}
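For context, a request like the one above can be assembled and submitted roughly as follows. This is a minimal sketch, not our exact code: the endpoint placeholder, the v1.0 batch path, the helper names, and the assumption that a bearer token has already been obtained for the managed identity are all illustrative.

```python
import json
import urllib.request

# Placeholder: custom-domain endpoint of the Translator resource (hypothetical).
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
# Assumed API version; adjust if a different version is in use.
BATCH_PATH = "/translator/text/batch/v1.0/batches"


def build_request_body(source_url: str, target_url: str, language: str) -> dict:
    """Build a single-file Document Translation request body."""
    return {
        "inputs": [
            {
                "storageType": "File",
                "source": {"sourceUrl": source_url},
                "targets": [{"targetUrl": target_url, "language": language}],
            }
        ]
    }


def submit_translation(body: dict, bearer_token: str, endpoint: str = ENDPOINT) -> str:
    """POST the request and return the Operation-Location URL used to poll status."""
    request = urllib.request.Request(
        endpoint + BATCH_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.headers["Operation-Location"]
```

The same body shape is used for every file type; only the source and target URLs and the language code change between requests.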
Nothing in the docs appears to suggest that different file types need to be handled differently, and the file types we are submitting are listed as supported in the documentation. Even so, only PDFs are translated successfully.