Form Recognizer [DEPRECATED]

Extracts information from forms and images into structured data based on a model created by a set of representative training forms.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Logic Apps | Standard | All Logic Apps regions except Azure Government regions and Azure China regions |
Power Automate | Standard | All Power Automate regions except US Government (GCC), US Government (GCC High), and China Cloud operated by 21Vianet |
Power Apps | Standard | All Power Apps regions except US Government (GCC), US Government (GCC High), and China Cloud operated by 21Vianet |
Contact | |
---|---|
Name | Microsoft |
URL | Microsoft LogicApps Support, Microsoft Power Automate Support, Microsoft Power Apps Support |
Connector Metadata | |
---|---|
Publisher | Microsoft |
Website | https://azure.microsoft.com/services/cognitive-services/form-recognizer/ |
Deprecation Information
The Form Recognizer connector has been deprecated due to the upcoming retirement of the API. Current users of the API were notified of the upcoming retirement and given quickstart instructions to migrate to V2.
Creating a connection
The connector supports the following authentication types:
Default | Required parameters for creating connection. | All regions |
Default
Applicable: All regions
Required parameters for creating connection.
Name | Type | Description |
---|---|---|
Account Key | securestring | Cognitive Services Account Key |
Site URL | string | Endpoint URL (example: https://westeurope.api.cognitive.microsoft.com). If not specified, the URL defaults to 'https://westeurope.api.cognitive.microsoft.com'. |
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 100 | 60 seconds |
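To stay under the limit of 100 calls per 60 seconds, a caller can track its own request timestamps before invoking the connector. The following is a minimal client-side sketch, not part of the connector: the class name `SlidingWindowLimiter` is hypothetical, and it assumes a sliding window, which the table above does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `max_calls` per `period` seconds.

    Assumption: the service counts calls in a sliding window. The
    throttling table only states 100 calls per 60 seconds.
    """

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._stamps = deque()  # timestamps of calls still inside the window

    def try_acquire(self):
        """Return True and record the call if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each action and wait or retry later when it returns `False`.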
Actions
Analyze Form [DEPRECATED] | The document to analyze must be of a supported content type: 'application/pdf', 'image/jpeg' or 'image/png'. The response contains the information extracted from the analyzed form as well as information about content that was not extracted, along with a reason. |
Delete Model [DEPRECATED] | Delete model artifacts. |
Get Keys [DEPRECATED] | Use the API to retrieve the keys that were extracted by the specified model. |
Get Model [DEPRECATED] | Get information about a model. |
Get Models [DEPRECATED] | Get information about all trained models. |
Train Model [DEPRECATED] | The train request must include a source parameter that is either an externally accessible Azure Storage blob container URI (preferably a Shared Access Signature URI) or a valid path to a data folder in a locally mounted drive. Local paths must follow the Linux/Unix path format and be absolute paths rooted at the input mount configuration setting value. For example, if the '{Mounts:Input}' configuration setting value is '/input', then a valid source path would be '/input/contosodataset'. All training data is expected to be under the source. Models are trained using documents of the following content types: 'application/pdf', 'image/jpeg' and 'image/png'. Other content is ignored when training a model. |
Analyze Form [DEPRECATED]
The document to analyze must be of a supported content type: 'application/pdf', 'image/jpeg' or 'image/png'. The response contains the information extracted from the analyzed form as well as information about content that was not extracted, along with a reason.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Model ID | modelId | True | string | This is your Model Identifier that is used to analyze the document. |
Keys to extract | Keys | | string | An optional list of known keys to extract the values for. |
Document | Document | True | binary | A PDF document or image (JPG or PNG) file to analyze. |
Content type | Content-type | | string | Content type of the document to analyze. |
Returns
Analyze API call result.
- Body
- AnalyzeResult
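Because Analyze Form accepts only three content types, it can help to derive the `Content-type` value from the file name and reject anything else before calling the action. The sketch below is illustrative: the helper `content_type_for` and the extension mapping are assumptions based on conventional file extensions, not something the connector defines.

```python
import os

# The three content types named in the Analyze Form description.
# The extension mapping is an assumption based on common conventions.
SUPPORTED_CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def content_type_for(filename):
    """Return the content type for a file name, or raise ValueError."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return SUPPORTED_CONTENT_TYPES[ext]
    except KeyError:
        raise ValueError(f"unsupported document type: {filename!r}") from None
```

The extension is only a hint about the file's contents, so this check catches obvious mistakes rather than guaranteeing that the document is valid.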
Delete Model [DEPRECATED]
Delete model artifacts.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Model ID | modelId | True | string | The identifier of the model to delete. |
Get Keys [DEPRECATED]
Use the API to retrieve the keys that were extracted by the specified model.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Model ID | modelId | True | string | Model identifier. |
Returns
Result of an operation to get the keys extracted by a model.
- Body
- KeysResult
Get Model [DEPRECATED]
Get information about a model.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Model ID | modelId | True | string | The model identifier that is used to analyze your documents. |
Returns
Result of a model status query operation.
- Body
- ModelResult
Get Models [DEPRECATED]
Get information about all trained models.
Returns
Result of query operation to fetch multiple models.
- Body
- ModelsResult
Train Model [DEPRECATED]
The train request must include a source parameter that is either an externally accessible Azure Storage blob container URI (preferably a Shared Access Signature URI) or a valid path to a data folder in a locally mounted drive. Local paths must follow the Linux/Unix path format and be absolute paths rooted at the input mount configuration setting value. For example, if the '{Mounts:Input}' configuration setting value is '/input', then a valid source path would be '/input/contosodataset'. All training data is expected to be under the source. Models are trained using documents of the following content types: 'application/pdf', 'image/jpeg' and 'image/png'. Other content is ignored when training a model.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
source | source | True | string | Get or set source path. |
Returns
Response of the Train API call.
- Body
- TrainResult
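The two accepted shapes of the `source` parameter can be checked before a train request is sent. The following sketch uses a hypothetical helper, `is_valid_train_source`, and makes two assumptions: a URI source uses the `http` or `https` scheme, and the input mount defaults to '/input' as in the example above. It does not verify that a URI is actually reachable or points at a blob container.

```python
import posixpath
from urllib.parse import urlparse


def is_valid_train_source(source, input_mount="/input"):
    """Check the two source shapes the Train Model description allows."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # URI form: only confirms that a host is present.
        return bool(parsed.netloc)
    # Local form: Unix-style, absolute, and located under the input mount.
    if "\\" in source or not posixpath.isabs(source):
        return False
    normalized = posixpath.normpath(source)
    mount = posixpath.normpath(input_mount)
    return normalized == mount or normalized.startswith(mount.rstrip("/") + "/")
```

Normalizing the path first means a source such as '/input/../etc' is rejected, because it resolves to a location outside the mount.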
Definitions
AnalyzeResult
Analyze API call result.
Name | Path | Type | Description |
---|---|---|---|
errors | errors | array of FormOperationError | List of errors reported during the analyze operation. |
pages | pages | array of ExtractedPage | Page level information extracted in the analyzed document. |
status | status | string | Status of the analyze operation. |
ExtractedKeyValuePair
Representation of a key-value pair as a list of key and value tokens.
Name | Path | Type | Description |
---|---|---|---|
key | key | array of ExtractedToken | List of tokens for the extracted key in a key-value pair. |
value | value | array of ExtractedToken | List of tokens for the extracted value in a key-value pair. |
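Since keys and values arrive as token lists rather than plain strings, a consumer usually joins the tokens before using them. The sketch below assumes the response has been parsed into Python dictionaries following the shapes in this section. The helper names are hypothetical, and joining tokens with a single space is an assumption about the desired output.

```python
def join_tokens(tokens):
    """Concatenate the `text` of each ExtractedToken, separated by spaces."""
    return " ".join(token["text"] for token in tokens)


def pairs_to_dict(key_value_pairs):
    """Map each joined key string to its joined value string.

    If the same key text occurs more than once, the last pair wins.
    """
    return {
        join_tokens(pair["key"]): join_tokens(pair["value"])
        for pair in key_value_pairs
    }
```

A pair with an empty value token list maps to an empty string, so a key that was found without a value is still visible to the caller.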
ExtractedPage
Extraction information for a single page in a document.
Name | Path | Type | Description |
---|---|---|---|
clusterId | clusterId | integer | Cluster identifier. |
height | height | integer | Height of the page (in pixels). |
keyValuePairs | keyValuePairs | array of ExtractedKeyValuePair | List of Key-Value pairs extracted from the page. |
number | number | integer | Page number. |
tables | tables | array of ExtractedTable | List of Tables and their information extracted from the page. |
width | width | integer | Width of the page (in pixels). |
ExtractedTable
Extraction information about a table contained in a page.
Name | Path | Type | Description |
---|---|---|---|
columns | columns | array of ExtractedTableColumn | List of columns contained in the table. |
id | id | string | Table identifier. |
ExtractedTableColumn
Extraction information of a column in a table.
Name | Path | Type | Description |
---|---|---|---|
entries | entries | array of array | Extracted text for each cell of a column. Each cell in the column can have a list of one or more tokens. |
items | entries | array of ExtractedToken | |
header | header | array of ExtractedToken | List of extracted tokens for the column header. |
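Tables are returned column by column, so turning one into rows means transposing the `entries` lists. The following sketch assumes an ExtractedTable parsed into Python dictionaries. The function names are hypothetical, and padding shorter columns with empty strings is a design choice of this example, not documented behavior.

```python
from itertools import zip_longest


def cell_text(tokens):
    """Join the `text` of the tokens that make up one cell or header."""
    return " ".join(token["text"] for token in tokens)


def table_to_rows(table):
    """Return (headers, rows) for an ExtractedTable-shaped dict."""
    columns = table["columns"]
    headers = [cell_text(col["header"]) for col in columns]
    per_column = [[cell_text(cell) for cell in col["entries"]] for col in columns]
    # Columns may differ in length; pad the short ones with "".
    rows = [list(row) for row in zip_longest(*per_column, fillvalue="")]
    return headers, rows
```

The resulting `headers` and `rows` can be passed directly to `csv.writer` or a similar tabular sink.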
ExtractedToken
Canonical representation of a single piece of extracted text.
Name | Path | Type | Description |
---|---|---|---|
boundingBox | boundingBox | array of double | Bounding box of the extracted text. Represents the location of the extracted text as Cartesian coordinate pairs. The coordinate pairs are ordered top-left, top-right, bottom-right and bottom-left, with the origin at the bottom-left of the page. |
confidence | confidence | double | A measure of accuracy of the extracted text. |
text | text | string | String value of the extracted text. |
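The corner ordering of `boundingBox` can be reduced to a simple axis-aligned rectangle for overlap checks or highlighting. This sketch assumes the array is flat and holds eight numbers as four x,y pairs, which the table implies but does not state outright. The function name is hypothetical.

```python
def bounding_rect(bounding_box):
    """Return (x_min, y_min, x_max, y_max) for a flat list of four x,y pairs.

    Assumption: the array is [x1, y1, x2, y2, x3, y3, x4, y4], ordered
    top-left, top-right, bottom-right, bottom-left.
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers (four x,y pairs)")
    xs = bounding_box[0::2]  # every x coordinate
    ys = bounding_box[1::2]  # every y coordinate
    return min(xs), min(ys), max(xs), max(ys)
```

Because the origin is at the bottom-left of the page, the top edge of a box has the larger y value. Taking the minimum and maximum avoids depending on that orientation.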
FormDocumentReport
Name | Path | Type | Description |
---|---|---|---|
documentName | documentName | string | Reference to the data that the report is for. |
errors | errors | array of string | List of errors per page. |
pages | pages | integer | Total number of pages trained on. |
status | status | string | Status of the training operation. |
FormOperationError
Error reported during an operation.
Name | Path | Type | Description |
---|---|---|---|
errorMessage | errorMessage | string | Message reported during the train operation. |
KeysResult
Result of an operation to get the keys extracted by a model.
Name | Path | Type | Description |
---|---|---|---|
clusters | clusters | object | Object mapping ClusterIds to Key lists. |
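One way to use the `clusters` mapping is to look up the keys for the cluster a page was assigned to. The sketch below assumes that the `clusterId` on an ExtractedPage refers to the same identifiers used as property names in `clusters`, which this page does not confirm. The function name is hypothetical.

```python
def keys_for_page(keys_result, page):
    """Return the key list of the cluster that `page` was assigned to.

    Assumption: ExtractedPage.clusterId matches a property name in
    KeysResult.clusters.
    """
    clusters = keys_result["clusters"]
    # JSON object property names are strings, while clusterId is an integer.
    return clusters.get(str(page["clusterId"]), [])
```

Converting the integer `clusterId` to a string matters because a parsed JSON object never has integer property names.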
ModelResult
Result of a model status query operation.
Name | Path | Type | Description |
---|---|---|---|
createdDateTime | createdDateTime | date-time | Get or set the created date time of the model. |
lastUpdatedDateTime | lastUpdatedDateTime | date-time | Get or set the model last updated datetime. |
modelId | modelId | uuid | Get or set model identifier. |
status | status | string | Get or set the status of model. |
ModelsResult
Result of query operation to fetch multiple models.
Name | Path | Type | Description |
---|---|---|---|
models | models | array of ModelResult | Collection of models. |
TrainResult
Response of the Train API call.
Name | Path | Type | Description |
---|---|---|---|
errors | errors | array of FormOperationError | Errors returned during the training operation. |
modelId | modelId | uuid | Identifier of the model. |
trainingDocuments | trainingDocuments | array of FormDocumentReport | List of documents used to train the model and the train operation error reported by each. |
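A TrainResult can be condensed into a short summary to decide whether a training run is usable. The sketch below assumes the response has been parsed into a Python dictionary following the TrainResult and FormDocumentReport shapes. The function name and the summary fields are hypothetical.

```python
def summarize_training(train_result):
    """Condense a TrainResult-shaped dict into counts and failed documents."""
    docs = train_result.get("trainingDocuments", [])
    return {
        "modelId": train_result.get("modelId"),
        "documents": len(docs),
        # Sum of the per-document "pages" counts.
        "pages": sum(doc.get("pages", 0) for doc in docs),
        # Documents that reported at least one error.
        "failed": [doc["documentName"] for doc in docs if doc.get("errors")],
    }
```

The `status` strings are deliberately not interpreted here, because this page does not list their possible values.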