Form processing AI model requirements and limitations

[This topic is pre-release documentation and is subject to change.]

Note

  • Make sure your administrator has assigned you a security role with all organization-level privileges on the Note entity (under Core Records) and the read privilege on the entity you use to select object names.

Requirements

Form processing works on input documents that meet the following requirements:

  • JPG, PNG, or PDF format (text or scanned). Text-embedded PDFs are preferable because there is no possibility of error in character extraction and location.
  • File size must be less than 4 megabytes (MB).
  • For images, dimensions must be between 50 x 50 and 4200 x 4200 pixels.
  • If scanned from paper documents, scans should be high-quality images.
  • Text must use the Latin alphabet (English characters).
  • Data must be printed, not handwritten.
  • Must contain keys and values (for example, “company: Contoso” works; “Contoso” without a key label is not supported).
  • Keys can appear above or to the left of the values, but not below or to the right.
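
The measurable requirements above (format, file size, image dimensions) can be checked before upload. The following is a minimal sketch, not part of AI Builder: the function name check_document and its parameters are hypothetical, 1 MB is assumed to mean 1,048,576 bytes, the pixel bounds are assumed to be inclusive, and the .jpeg extension is assumed to be equivalent to .jpg. Content requirements such as alphabet, print quality, and key placement are not checked.

```python
import os
from typing import List, Optional

# Limits taken from the requirements list; the byte conversion is an assumption.
MAX_FILE_BYTES = 4 * 1024 * 1024          # "less than 4 MB", assuming 1 MB = 1,048,576 bytes
MIN_SIDE_PX = 50                          # assumed inclusive lower bound
MAX_SIDE_PX = 4200                        # assumed inclusive upper bound
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}   # ".jpeg" is an assumption
ALLOWED_EXTENSIONS = IMAGE_EXTENSIONS | {".pdf"}


def check_document(
    filename: str,
    size_bytes: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> List[str]:
    """Return a list of requirement violations (empty if none were found)."""
    problems: List[str] = []
    ext = os.path.splitext(filename)[1].lower()

    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")

    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file is not smaller than 4 MB")

    # Pixel dimensions only apply to image files, not to PDFs.
    if ext in IMAGE_EXTENSIONS:
        if width is None or height is None:
            problems.append("image dimensions unknown")
        elif not (
            MIN_SIDE_PX <= width <= MAX_SIDE_PX
            and MIN_SIDE_PX <= height <= MAX_SIDE_PX
        ):
            problems.append(
                "image dimensions outside 50 x 50 to 4200 x 4200 pixels"
            )

    return problems
```

The caller supplies the file size and, for images, the pixel dimensions, so the sketch stays independent of any particular imaging library.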

Optimization tips

  • For best results, use a dataset that is smaller than 4 MB. When a dataset exceeds 4 MB, AI Builder uses only 4 MB of the data to train and predict. Because you can’t control which data beyond the limit is left out, optimize your data to stay under 4 MB.
  • You can reduce the size of a PDF file by using the Print > Print to PDF option to keep only the pages you need.
  • Use a dataset that consists of related documents, such as separate instances of the same form.
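
To see whether a dataset stays within the 4 MB guideline, you can total the file sizes before training. This is a minimal sketch under stated assumptions: the function name dataset_fits is hypothetical, 1 MB is assumed to mean 1,048,576 bytes, and the dataset size is assumed to be the plain sum of the file sizes.

```python
from typing import Mapping, Tuple

# Assumption: 4 MB means 4 * 1,048,576 bytes.
DATASET_LIMIT_BYTES = 4 * 1024 * 1024


def dataset_fits(file_sizes: Mapping[str, int]) -> Tuple[int, bool]:
    """Return (total size in bytes, True if the total is smaller than 4 MB)."""
    total = sum(file_sizes.values())
    return total, total < DATASET_LIMIT_BYTES


# Example with made-up file names and sizes.
sizes = {"form_a.pdf": 1_500_000, "form_b.pdf": 1_500_000, "form_c.pdf": 1_500_000}
total, fits = dataset_fits(sizes)
print(f"total = {total} bytes, under limit: {fits}")
```

For files on disk, the sizes could be collected with os.path.getsize from the Python standard library.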

Note

AI Builder does not currently support the following types of form processing input data:

  • Complex tables (nested tables, merged headers or cells, and so on)
  • Check boxes or radio buttons
  • PDF documents longer than 50 pages
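
If a PDF is longer than 50 pages, one possible workaround is to save it in smaller parts, for example with Print to PDF. The sketch below only plans the page ranges. The function name page_ranges is hypothetical, and whether the resulting parts are useful as form processing input depends on your documents, so treat this as an illustration rather than a recommended procedure.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, max_pages: int = 50) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into consecutive, inclusive ranges
    of at most max_pages pages each."""
    if total_pages < 1 or max_pages < 1:
        raise ValueError("total_pages and max_pages must be at least 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


# A 120-page document needs three parts.
print(page_ranges(120))  # [(1, 50), (51, 100), (101, 120)]
```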

Next step

Create a form processing model