Form-processing model requirements and limitations
Form processing works on input documents that meet the following requirements:
- JPG, PNG, or PDF format (text or scanned). Text-embedded PDFs work better because they avoid errors in character extraction and location.
- If your PDFs are password-locked, you must remove the lock before submitting them.
- The combined file size of the documents used for training must not exceed 50 MB, and PDF documents shouldn't have more than 500 pages.
- For images, dimensions must be between 530 × 100 and 4200 × 4200 pixels.
- For PDF files, dimensions must be at most 17 × 17 inches, corresponding to Legal or A3 paper sizes and smaller.
- If scanned from paper documents, scans should be high-quality images.
- Documents must use the Latin alphabet (English characters).
- Use forms with different values in each field.
- When you create a new form-processing model, upload documents with the same layout where each document is a separate instance. For example, you might have invoices from the same provider, but each uploaded invoice is from a different month.
- If you get poor results or low confidence scores for certain fields, create a new form-processing model and upload more documents. The more documents you tag, the better AI Builder learns to recognize the fields.
- You can optimize a PDF file by using the Print > Print to PDF option to keep only selected pages of the document.
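The format, size, and dimension limits listed above can be checked locally before you upload anything. The following is a minimal sketch, not part of AI Builder: the function names are hypothetical, and it rests on several assumptions. It treats 1 MB as 1,048,576 bytes, accepts `.jpeg` as a synonym for JPG, and reads the image limits as width × height bounds. You supply the file sizes, pixel dimensions, and page counts yourself (for example, from an imaging or PDF library of your choice).

```python
from pathlib import Path

# Limits taken from the requirements list above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}  # ".jpeg" is an assumption
MAX_TOTAL_BYTES = 50 * 1024 * 1024  # assumes 1 MB = 1,048,576 bytes
MIN_IMAGE_SIZE = (530, 100)         # assumed to mean (width, height) in pixels
MAX_IMAGE_SIZE = (4200, 4200)
MAX_PDF_PAGES = 500
MAX_PDF_INCHES = 17.0


def check_training_set(files):
    """Check formats and combined size.

    files: list of (file_name, size_in_bytes) tuples.
    Returns a list of human-readable problems (empty if none were found).
    """
    problems = []
    for name, _size in files:
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format")
    total = sum(size for _name, size in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"combined size {total} bytes exceeds 50 MB")
    return problems


def image_dimensions_ok(width, height):
    """Return True if an image's pixel dimensions fall within the limits."""
    return (MIN_IMAGE_SIZE[0] <= width <= MAX_IMAGE_SIZE[0]
            and MIN_IMAGE_SIZE[1] <= height <= MAX_IMAGE_SIZE[1])


def pdf_ok(page_count, width_in, height_in):
    """Return True if a PDF's page count and page size (inches) are within limits."""
    return (page_count <= MAX_PDF_PAGES
            and width_in <= MAX_PDF_INCHES
            and height_in <= MAX_PDF_INCHES)
```

For example, `check_training_set([("jan.pdf", 120_000), ("feb.docx", 90_000)])` reports one problem, the unsupported `.docx` file. A check like this only covers the mechanical limits; scan quality, alphabet, and layout consistency still need a manual review.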
AI Builder doesn't currently support the following types of form-processing input data:
- Complex tables (nested tables, merged headers or cells, and so on)
- Check boxes or radio buttons
- Fillable PDFs