Instafill.ai Core Algorithm Update (August 2025)

Over the past two weeks, we’ve been rolling out a major core algorithm update to Instafill.ai. This update focuses on two fundamental improvements to our filling technology that address the most common accuracy challenges users face when working with PDF attachments and complex forms.

These aren’t interface changes – they target the core AI that does the actual form filling. The first update improves how we fill tables in PDF forms, making the process predictable and systematic. The second improves how we read PDF files used as source documents, so data is extracted accurately even from lengthy or complex layouts.

Update 1: Tables in PDFs

Before this update, tables could be “mostly right” but still miss a cell or put a value in the wrong column. We now fill tables row by row. We first build a dataset from your attachments, then write row 1, row 2, row 3 – not all at once. This row-by-row approach makes table fills predictable.

How row-by-row processing works

The flow has two steps. First, we collect the data from your attachments, which can be long files (70–100+ pages: insurance policies, full patient records, legal case files with exhibits).

Then, once the AI has extracted and organized all available information from your attachments, it begins the systematic filling process. Instead of trying to populate an entire table at once, it processes row 1, then row 2, then row 3, and so on. This approach ensures that each piece of information is properly matched and placed in the correct location.
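
To make the two steps concrete, here is a minimal Python sketch of the idea, not Instafill.ai's actual implementation. The names FormTable, build_dataset, and fill_row_by_row are hypothetical, and the second sample record is made up for illustration; it assumes extraction has already turned each attachment page into a list of records.

```python
from dataclasses import dataclass, field


@dataclass
class FormTable:
    """Hypothetical stand-in for a table in the target form."""
    columns: list[str]
    rows: list[dict[str, str]] = field(default_factory=list)


def build_dataset(attachment_pages: list[list[dict[str, str]]]) -> list[dict[str, str]]:
    """Step 1: collect every record found across all attachment pages."""
    dataset: list[dict[str, str]] = []
    for page_records in attachment_pages:
        dataset.extend(page_records)
    return dataset


def fill_row_by_row(table: FormTable, dataset: list[dict[str, str]]) -> FormTable:
    """Step 2: write one row at a time, matching values to columns by name."""
    for record in dataset:
        # Each row is completed before the next one starts.
        # A value missing from the record leaves that cell blank.
        row = {column: record.get(column, "") for column in table.columns}
        table.rows.append(row)
    return table


# Illustrative input: the second record is invented sample data.
pages = [
    [{"Date": "05/01/2025", "Description": "MRI", "Amount": "$3,000"}],
    [{"Date": "05/03/2025", "Description": "Follow-up visit"}],
]
table = fill_row_by_row(
    FormTable(columns=["Date", "Description", "Amount"]),
    build_dataset(pages),
)
```

Because each row is built from exactly one record and keyed by column name, a missing value produces an empty cell instead of shifting a neighbor into the wrong column.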

Real-world applications for complex documents

This helps when a form table needs facts scattered across a long file. In healthcare, a patient packet can include intake forms, labs, notes, and discharge summaries. For insurance claim tables, we pull the right details from across the whole PDF and line them up row by row. Legal teams see the same benefit with large case bundles that include statements, exhibits, and reports.

Handling different data formats

Source data doesn’t need to be pre-formatted as a table. The AI can pull values from free-form text anywhere in the PDF and structure them into rows as it fills.

Example: If the PDF has sentences or bullet lists like “05/01/2025 — MRI — $3,000,” the AI splits that into Date, Description, and Amount and creates a clean row. Repeating lists are treated as table candidates—each item becomes its own row. If a table in the form isn’t auto-recognized, mark the table area once; after that, the same row-by-row fill applies.
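
As a rough illustration of that split, the Python sketch below uses a regular expression. This is an assumption made for the example: the actual filler relies on AI extraction, not a fixed pattern. LINE_PATTERN, parse_line, and lines_to_rows are hypothetical names, and the pattern only covers lines shaped like the one quoted above.

```python
import re

# Assumed shape: MM/DD/YYYY, a dash separator, a description,
# another dash separator, then a dollar amount.
# \u2014 is an em dash and \u2013 is an en dash; a plain hyphen also counts.
LINE_PATTERN = re.compile(
    r"(?P<date>\d{2}/\d{2}/\d{4})\s*[\u2014\u2013-]\s*"
    r"(?P<description>.+?)\s*[\u2014\u2013-]\s*"
    r"(?P<amount>\$\d+(?:,\d{3})*(?:\.\d{2})?)"
)


def parse_line(line):
    """Return a Date/Description/Amount row, or None if the line does not match."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None
    return {
        "Date": match["date"],
        "Description": match["description"],
        "Amount": match["amount"],
    }


def lines_to_rows(lines):
    """Treat each matching item in a repeating list as its own table row."""
    rows = []
    for line in lines:
        row = parse_line(line)
        if row is not None:
            rows.append(row)
    return rows


example = parse_line("05/01/2025 \u2014 MRI \u2014 $3,000")
```

Lines that do not fit the pattern are skipped rather than forced into a row, which keeps unrelated prose out of the table.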

Update 2: Using PDFs as source attachments

PDFs are hard to extract from, especially long or busy layouts. We improved how we read them so fewer details are missed and more fields are filled correctly.

Comprehensive page-by-page processing

We parse attachments one page at a time instead of pushing the whole file through at once, so details aren’t dropped. For each page we extract text (OCR if needed), grab basic layout hints, and normalize the result. Each page is fully processed before we move to the next, no matter how long the file is.
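
The per-page loop can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the production pipeline: each page is assumed to be a dict with an optional "text" key, the ocr argument is a caller-supplied function, and treating all-caps lines as headings is a deliberately crude stand-in for real layout hints.

```python
from typing import Callable, Iterable, Iterator


def normalize(text: str) -> str:
    """Collapse runs of whitespace so later matching is not thrown off by layout."""
    return " ".join(text.split())


def process_pages(
    pages: Iterable[dict],
    ocr: Callable[[dict], str],
) -> Iterator[dict]:
    """Fully process one page, yield it, then move on to the next."""
    for number, page in enumerate(pages, start=1):
        text = page.get("text") or ""
        if not text.strip():
            # No embedded text layer: fall back to OCR (hypothetical callable).
            text = ocr(page)
        lines = text.splitlines()
        yield {
            "page": number,
            "text": normalize(text),
            # Crude layout hint (an assumption): all-caps lines are headings.
            "headings": [normalize(line) for line in lines if line.strip().isupper()],
        }


sample_pages = [
    {"text": "POLICY DETAILS\nPolicy Number:   12345"},
    {"text": ""},  # e.g. a scanned page with no text layer
]
results = list(process_pages(sample_pages, ocr=lambda page: "scanned text"))
```

Writing the loop as a generator means page 73 goes through exactly the same steps as page 3, and memory use does not grow with the length of the file.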

Complex layout recognition

Multi-column pages, prints of web portals, scans with images and captions – all go through the same page flow. We read tables and lists where they exist and also pull free text with nearby headings, so fields like “Employer EIN,” “Policy Number,” or “Claimant DOB” are found even when formatting changes. We also reconcile duplicates across pages so the right value wins.
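
One possible way to reconcile duplicates is sketched below in Python. The rule used here is an assumption chosen for illustration (the most frequent value wins, and ties go to the value seen on the earliest page); this post does not specify the actual rule. The reconcile function and the sample values are hypothetical.

```python
from collections import Counter


def reconcile(candidates):
    """Pick one value per field from (field, value, page) candidates.

    Assumed rule: most frequent value wins; ties go to the earliest page.
    """
    by_field = {}
    for field_name, value, page in candidates:
        by_field.setdefault(field_name, []).append((value.strip(), page))

    resolved = {}
    for field_name, found in by_field.items():
        counts = Counter(value for value, _ in found)
        first_page = {}
        for value, page in found:
            first_page[value] = min(page, first_page.get(value, page))
        # Sort key: higher count first, then lower first-seen page number.
        resolved[field_name] = min(
            counts, key=lambda v: (-counts[v], first_page[v])
        )
    return resolved


# Made-up sample values; "A-1OO" imitates an OCR misread of "A-100".
candidates = [
    ("Policy Number", "A-100", 3),
    ("Policy Number", "A-1OO", 12),
    ("Policy Number", "A-100 ", 73),
    ("Claimant DOB", "01/02/1980", 5),
]
resolved = reconcile(candidates)
```

A frequency-based rule is only one option; a real system might also weigh how close a value sits to its label or how confident the OCR was.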

When long PDF files are essential

70–100-page PDFs show up in insurance (plan booklets and EOB bundles), healthcare (EMR exports and long discharge summaries), legal and compliance (packets with exhibits), and real estate and finance (loan packages and statement bundles). Previously, crucial information buried on later pages of such documents could be overlooked during processing. With the new flow, a detail on page 73 gets the same systematic attention as one on page 3.


Technical implementation

The system works with various text patterns in PDF source documents. It helps if data is already in a table, but it isn’t required – free-form text is fine, and the filler structures it during the process. We don’t currently limit the number of attachments, but attach only what you need; too many unrelated files can add noise.

Availability

The update is fully live for all users, on both paid plans and trial accounts, as of August 25, 2025. You may have already noticed the improvements in recent days. No action is required – they apply automatically.


As always, we welcome your feedback at [email protected].