Adversarial synthetic data for AI document extraction

The synthetic documents that broke every frontier model we tested, with ground-truth labels attached.

Your AI agents can't handle data that looks like what they'll actually see

Customer data is off-limits

HIPAA, SOC 2, DPAs. The hard cases you'd most want to regression-test against are the ones you can't legally keep, reuse, or share with a new model vendor.

Edge cases multiply with every customer

Every new customer brings template variants, scan quality you haven't seen, and field combinations your prompt didn't anticipate. The long tail outpaces any internal labeling effort.

Model upgrades shift pipeline behavior

A new frontier model drops and your numbers change. Without a fixed adversarial corpus you can re-run, you can't tell whether the upgrade helped, hurt, or broke specific document types.
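A minimal sketch of that re-run comparison, assuming each model run has already been scored against the corpus and reduced to (document type, correct or not) pairs. The function names and the sample document types are hypothetical, chosen for illustration only.

```python
from collections import defaultdict

def accuracy_by_type(results):
    """results: iterable of (doc_type, is_correct) pairs from one model run."""
    totals = defaultdict(lambda: [0, 0])  # doc_type -> [correct, seen]
    for doc_type, is_correct in results:
        totals[doc_type][0] += int(is_correct)
        totals[doc_type][1] += 1
    return {t: correct / seen for t, (correct, seen) in totals.items()}

def compare_runs(old_results, new_results, tolerance=0.0):
    """Label each document type as improved, regressed, or unchanged."""
    old_acc = accuracy_by_type(old_results)
    new_acc = accuracy_by_type(new_results)
    report = {}
    for doc_type in sorted(set(old_acc) | set(new_acc)):
        # Assumption: a type missing from one run counts as 0.0 accuracy there.
        delta = new_acc.get(doc_type, 0.0) - old_acc.get(doc_type, 0.0)
        if delta > tolerance:
            report[doc_type] = "improved"
        elif delta < -tolerance:
            report[doc_type] = "regressed"
        else:
            report[doc_type] = "unchanged"
    return report

# Made-up scores for two runs over the same fixed corpus.
old_run = [("loss_run", True), ("loss_run", False), ("acord", True), ("sov", True)]
new_run = [("loss_run", True), ("loss_run", True), ("acord", False), ("sov", True)]
report = compare_runs(old_run, new_run)
```

Because the corpus is fixed, any change in the report comes from the model or the pipeline, not from the documents.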

Schema and query ambiguity

"Premium" means three different things across forms. Field paths don't match what the model emits. You need synthetic data built around the ambiguous cases to know whether your pipeline survives them.
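One way to surface field-path mismatches is to flatten both the model output and the expected record into dotted paths and diff them. This is a sketch under assumptions: the nested-dict shape, the `policy.premium.*` names, and both helper functions are hypothetical, not a published schema.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into dotted paths: {'a': {'b': 1}} -> {'a.b': 1}."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        flat[prefix] = obj
    return flat

def path_mismatches(emitted, expected):
    """Report paths the model missed, invented, or filled with the wrong value."""
    got, want = flatten(emitted), flatten(expected)
    shared = set(got) & set(want)
    return {
        "missing": sorted(set(want) - set(got)),
        "unexpected": sorted(set(got) - set(want)),
        "wrong_value": sorted(p for p in shared if got[p] != want[p]),
    }

# Hypothetical case: the form distinguishes two premiums, the model emits one.
expected = {"policy": {"premium": {"annual": 1200, "written": 1100}}}
emitted = {"policy": {"premium": 1200}}
diff = path_mismatches(emitted, expected)
```

Here the model's single `policy.premium` shows up as unexpected while both expected premium paths show up as missing, which is the signature of an ambiguous field rather than a misread value.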

Every frontier model we tested hallucinated values on these documents. GPT-5.4 reported a $42M revenue figure as $21.65M.
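A check for this kind of numeric hallucination might look like the sketch below, which normalizes money strings before comparing them to ground truth. The parser, its supported suffixes, and the 0.1% tolerance are assumptions for illustration, not the scoring used in the tests described here.

```python
import re
from decimal import Decimal

# Assumed suffix convention: K = thousand, M = million, B = billion.
_SCALE = {"": Decimal(1), "K": Decimal(10) ** 3, "M": Decimal(10) ** 6, "B": Decimal(10) ** 9}
_MONEY = re.compile(r"\s*\$?\s*(\d[\d,]*(?:\.\d+)?)\s*([KMB]?)\s*")

def parse_money(text):
    """Parse strings like '$42M', '21.65M', or '$1,250,000' into a Decimal."""
    match = _MONEY.fullmatch(text.upper())
    if match is None:
        raise ValueError(f"unrecognized money format: {text!r}")
    number, suffix = match.groups()
    return Decimal(number.replace(",", "")) * _SCALE[suffix]

def matches_truth(extracted, truth, rel_tol=Decimal("0.001")):
    """True when the extracted amount is within rel_tol of the ground truth."""
    got, want = parse_money(extracted), parse_money(truth)
    if want == 0:
        return got == 0
    return abs(got - want) / abs(want) <= rel_tol

# The example above: ground truth $42M, model output $21.65M.
hallucinated = matches_truth("$21.65M", "$42M")
# A formatting difference alone should not count as an error.
reformatted = matches_truth("$42,000,000", "$42M")
```

Comparing parsed values rather than strings keeps formatting differences from being scored as extraction errors, while a wrong magnitude still fails.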

148 adversarial documents · 5 frontier models tested · 9 document categories · Public dataset on HuggingFace

Synthetic Insurance Documents

65 patterns
82 carrier templates
19 ACORD forms
56 visual variants

Complete document packets with ground truth at three levels: document, field, and bounding box. Loss runs, ACORD forms, SOVs, dec pages, broker narratives, and more, each rendered through 82 carrier-specific templates with 56 visual variants sourced from real reference PDFs.
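To make the three label levels concrete, here is a sketch of what a label record and a bounding-box overlap score could look like. Every class and field name below is hypothetical; the actual label format shipped with the documents may differ.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class BBox:
    """Bounding-box level: a rectangle on one page (coordinate units assumed)."""
    page: int
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self):
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

@dataclass
class FieldLabel:
    """Field level: a value, where it belongs in the schema, and where it sits."""
    path: str
    value: str
    bbox: BBox

@dataclass
class DocumentLabel:
    """Document level: the type and the adversarial patterns that were injected."""
    doc_type: str
    patterns: list
    fields: list = field(default_factory=list)

def iou(a, b):
    """Intersection over union of two boxes; 0.0 if they are on different pages."""
    if a.page != b.page:
        return 0.0
    overlap = BBox(a.page, max(a.x0, b.x0), max(a.y0, b.y0),
                   min(a.x1, b.x1), min(a.y1, b.y1)).area()
    union = a.area() + b.area() - overlap
    return overlap / union if union > 0 else 0.0

truth_box = BBox(page=1, x0=0, y0=0, x1=10, y1=10)
predicted_box = BBox(page=1, x0=5, y0=0, x1=15, y1=10)
label = DocumentLabel("loss_run", ["merged_headers"],
                      [FieldLabel("claims[0].paid", "12,500", truth_box)])
score = iou(truth_box, predicted_box)  # overlap 50 / union 150
```

Field-level labels let you score extracted values, and box-level labels let you score whether the model looked in the right place.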

Pattern categories
Table structure · Data formats · Visual overlays · PDF internals · Scan effects · Cross-doc inconsistency
Example patterns
Broken CMaps · Kerning-as-spaces · Bezier curve borders · Invisible OCR layers · Merged headers · JS-computed fields · Reserve vs Outstanding · Phone photo warp · Font glyph corruption · Mainframe print
Document types
Loss runs · 19 ACORD forms · SOVs · Dec pages · Broker narratives · Financial statements · Driver schedules · Experience mods

Clone-and-variants

Up to 20 variants per input doc
65 adversarial patterns
48-hour typical turnaround

Send us one hard document. We send back up to 20 variants. Same layout and format you sent, with new underlying data and adversarial patterns injected so your model can't memorize what it saw in training. Ground-truth labels attached. For intelligent document processing (IDP) teams that have a known edge case and want a regression suite around it.

What we change
Data injection · Layout preserved · Format preserved · Adversarial patterns · Ground truth attached
Example inputs
Spanish invoices · 10Q + Excel workbooks · CSV / XLSX / PDF · Custom doc types, most filetypes

01

Define scope

Tell us the doc type, format variants, and edge cases that matter most for your pipeline.

02

We generate

Our engine builds the documents or logs with the specific layout problems, format variation, and corruption you asked for.

03

You test

Same idea as synthetic data for computer vision: we placed the data, so we already know what's in it. Every file comes with ground truth. No annotation step, no SME bottleneck.
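The principle can be shown with a toy generator that writes a document and records its labels in the same step. This is an illustrative sketch only: the plain-text loss-run layout, the field names, and `make_loss_run` are hypothetical and far simpler than a rendered PDF.

```python
import random

def make_loss_run(seed, n_claims=3):
    """Generate a toy plain-text loss run plus ground truth for every value placed."""
    rng = random.Random(seed)  # seeded so the same document can be regenerated
    lines = ["CLAIM_NO       PAID   RESERVE"]
    truth = []
    for i in range(n_claims):
        claim = {
            "claim_no": f"CLM-{rng.randrange(10**5):05d}",
            "paid": rng.randrange(0, 50_000),
            "reserve": rng.randrange(0, 50_000),
        }
        lines.append(f"{claim['claim_no']}  {claim['paid']:>8,}  {claim['reserve']:>8,}")
        # The label is recorded at the moment the value is written: no annotation pass.
        truth.append({"line": i + 1, **claim})
    return "\n".join(lines), truth

text, truth = make_loss_run(seed=7)
```

Since the generator is seeded, re-running it with the same seed reproduces both the document and its labels, which is what makes a regression corpus repeatable.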

Tim Michaud, Founder
YC Alum · Previously Staff Security Eng @ Moveworks (acq. ServiceNow)

At Moveworks I spent years breaking 250+ AI agents serving Fortune 500 companies as the security eng on those rollouts. Before that I spent a decade in security research finding bugs in Apple, Chrome, and Qualcomm. Aginor came from putting those two things together: I know what production data does to agents, and I know how to generate the inputs that break them.

I read every email. Reach me at tim@aginor.ai.

Start with one hard document

Send us a doc your pipeline struggles with. We send back up to 20 variants with new data and adversarial patterns, ground truth attached.
