FAQ
Frequently asked questions about extracting structured data from PDFs and images
What does the Extraction API return?
Send a document or image plus a JSON schema and get back structured JSON shaped exactly like your schema, alongside per-field findings, snippets, and page references that explain where each value came from.
Which file formats are supported?
PDFs and common image formats all flow through the same extraction endpoint, so scans, mobile photos, screenshots, and digital documents can be ingested in one call.
| Format | Typical source |
|---|---|
| PDF | Documents, exports, reports |
| PNG | Screenshots, exported images |
| JPG / JPEG | Scans, camera photos |
| HEIC | iPhone camera uploads |
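If you want to reject unsupported files before uploading, a client-side check along these lines may help. It is a minimal sketch: the extension set mirrors the formats named in this answer, while the helper name `is_supported` is hypothetical and not part of the API.

```python
from pathlib import Path

# Extensions for the formats named in this answer (PDF, PNG, JPG/JPEG, HEIC).
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".heic"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("invoice.PDF"))  # True (comparison is case-insensitive)
print(is_supported("notes.docx"))   # False
```

This only inspects the file name, so the API may still reject a file whose contents do not match its extension.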
How is usage priced?
Extractions are billed in credits, per page. There is no separate parse charge before extraction: the per-page rate below is the full cost.
| Extraction | Credit usage |
|---|---|
| Simple | 1 credit per page |
| Complex | 2 credits per page |
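To budget a batch, you can multiply page counts by the rates in the table. The sketch below assumes you already know whether a job counts as Simple or Complex; how that is decided is not covered here, and the function name `estimate_credits` is illustrative only.

```python
# Per-page rates from the pricing table above.
CREDITS_PER_PAGE = {"simple": 1, "complex": 2}


def estimate_credits(pages: int, extraction_type: str) -> int:
    """Estimate the credits for one document at the given extraction type."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * CREDITS_PER_PAGE[extraction_type.lower()]


print(estimate_credits(12, "Simple"))   # 12
print(estimate_credits(12, "Complex"))  # 24
```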
Is there a free tier?
Yes. Sign up for a free account and run extractions on the included monthly credits with no credit card required. Upgrade to Pay-As-You-Go or a subscription only when your volume grows.
Can I customize the fields I extract?
Yes. Edit the schema in the playground or build it visually in the Schema Builder. Schemas can include nested objects and arrays such as items[]. Save a schema once and reuse it in production via schema_id.
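As a rough illustration of a schema with a nested object and an `items` array, here is a sketch written in JSON Schema style. The field names (`invoice_number`, `vendor`, and so on) are hypothetical, and the exact schema dialect the API accepts is an assumption; check the playground for the authoritative format.

```python
import json

# Hypothetical invoice schema: one nested object (vendor) and one array (items).
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "vendor": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "address": {"type": "string"},
            },
        },
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unit_price": {"type": "number"},
                },
            },
        },
    },
}

# Serialize to JSON, e.g. to paste into the playground editor.
print(json.dumps(invoice_schema, indent=2))
```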
How accurate is extraction on messy documents?
The API is tuned for varied layouts, noisy scans, and mixed document types. Every response also includes metadata.findings and field-level metadata.errors so you can route edge cases to manual review without losing the fields that extracted cleanly.
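One way to act on this is to split each response into fields that extracted cleanly and fields that need a human look. The sketch below assumes a response shaped as a `data` object plus `metadata.errors` keyed by field name; that layout, the sample values, and the helper `split_for_review` are assumptions for illustration, not the documented response format.

```python
from typing import Any


def split_for_review(response: dict[str, Any]) -> tuple[dict[str, Any], dict[str, str]]:
    """Separate cleanly extracted fields from fields flagged in metadata.errors."""
    data = response.get("data", {})
    # Assumed shape: {"field_name": "error message"}.
    errors = response.get("metadata", {}).get("errors") or {}
    clean = {name: value for name, value in data.items() if name not in errors}
    return clean, dict(errors)


# Hypothetical response used only to exercise the helper.
sample_response = {
    "data": {"total": "118.40", "due_date": None},
    "metadata": {
        "findings": [],
        "errors": {"due_date": "value not legible"},
    },
}

clean_fields, review_queue = split_for_review(sample_response)
print(clean_fields)   # {'total': '118.40'}
print(review_queue)   # {'due_date': 'value not legible'}
```

Keeping the clean fields separate means a single unreadable value does not block the rest of the document from moving through your pipeline.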
Do I have to sign up to try it?
No. Run the demo on a sample document without an account. When you upload your own file, we ask you to sign in with Google or a magic link to run your first extraction. Your file and schema are saved automatically so you pick up where you left off.