Overview

How the Extract API request and response flow works.

The Extract API accepts a document plus a schema definition and returns structured JSON that matches your schema.

Request Shape

Field	Type	Description
`file`	File	PDF or image upload sent as `multipart/form-data`.
`schema`	String	Inline JSON schema for the extraction result.
`schema_id`	UUID	Saved schema ID from your Struct PDF account.
`Authorization`	Header	Bearer token header: `Bearer <API_KEY>`.
`X-API-Key`	Header	Alternative API key header if you do not use `Authorization`.

You can provide either schema or schema_id. Supported uploads include PDFs and common image formats such as PNG and JPEG.
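
As a rough sketch of the request shape above, the helper below assembles the multipart body and headers without sending anything. The function name buildExtractRequest and its option names are hypothetical, and the endpoint URL is the one used in the fetch example on this page. Treating schema and schema_id as mutually exclusive is an assumption drawn from the sentence above, not a documented rule.

```javascript
// Endpoint taken from the fetch example on this page.
const EXTRACT_URL = 'https://api.structpdf.com/v1/extract';

// Hypothetical helper: builds (but does not send) an Extract API request.
function buildExtractRequest({ file, filename, schema, schemaId, apiKey, useApiKeyHeader = false }) {
  // Assumption: exactly one of schema / schema_id should be supplied.
  if ((schema === undefined) === (schemaId === undefined)) {
    throw new Error('Provide exactly one of schema or schemaId');
  }

  const body = new FormData();
  body.append('file', file, filename);
  if (schema !== undefined) {
    // Inline schemas travel as a JSON string field.
    body.append('schema', typeof schema === 'string' ? schema : JSON.stringify(schema));
  } else {
    body.append('schema_id', schemaId);
  }

  // Either header can carry the API key; send one or the other.
  const headers = useApiKeyHeader
    ? { 'X-API-Key': apiKey }
    : { Authorization: `Bearer ${apiKey}` };

  // Content-Type is left unset so fetch can add the multipart boundary itself.
  return { url: EXTRACT_URL, init: { method: 'POST', headers, body } };
}

// Usage: reference a saved schema and authenticate with X-API-Key.
const pdf = new Blob(['placeholder bytes'], { type: 'application/pdf' });
const request = buildExtractRequest({
  file: pdf,
  filename: 'receipt.pdf',
  schemaId: '00000000-0000-0000-0000-000000000000', // placeholder UUID
  apiKey: 'YOUR_API_KEY',
  useApiKeyHeader: true,
});
// fetch(request.url, request.init) would send it.
```

Keeping request construction separate from the network call makes the field and header choices easy to inspect before anything is uploaded.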

Response Shape

Field	Type	Description
`generationId`	UUID	Extraction request identifier.
`success`	String	Overall extraction status: `Complete`, `Partial`, or `Fail`.
`result`	Object	Structured JSON matching your schema.
`metadata.findings`	Array	Evidence and snippets for extracted fields.
`metadata.errors`	Array	Field-level extraction issues, if any.
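
One way to guard client code against unexpected payloads is to check the documented top-level fields before using them. The sketch below is a minimal example under stated assumptions: parseExtractResponse is a hypothetical helper, the three status values come from the Error Handling section, and the defaults for a missing result or missing metadata arrays are assumptions about optional fields.

```javascript
// Status values listed in the Error Handling section.
const EXTRACT_STATUSES = ['Complete', 'Partial', 'Fail'];

// Hypothetical helper: validates the documented top-level fields and
// normalizes the metadata arrays so callers can always iterate them.
function parseExtractResponse(body) {
  if (body === null || typeof body !== 'object') {
    throw new TypeError('Response body must be an object');
  }
  if (typeof body.generationId !== 'string') {
    throw new TypeError('generationId must be a string');
  }
  if (!EXTRACT_STATUSES.includes(body.success)) {
    throw new TypeError(`Unexpected success value: ${body.success}`);
  }
  const metadata = body.metadata ?? {};
  return {
    generationId: body.generationId,
    status: body.success,
    // Assumption: result may be missing when the extraction fails.
    result: body.result ?? null,
    // Assumption: either array may be omitted when it would be empty.
    findings: Array.isArray(metadata.findings) ? metadata.findings : [],
    errors: Array.isArray(metadata.errors) ? metadata.errors : [],
  };
}

// Illustrative payload, not real API output.
const parsed = parseExtractResponse({
  generationId: '00000000-0000-0000-0000-000000000000',
  success: 'Partial',
  result: { total: 42.5 },
  metadata: { findings: [{ schema_key: 'total' }] },
});
console.log(parsed.status, parsed.findings.length, parsed.errors.length);
```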

Findings

The metadata.findings array follows the schema you provide. Each finding points back to a field in your requested output shape, so you can map extracted evidence directly to the same structure you expect in result.

Struct PDF uses dot notation in schema_key to describe where the evidence belongs:

Pattern	Meaning	Example
flat field	top-level field in your result	`total`
nested object	field inside an object	`address.city`
array item	field inside a specific array item	`items.0.name`

That means the evidence format stays predictable even when your schema contains nested objects or arrays.

Requested shape	Example finding keys
`customer.email`	`customer.email`
`items[].price`	`items.0.price`, `items.1.price`

This makes it easier to:

connect extracted values back to UI fields
show evidence next to the exact field a user cares about
troubleshoot ambiguous extractions without guessing where a finding belongs
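
The dot-notation keys described above can be resolved against result with a few lines of code. The sketch below is illustrative only: getBySchemaKey and groupFindingsByKey are hypothetical helpers, the sample payload is made up, and the snippet property on each finding is an assumed name for the evidence text.

```javascript
// Walk `result` along a dot-notation key such as "items.0.price".
// Array indexes are ordinary path segments, so they need no special handling.
function getBySchemaKey(result, schemaKey) {
  return schemaKey
    .split('.')
    .reduce((node, part) => (node == null ? undefined : node[part]), result);
}

// Group findings by schema_key so a UI can look up evidence per field.
function groupFindingsByKey(findings) {
  const byKey = new Map();
  for (const finding of findings) {
    const group = byKey.get(finding.schema_key) ?? [];
    group.push(finding);
    byKey.set(finding.schema_key, group);
  }
  return byKey;
}

// Illustrative payload, not real API output.
const sample = {
  result: {
    customer: { email: 'jane@example.com' },
    items: [{ price: 4.5 }, { price: 12 }],
  },
  metadata: {
    findings: [
      { schema_key: 'customer.email', snippet: 'Email: jane@example.com' }, // `snippet` is an assumed field name
      { schema_key: 'items.1.price', snippet: '12.00' },
    ],
  },
};

// Pair every piece of evidence with the value it supports.
const evidenceByKey = groupFindingsByKey(sample.metadata.findings);
for (const [key, group] of evidenceByKey) {
  console.log(key, '=', getBySchemaKey(sample.result, key), `(${group.length} finding(s))`);
}
```

Because the finding keys mirror the result structure, the same path lookup works for flat fields, nested objects, and array items.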

Error Handling

The success field reports the overall extraction outcome:

Status	Meaning
Complete	The requested fields were extracted without field-level issues.
Partial	The API returned a usable result, but one or more fields also produced issues in `metadata.errors`.
Fail	The extraction could not produce a usable result for the request.

Partial is the most common non-terminal state. For example, if a document contains several plausible values for the same field, Struct PDF may still return a result while recording an error for that schema key in metadata.errors. That lets you keep the successful parts of the extraction while also surfacing what needs review.

Use metadata.errors together with metadata.findings when you need to:

detect fields that need manual review
explain why a value was not returned cleanly
handle ambiguous documents where several matches appear on the page
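
A minimal sketch of how a client might branch on the three statuses and pair each error with its evidence. It rests on assumptions: triageExtraction is a hypothetical helper, the sample payload is made up, and each entry in metadata.errors is assumed to carry a schema_key like the findings do, along with an assumed message property.

```javascript
// Hypothetical helper: decides whether a response is usable and which
// fields should go to manual review.
function triageExtraction(body) {
  const findings = body.metadata?.findings ?? [];
  const errors = body.metadata?.errors ?? [];

  switch (body.success) {
    case 'Complete':
      return { usable: true, review: [] };
    case 'Partial':
      return {
        usable: true,
        // Assumption: errors expose `schema_key`, matching the findings.
        review: errors.map((error) => ({
          key: error.schema_key,
          error,
          evidence: findings.filter((f) => f.schema_key === error.schema_key),
        })),
      };
    case 'Fail':
      return { usable: false, review: [] };
    default:
      throw new Error(`Unexpected status: ${body.success}`);
  }
}

// Illustrative payload: two candidate totals make the field ambiguous.
const ambiguous = {
  success: 'Partial',
  result: { total: 42.5, tax: 3.4 },
  metadata: {
    findings: [
      { schema_key: 'total', snippet: 'Total 42.50' },
      { schema_key: 'total', snippet: 'Amount due 45.00' },
      { schema_key: 'tax', snippet: 'Tax 3.40' },
    ],
    errors: [{ schema_key: 'total', message: 'Multiple candidate values' }], // `message` is an assumed field name
  },
};

const triage = triageExtraction(ambiguous);
for (const item of triage.review) {
  console.log(`Review ${item.key}: ${item.evidence.length} candidate snippet(s)`);
}
```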

OpenAPI Schema and Tools

Struct PDF publishes a standard OpenAPI schema, which means you can plug the API into tools that understand Swagger and OpenAPI without hand-writing the full contract yourself. That includes API explorers, code generators, typed clients, and internal developer tooling.

The OpenAPI document is available in the live API Reference, alongside the machine-readable schema.

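import { readFile } from 'node:fs/promises';
import SwaggerClient from 'swagger-client';

// Load the published OpenAPI document and attach the API key to every request.
const client = await SwaggerClient({
  url: 'https://api.structpdf.com/openapi.json',
  requestInterceptor: (request) => {
    request.headers.Authorization = `Bearer ${process.env.STRUCTPDF_API_KEY}`;
    return request;
  },
});

// Example input for Node.js 18+; in a browser, use a File from a file input.
const file = new Blob([await readFile('receipt.pdf')], { type: 'application/pdf' });

// Inline JSON schema, serialized to a string for the `schema` field.
const schema = JSON.stringify({
  type: 'object',
  properties: {
    total: { type: 'number' },
    tax: { type: 'number' },
  },
});

// The client builds the multipart body from these values, so no FormData is
// needed. The `default` tag and `extract` operation name must match the
// published OpenAPI document. If that document is OpenAPI 3.x, swagger-client
// expects the fields as `extract({}, { requestBody: { file, schema } })`.
const response = await client.apis.default.extract({ file, schema });

console.log(response.body);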

Zod-based Schema Format

Struct PDF works well with schemas that originate from Zod. That makes it easier to keep your extraction shape close to the validation rules you already use in your app, then convert that shape into the JSON schema sent to the API or managed through the Schema Builder.

You can define a schema in Zod, convert it to JSON schema, and send the result directly to the Extract API:

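import { readFile } from 'node:fs/promises';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// Shape of the data to extract from a receipt.
const ReceiptSchema = z.object({
  guest_count: z.number(),
  tax: z.number(),
  total: z.number(),
  tip: z.number(),
  subtotal: z.number(),
});

// Convert the Zod schema to JSON schema. Passing a name places the schema
// under `definitions` and references it through a top-level `$ref`.
const extractionSchema = zodToJsonSchema(ReceiptSchema, 'ReceiptSchema');

// Example input for Node.js 18+; in a browser, use a File from a file input.
const file = new Blob([await readFile('receipt.pdf')], { type: 'application/pdf' });

const formData = new FormData();
formData.append('file', file, 'receipt.pdf');
formData.append('schema', JSON.stringify(extractionSchema));

// fetch sets the multipart Content-Type (with boundary) from the FormData body.
const response = await fetch('https://api.structpdf.com/v1/extract', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.STRUCTPDF_API_KEY}`,
  },
  body: formData,
});

console.log(await response.json());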

Live API Reference

Interactive testing, complete request details, and more language examples.