Overview

How the Extract API request and response flow works.

The Extract API accepts a document plus a schema definition and returns structured JSON that matches your schema.

Request Shape

Field         | Type   | Description
file          | File   | PDF or image upload sent as multipart/form-data.
schema        | String | Inline JSON schema for the extraction result.
schema_id     | UUID   | Saved schema ID from your Struct PDF account.
Authorization | Header | Bearer token header: Bearer <API_KEY>.
X-API-Key     | Header | Alternative API key header if you do not use Authorization.

You can provide either schema or schema_id. Supported uploads include PDFs and common image formats such as PNG and JPEG.
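
The request fields above can be assembled as in the following minimal sketch. It only builds the request and does not send it. The endpoint URL is the one used in the Zod example in this guide; the helper name buildExtractRequest and its option names are hypothetical, and sending the raw key as the X-API-Key value is an assumption.

```typescript
// Either an inline schema or a saved schema ID.
type SchemaSource =
  | { schema: object } // serialized into the "schema" form field
  | { schemaId: string }; // sent as the "schema_id" form field

// Either header style from the request table.
type Auth =
  | { bearer: string } // Authorization: Bearer <API_KEY>
  | { apiKey: string }; // X-API-Key: <API_KEY> (value format is an assumption)

interface ExtractRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: FormData };
}

// Hypothetical helper: builds the multipart request without sending it.
function buildExtractRequest(
  file: Blob,
  fileName: string,
  source: SchemaSource,
  auth: Auth,
): ExtractRequest {
  const body = new FormData();
  body.append('file', file, fileName);
  if ('schema' in source) {
    body.append('schema', JSON.stringify(source.schema));
  } else {
    body.append('schema_id', source.schemaId);
  }

  const headers: Record<string, string> =
    'bearer' in auth
      ? { Authorization: `Bearer ${auth.bearer}` }
      : { 'X-API-Key': auth.apiKey };

  // Content-Type is left unset so that fetch can add the multipart boundary.
  return {
    url: 'https://api.structpdf.com/v1/extract',
    init: { method: 'POST', headers, body },
  };
}

const request = buildExtractRequest(
  new Blob(['%PDF-1.4'], { type: 'application/pdf' }),
  'receipt.pdf',
  { schema: { type: 'object', properties: { total: { type: 'number' } } } },
  { bearer: 'test-key' },
);
```

Passing request.url and request.init to fetch would send the request.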

Response Shape

Field             | Type   | Description
generationId      | UUID   | Extraction request identifier.
success           | String | Overall extraction status.
result            | Object | Structured JSON matching your schema.
metadata.findings | Array  | Evidence and snippets for extracted fields.
metadata.errors   | Array  | Field-level extraction issues, if any.
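
One way to carry this shape into application code is a set of TypeScript types plus a runtime guard. This is a sketch, not the published contract: the field names come from the table above, but the guard assumes all five fields are always present, and the contents of each findings and errors entry beyond schema_key are left open because this guide does not specify them.

```typescript
type ExtractStatus = 'Complete' | 'Partial' | 'Fail';

interface Finding {
  schema_key: string; // dot-notation path such as "items.0.name"
  [extra: string]: unknown; // evidence and snippet fields, names not specified here
}

// Shape of an error entry is not specified in this guide, so it stays open.
type FieldError = Record<string, unknown>;

interface ExtractResponse<T = Record<string, unknown>> {
  generationId: string;
  success: ExtractStatus;
  result: T;
  metadata: { findings: Finding[]; errors: FieldError[] };
}

const STATUSES: readonly string[] = ['Complete', 'Partial', 'Fail'];

// Runtime check for parsed JSON before treating it as an ExtractResponse.
function isExtractResponse(value: unknown): value is ExtractResponse {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const meta = v.metadata;
  if (typeof meta !== 'object' || meta === null) return false;
  const m = meta as Record<string, unknown>;
  return (
    typeof v.generationId === 'string' &&
    typeof v.success === 'string' &&
    STATUSES.includes(v.success) &&
    'result' in v &&
    Array.isArray(m.findings) &&
    Array.isArray(m.errors)
  );
}
```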

Findings

The metadata.findings array follows the schema you provide. Each finding points back to a field in your requested output shape, so you can map extracted evidence directly to the same structure you expect in result.

Struct PDF uses dot notation in schema_key to describe where the evidence belongs:

Pattern       | Meaning                                | Example
flat field    | top-level field in your result         | total
nested object | field inside an object                 | address.city
array item    | field inside a specific array item     | items.0.name

That means the evidence format stays predictable even when your schema contains nested objects or arrays.

Requested shape | Example finding keys
customer.email  | customer.email
items[].price   | items.0.price, items.1.price

This makes it easier to:

  • connect extracted values back to UI fields
  • show evidence next to the exact field a user cares about
  • troubleshoot ambiguous extractions without guessing where a finding belongs
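
The dot-notation keys described above can be used roughly as follows. This sketch resolves a key against result, collapses array indices back to the requested shape, and groups findings by key so evidence can sit next to a UI field. All three helper names are hypothetical, the snippet field is an assumed name for the evidence text, and the code assumes field names themselves contain no dots.

```typescript
interface Finding {
  schema_key: string; // named in this guide
  snippet?: string; // assumed field name, for illustration only
}

// Walk a key such as "items.0.price" through a result object.
function getByKey(result: unknown, schemaKey: string): unknown {
  let current: unknown = result;
  for (const segment of schemaKey.split('.')) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Map "items.0.price" and "items.1.price" to the requested shape "items[].price".
function toShapeKey(schemaKey: string): string {
  let shape = '';
  for (const segment of schemaKey.split('.')) {
    if (/^\d+$/.test(segment)) {
      shape += '[]';
    } else {
      shape += shape ? `.${segment}` : segment;
    }
  }
  return shape;
}

// Group findings by schema_key for quick lookup per field.
function indexFindings(findings: Finding[]): Map<string, Finding[]> {
  const byKey = new Map<string, Finding[]>();
  for (const finding of findings) {
    const list = byKey.get(finding.schema_key) ?? [];
    list.push(finding);
    byKey.set(finding.schema_key, list);
  }
  return byKey;
}
```

A form field bound to items.0.price could then call indexFindings(findings).get('items.0.price') to fetch its evidence.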

Error Handling

The success field reports the overall extraction outcome:

Status   | Meaning
Complete | The requested fields were extracted without field-level issues.
Partial  | The API returned a usable result, but one or more fields also produced issues in metadata.errors.
Fail     | The extraction could not produce a usable result for the request.

Partial is the most common outcome other than Complete. For example, if a document contains several plausible values for the same field, Struct PDF may still return a result while recording an error for that schema key in metadata.errors. That lets you keep the successful parts of the extraction while surfacing what needs review.

Use metadata.errors together with metadata.findings when you need to:

  • detect fields that need manual review
  • explain why a value was not returned cleanly
  • handle ambiguous documents where several matches appear on the page
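
A review queue built from these two arrays might look like the sketch below. It is illustrative only: the guide says errors are recorded per schema key, so the sketch assumes each error entry exposes schema_key and a message string, and the triage helper and ReviewItem type are hypothetical names.

```typescript
type ExtractStatus = 'Complete' | 'Partial' | 'Fail';

interface Finding {
  schema_key: string;
}

// Assumed error shape: field key plus a human-readable message.
interface FieldError {
  schema_key: string;
  message: string;
}

interface ExtractResponse {
  success: ExtractStatus;
  result: Record<string, unknown>;
  metadata: { findings: Finding[]; errors: FieldError[] };
}

interface ReviewItem {
  schemaKey: string;
  messages: string[]; // every error recorded for this field
  evidenceCount: number; // findings available to show the reviewer
}

// Decide whether the result is usable and which fields need manual review.
function triage(response: ExtractResponse): { usable: boolean; review: ReviewItem[] } {
  const { findings, errors } = response.metadata;
  const items = new Map<string, ReviewItem>();

  for (const error of errors) {
    const item = items.get(error.schema_key) ?? {
      schemaKey: error.schema_key,
      messages: [],
      evidenceCount: findings.filter((f) => f.schema_key === error.schema_key).length,
    };
    item.messages.push(error.message);
    items.set(error.schema_key, item);
  }

  // Complete and Partial both carry a usable result; Fail does not.
  return { usable: response.success !== 'Fail', review: Array.from(items.values()) };
}
```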

OpenAPI Schema and Tools

Struct PDF publishes a standard OpenAPI schema, which means you can plug the API into tools that understand Swagger and OpenAPI without hand-writing the full contract yourself. That includes API explorers, code generators, typed clients, and internal developer tooling.

The OpenAPI document is available in the live API Reference, alongside the machine-readable schema.

ts
import SwaggerClient from 'swagger-client';

// Load the OpenAPI document and attach the API key to every outgoing request.
const client = await SwaggerClient({
  url: 'https://api.structpdf.com/openapi.json',
  requestInterceptor: (request) => {
    request.headers.Authorization = `Bearer ${process.env.STRUCTPDF_API_KEY}`;
    return request;
  },
});

// `file` is the document to extract from, as a File
// (for example, the selection from an <input type="file"> element).

// Inline JSON schema, serialized to a string for the schema field.
const schema = JSON.stringify({
  type: 'object',
  properties: {
    total: { type: 'number' },
    tax: { type: 'number' },
  },
});

// Call the extract operation with the same fields listed under Request Shape.
const response = await client.apis.default.extract({ file, schema });

console.log(response.body);
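
swagger-client exposes operations as client.apis[tag][operationId], with untagged operations grouped under default. To check which names a given OpenAPI document yields, you could enumerate its operations as in this sketch. The listOperations helper is hypothetical, it reports only the first tag of each operation, and the placeholder used when operationId is missing is not the ID swagger-client would generate.

```typescript
interface OperationObject {
  operationId?: string;
  tags?: string[];
}

type PathItem = Partial<Record<string, OperationObject>>;

interface OpenApiDoc {
  paths?: Record<string, PathItem>;
}

interface OperationInfo {
  tag: string;
  operationId: string;
  method: string;
  path: string;
}

// HTTP methods that can appear as operation keys on an OpenAPI path item.
const HTTP_METHODS = ['get', 'put', 'post', 'delete', 'options', 'head', 'patch', 'trace'];

// List every operation in a parsed OpenAPI document.
function listOperations(spec: OpenApiDoc): OperationInfo[] {
  const operations: OperationInfo[] = [];
  for (const [path, item] of Object.entries(spec.paths ?? {})) {
    for (const method of HTTP_METHODS) {
      const operation = item[method];
      if (!operation) continue;
      operations.push({
        tag: operation.tags?.[0] ?? 'default',
        operationId: operation.operationId ?? `${method} ${path}`, // placeholder only
        method: method.toUpperCase(),
        path,
      });
    }
  }
  return operations;
}
```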

Zod-based Schema Format

Struct PDF works well with schemas that originate from Zod. That makes it easier to keep your extraction shape close to the validation rules you already use in your app, then convert that shape into the JSON schema sent to the API or managed through the Schema Builder.

You can define a schema in Zod, convert it to JSON schema, and send the result directly to the Extract API:

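ts
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// Define the extraction shape once, in Zod.
const ReceiptSchema = z.object({
  guest_count: z.number(),
  tax: z.number(),
  total: z.number(),
  tip: z.number(),
  subtotal: z.number(),
});

// Convert to JSON schema. Passing a name places the object definition under
// definitions.ReceiptSchema and references it from the root via $ref.
const extractionSchema = zodToJsonSchema(ReceiptSchema, 'ReceiptSchema');

// `file` is the document to extract from, as a File or Blob.
const formData = new FormData();
formData.append('file', file, 'receipt.pdf');
formData.append('schema', JSON.stringify(extractionSchema));

// Leave Content-Type unset so fetch can add the multipart boundary.
const response = await fetch('https://api.structpdf.com/v1/extract', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.STRUCTPDF_API_KEY}`,
  },
  body: formData,
});

console.log(await response.json());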