Invoice Parsing API

Extract JSON data from invoice PDFs and images using your schema.

Try It Live

Extract structured JSON from the preloaded sample, then switch to your own document when you are ready.

Document: sample-invoice.pdf

Add fields, nested object fields, and lists of objects to build your schema.

Ready to try it on your own document?

We will save your file and schema so you can pick up where you left off after sign-in.

Use via API

Drop the same schema into your stack. Pick a language tab to copy a request that uses the sample document and starter schema.

Invoice Parsing API in TypeScript

Type-safe TypeScript example using fetch and a JSON schema. Drop into Next.js, Express, or any Node runtime.

typescript
import { readFileSync } from 'node:fs';

const schema = {
  "type": "object",
  "properties": {
    "vendor": {
      "type": "string",
      "description": "Vendor or issuing company name"
    },
    "customer": {
      "type": "string",
      "description": "Customer or billed company name"
    },
    "invoice_number": {
      "type": "string",
      "description": "Invoice number or external reference"
    },
    "invoice_date": {
      "type": "string",
      "description": "Invoice issue date"
    },
    "due_date": {
      "type": "string",
      "description": "Invoice due date"
    },
    "subtotal": {
      "type": "number",
      "description": "Subtotal before tax and fees"
    },
    "tax": {
      "type": "number",
      "description": "Tax amount"
    },
    "total": {
      "type": "number",
      "description": "Total amount due"
    }
  },
  "required": [
    "vendor",
    "customer",
    "invoice_number",
    "invoice_date",
    "due_date",
    "subtotal",
    "tax",
    "total"
  ]
};

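// Blob, FormData, and fetch are built-in globals in Node 18+.
// The global File class only arrived in Node 20, so a Blob is used here.
const file = new Blob([readFileSync('sample-invoice.pdf')], {
  type: 'application/pdf',
});

const formData = new FormData();
// The third argument sets the filename sent with the multipart part.
formData.set('file', file, 'sample-invoice.pdf');
formData.set('schema', JSON.stringify(schema));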

const response = await fetch('https://api.structpdf.com/v1/extract', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.STRUCTPDF_API_KEY}`,
  },
  body: formData,
});

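// Fail loudly on HTTP errors instead of parsing an error body as a result.
if (!response.ok) {
  throw new Error(`Extraction failed with HTTP ${response.status}`);
}

const data = await response.json();
console.log(data);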

Invoice Parsing API in JavaScript

Plain JavaScript / Node.js example using fetch. Works in Node 18+ and modern serverless runtimes.

javascript
const fs = require('node:fs');

const schema = {
  "type": "object",
  "properties": {
    "vendor": {
      "type": "string",
      "description": "Vendor or issuing company name"
    },
    "customer": {
      "type": "string",
      "description": "Customer or billed company name"
    },
    "invoice_number": {
      "type": "string",
      "description": "Invoice number or external reference"
    },
    "invoice_date": {
      "type": "string",
      "description": "Invoice issue date"
    },
    "due_date": {
      "type": "string",
      "description": "Invoice due date"
    },
    "subtotal": {
      "type": "number",
      "description": "Subtotal before tax and fees"
    },
    "tax": {
      "type": "number",
      "description": "Tax amount"
    },
    "total": {
      "type": "number",
      "description": "Total amount due"
    }
  },
  "required": [
    "vendor",
    "customer",
    "invoice_number",
    "invoice_date",
    "due_date",
    "subtotal",
    "tax",
    "total"
  ]
};

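async function main() {
  // Blob, FormData, and fetch are globals in Node 18+.
  // The global File class only arrived in Node 20, so a Blob is used here.
  const file = new Blob([fs.readFileSync('sample-invoice.pdf')], {
    type: 'application/pdf',
  });

  const formData = new FormData();
  // The third argument sets the filename sent with the multipart part.
  formData.set('file', file, 'sample-invoice.pdf');
  formData.set('schema', JSON.stringify(schema));

  const response = await fetch('https://api.structpdf.com/v1/extract', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.STRUCTPDF_API_KEY}`,
    },
    body: formData,
  });

  if (!response.ok) {
    throw new Error(`Extraction failed with HTTP ${response.status}`);
  }

  const data = await response.json();
  console.log(data);
}

// require() makes this a CommonJS file, where top-level await is unavailable,
// so the request runs inside an async function.
main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});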

Invoice Parsing API in Python

Python example using requests. Works on Python 3.8+ and integrates cleanly with Django, FastAPI, and ETL pipelines.

python
import json
import os
import requests

schema = {
  "type": "object",
  "properties": {
    "vendor": {
      "type": "string",
      "description": "Vendor or issuing company name"
    },
    "customer": {
      "type": "string",
      "description": "Customer or billed company name"
    },
    "invoice_number": {
      "type": "string",
      "description": "Invoice number or external reference"
    },
    "invoice_date": {
      "type": "string",
      "description": "Invoice issue date"
    },
    "due_date": {
      "type": "string",
      "description": "Invoice due date"
    },
    "subtotal": {
      "type": "number",
      "description": "Subtotal before tax and fees"
    },
    "tax": {
      "type": "number",
      "description": "Tax amount"
    },
    "total": {
      "type": "number",
      "description": "Total amount due"
    }
  },
  "required": [
    "vendor",
    "customer",
    "invoice_number",
    "invoice_date",
    "due_date",
    "subtotal",
    "tax",
    "total"
  ]
}

with open("sample-invoice.pdf", "rb") as fh:
    response = requests.post(
        "https://api.structpdf.com/v1/extract",
        headers={"Authorization": f"Bearer {os.environ['STRUCTPDF_API_KEY']}"},
        files={"file": ("sample-invoice.pdf", fh, "application/pdf")},
        data={"schema": json.dumps(schema)},
        timeout=60,  # requests has no default timeout; avoid hanging forever
    )

response.raise_for_status()
print(response.json())

Invoice Parsing API in Java

Java example using HttpClient. Works on Java 11+ and integrates with Spring Boot or any JVM service.

java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

public class ExtractExample {
  public static void main(String[] args) throws Exception {
    String schema = "{\"type\":\"object\",\"properties\":{\"vendor\":{\"type\":\"string\",\"description\":\"Vendor or issuing company name\"},\"customer\":{\"type\":\"string\",\"description\":\"Customer or billed company name\"},\"invoice_number\":{\"type\":\"string\",\"description\":\"Invoice number or external reference\"},\"invoice_date\":{\"type\":\"string\",\"description\":\"Invoice issue date\"},\"due_date\":{\"type\":\"string\",\"description\":\"Invoice due date\"},\"subtotal\":{\"type\":\"number\",\"description\":\"Subtotal before tax and fees\"},\"tax\":{\"type\":\"number\",\"description\":\"Tax amount\"},\"total\":{\"type\":\"number\",\"description\":\"Total amount due\"}},\"required\":[\"vendor\",\"customer\",\"invoice_number\",\"invoice_date\",\"due_date\",\"subtotal\",\"tax\",\"total\"]}";
    Path file = Path.of("sample-invoice.pdf");
    String boundary = UUID.randomUUID().toString();
    String CRLF = "\r\n";

    // Encode the text parts as UTF-8 explicitly; getBytes() without a charset
    // uses the platform default, which varies between machines.
    var utf8 = java.nio.charset.StandardCharsets.UTF_8;
    var body = new java.io.ByteArrayOutputStream();
    body.writeBytes(("--" + boundary + CRLF
        + "Content-Disposition: form-data; name=\"schema\"" + CRLF + CRLF
        + schema + CRLF).getBytes(utf8));
    body.writeBytes(("--" + boundary + CRLF
        + "Content-Disposition: form-data; name=\"file\"; filename=\"sample-invoice.pdf\"" + CRLF
        + "Content-Type: application/pdf" + CRLF + CRLF).getBytes(utf8));
    body.writeBytes(Files.readAllBytes(file));
    body.writeBytes((CRLF + "--" + boundary + "--" + CRLF).getBytes(utf8));

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("https://api.structpdf.com/v1/extract"))
        .header("Authorization", "Bearer " + System.getenv("STRUCTPDF_API_KEY"))
        .header("Content-Type", "multipart/form-data; boundary=" + boundary)
        .POST(HttpRequest.BodyPublishers.ofByteArray(body.toByteArray()))
        .build();

    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body());
  }
}

Invoice Parsing API in Go

Go example using net/http. Idiomatic Go HTTP client suitable for microservices and serverless functions.

go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

func main() {
	schema := `{"type":"object","properties":{"vendor":{"type":"string","description":"Vendor or issuing company name"},"customer":{"type":"string","description":"Customer or billed company name"},"invoice_number":{"type":"string","description":"Invoice number or external reference"},"invoice_date":{"type":"string","description":"Invoice issue date"},"due_date":{"type":"string","description":"Invoice due date"},"subtotal":{"type":"number","description":"Subtotal before tax and fees"},"tax":{"type":"number","description":"Tax amount"},"total":{"type":"number","description":"Total amount due"}},"required":["vendor","customer","invoice_number","invoice_date","due_date","subtotal","tax","total"]}`

	file, err := os.Open("sample-invoice.pdf")
	if err != nil {
		panic(err)
	}
	defer file.Close()

	var body bytes.Buffer
	writer := multipart.NewWriter(&body)
	if err := writer.WriteField("schema", schema); err != nil {
		panic(err)
	}

	part, err := writer.CreateFormFile("file", "sample-invoice.pdf")
	if err != nil {
		panic(err)
	}
	if _, err := io.Copy(part, file); err != nil {
		panic(err)
	}
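	// Close writes the final multipart boundary; the body is incomplete without it.
	if err := writer.Close(); err != nil {
		panic(err)
	}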

	req, err := http.NewRequest("POST", "https://api.structpdf.com/v1/extract", &body)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("STRUCTPDF_API_KEY"))
	req.Header.Set("Content-Type", writer.FormDataContentType())

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	out, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}

Frequently Asked Questions

What does the Invoice Parsing API return?

Send an invoice PDF or image plus a JSON schema and get back structured JSON shaped exactly like your schema, alongside per-field findings, snippets, and page references that explain where each value came from.

Which file formats are supported?

PDFs and common image formats all flow through the same extraction endpoint, so scans, mobile photos, and digital invoices can be ingested in one call.

Format       Typical source
PDF          Digital invoices, vendor exports
PNG          Screenshots, exported images
JPG / JPEG   Scanned invoices, photos
HEIC         iPhone camera uploads
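
When uploads arrive in mixed formats, the content type sent with each file can be derived from its extension. The sketch below maps the formats listed above to their standard MIME types. The helper names are hypothetical, and whether the API inspects the declared content type is an assumption.

```python
from pathlib import Path

# Standard MIME types for the supported formats.
# Assumption: the API accepts these as the multipart content type.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".heic": "image/heic",
}


def content_type_for(path):
    """Return the MIME type for a supported file, or raise ValueError."""
    suffix = Path(path).suffix.lower()  # case-insensitive: .JPG == .jpg
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}") from None


def multipart_file_field(path, fh):
    """Build the (filename, fileobj, content_type) tuple that requests accepts."""
    return (Path(path).name, fh, content_type_for(path))
```

The tuple from multipart_file_field can be passed as files={"file": ...} in the Python request shown earlier, replacing the hard-coded PDF entry.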

How is usage priced?

Extractions are billed per page in credits. There is no separate parse charge before extraction — the credit you see is the credit you pay.

Extraction   Credit usage
Simple       1 credit per page
Complex      2 credits per page
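
As a rough way to budget, credit usage is pages multiplied by the per-page rate. The sketch below applies the two rates from the table. The function names are hypothetical, and what makes an extraction simple or complex is not defined here, so the caller supplies that label.

```python
# Per-page rates from the table above.
CREDITS_PER_PAGE = {"simple": 1, "complex": 2}


def credits_for(pages, kind="simple"):
    """Credits used by one document with the given page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * CREDITS_PER_PAGE[kind]


def credits_for_batch(documents):
    """Total credits for an iterable of (pages, kind) pairs."""
    return sum(credits_for(pages, kind) for pages, kind in documents)


# A 3-page simple invoice, a 10-page complex one, and a 1-page simple one:
# 3*1 + 10*2 + 1*1 = 24 credits.
print(credits_for_batch([(3, "simple"), (10, "complex"), (1, "simple")]))
```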

Is there a free tier?

Yes. Sign up for a free account and run extractions on the included monthly credits with no credit card required. Upgrade to Pay-As-You-Go or a subscription only when your volume grows.

Can I customize the fields I extract?

Yes. Edit the schema in the playground or build it visually in the Schema Builder. Schemas can include nested objects and line-item arrays such as items[]. Save a schema once and reuse it in production via schema_id.
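
A schema with a nested object and an items[] list might look like the sketch below, written in the same JSON Schema style as the starter schema. The nested field names are illustrative, and the helper schema_form_fields is hypothetical. Sending a saved schema as a form field named schema_id is an assumption about how the parameter is passed; check the API reference for the exact form.

```python
import json

schema = {
    "type": "object",
    "properties": {
        "invoice_number": {
            "type": "string",
            "description": "Invoice number or external reference",
        },
        # Nested object field (illustrative field names).
        "vendor": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "description": "Vendor name"},
                "address": {"type": "string", "description": "Vendor address"},
            },
        },
        # List of objects: one entry per invoice line item.
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unit_price": {"type": "number"},
                    "amount": {"type": "number"},
                },
            },
        },
    },
    "required": ["invoice_number", "items"],
}


def schema_form_fields(schema=None, schema_id=None):
    """Return the non-file form fields for a request: inline schema or saved ID."""
    if (schema is None) == (schema_id is None):
        raise ValueError("Pass exactly one of schema or schema_id")
    if schema_id is not None:
        # Assumption: a saved schema is referenced by a form field named schema_id.
        return {"schema_id": schema_id}
    return {"schema": json.dumps(schema)}
```

In the Python request shown earlier, data=schema_form_fields(schema=schema) reproduces the inline form, while data=schema_form_fields(schema_id="your-saved-schema-id") would reference a saved schema under the assumption above.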

How accurate is extraction on messy invoices?

The API is tuned for multi-vendor invoices, varied layouts, and noisy scans. Every response also includes metadata.findings and field-level metadata.errors so you can route edge cases to manual review without losing the fields that extracted cleanly.
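
Routing edge cases to manual review could be sketched as below. Only the names metadata.findings and metadata.errors come from the description above. The response layout is an assumption made for illustration: extracted values under a top-level data key, and each error as an object with field and message keys. The function name and the sample response are made up.

```python
def split_for_review(response):
    """Separate cleanly extracted fields from fields flagged in metadata.errors.

    Assumed layout (not confirmed by the API docs):
      response["data"]            -> dict of extracted values
      response["metadata"]["errors"] -> list of {"field": ..., "message": ...}
    """
    values = response.get("data", {})
    errors = response.get("metadata", {}).get("errors", [])

    # Map each flagged field name to its error message.
    flagged = {e["field"]: e.get("message", "") for e in errors if "field" in e}

    # Keep every field that was not flagged.
    clean = {name: value for name, value in values.items() if name not in flagged}
    return clean, flagged


# Made-up response used only to exercise the function.
sample = {
    "data": {"vendor": "Acme Corp", "total": 120.5, "due_date": None},
    "metadata": {
        "findings": [],
        "errors": [{"field": "due_date", "message": "not found"}],
    },
}

clean, flagged = split_for_review(sample)
print(clean)    # fields safe to store directly
print(flagged)  # fields to send to a review queue
```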

Do I have to sign up to try it?

No. Run the demo on a sample invoice without an account. When you upload your own document, we ask you to sign in with Google or a magic link to run your first extraction. Your file and schema are saved automatically so you pick up where you left off.

Pricing

Start on the free plan, then continue with Pay-As-You-Go or a subscription as your volume grows.

Free
Get started with 250 credits every month
Free
  • 250 credits/month
  • No credit card required
  • Schema builder (save requires upgrade)

Starter
For getting started and light usage
$19.00/month
  • Up to 1,000 extractions
  • $0.019 per credit
  • 1,000 credits
  • Save and reuse extraction schemas

Developer (Popular)
For individual developers and small projects
$39.00/month
  • Up to 2,500 extractions
  • $0.0156 per credit
  • 2,500 credits
  • Save and reuse extraction schemas

Pro
For professional teams and heavier usage
$99.00/month
  • Up to 8,000 extractions
  • $0.0124 per credit
  • 8,000 credits
  • Save and reuse extraction schemas
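
The per-credit figures above are the monthly price divided by the included credits. The sketch below reproduces that arithmetic and picks the smallest plan that covers a monthly credit estimate. The function names are hypothetical, and Pay-As-You-Go rates are not listed here, so volumes above the Pro allowance return None instead of a guess.

```python
# (name, monthly price in USD, included credits) from the plans above.
PLANS = [
    ("Free", 0.0, 250),
    ("Starter", 19.0, 1000),
    ("Developer", 39.0, 2500),
    ("Pro", 99.0, 8000),
]


def price_per_credit(monthly_price, credits):
    """Effective price of one credit when the full allowance is used."""
    return monthly_price / credits


def smallest_plan_for(credits_needed):
    """Name of the smallest listed plan covering the volume, or None."""
    for name, _price, credits in PLANS:  # PLANS is ordered by allowance
        if credits_needed <= credits:
            return name
    return None


for name, price, credits in PLANS[1:]:
    print(name, round(price_per_credit(price, credits), 4))
print(smallest_plan_for(2400))
```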