POST any document + a JSON schema. Get back validated, structured data. No training, no templates, no setup. Works on invoices, contracts, receipts, forms — anything.
curl -X POST https://extractapi.net/v1/extract \
  -H "Authorization: Bearer ex_your_api_key" \
  -F "file=@invoice.pdf" \
  -F 'schema={
    "invoice_number": "string",
    "vendor_name": "string",
    "total_amount": "number",
    "due_date": "string",
    "line_items": [{"description": "string", "amount": "number"}]
  }'
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "data": {
    "invoice_number": "INV-2026-0142",
    "vendor_name": "Acme Corporation",
    "total_amount": 1250.00,
    "due_date": "2026-04-15",
    "line_items": [
      { "description": "Consulting services", "amount": 1000.00 },
      { "description": "Travel expenses", "amount": 250.00 }
    ]
  },
  "meta": {
    "pages_processed": 1,
    "latency_ms": 2340,
    "pages_remaining": 49
  }
}
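For callers who prefer Python over curl, the same request can be sketched with only the standard library. The endpoint, the `file` and `schema` form fields, and the Bearer header come from the curl example above; the helper names `build_extract_request` and `extract`, the 60-second timeout, and the `application/octet-stream` content type for the file part are assumptions made for illustration, not documented behavior.

```python
import json
import urllib.request
import uuid

API_URL = "https://extractapi.net/v1/extract"  # endpoint from the curl example


def build_extract_request(file_bytes, filename, schema, api_key):
    """Build (but do not send) a multipart POST equivalent to the curl call."""
    boundary = uuid.uuid4().hex
    parts = [
        # "file" part: the raw document bytes (content type is an assumption)
        (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            'Content-Type: application/octet-stream\r\n\r\n'
        ).encode() + file_bytes + b"\r\n",
        # "schema" part: the schema serialized as JSON text
        (
            f'--{boundary}\r\n'
            'Content-Disposition: form-data; name="schema"\r\n\r\n'
            f'{json.dumps(schema)}\r\n'
        ).encode(),
        # closing boundary
        f'--{boundary}--\r\n'.encode(),
    ]
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def extract(file_bytes, filename, schema, api_key, timeout=60):
    """Send the request and return the parsed JSON response body."""
    request = build_extract_request(file_bytes, filename, schema, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `extract(open("invoice.pdf", "rb").read(), "invoice.pdf", schema, "ex_your_api_key")` would then be expected to return a dictionary shaped like the response above, with `request_id`, `data`, and `meta` keys.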
One endpoint. Any document. Structured output.
PDF, images, DOCX, TXT. Invoices, contracts, receipts, medical forms, tax documents — no pre-training required.
Define your JSON schema, get exactly those fields back. Nested objects, arrays, typed values — all validated.
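If you want to double-check a response on your side, a small client-side check against the same shorthand schema might look like the sketch below. It only understands the forms used in the example request (`"string"`, `"number"`, nested objects, and one-element arrays), and the function name `conforms` is hypothetical. How the API represents a field it cannot find (for example as `null`) is not stated here, so this sketch simply treats anything other than the requested type as a mismatch.

```python
def conforms(value, schema):
    """Return True if `value` matches the shorthand `schema`."""
    if schema == "string":
        return isinstance(value, str)
    if schema == "number":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if isinstance(schema, list):
        # A one-element list describes the shape of every array item
        return (
            len(schema) == 1
            and isinstance(value, list)
            and all(conforms(item, schema[0]) for item in value)
        )
    if isinstance(schema, dict):
        # Require exactly the requested fields, each matching its sub-schema
        return (
            isinstance(value, dict)
            and value.keys() == schema.keys()
            and all(conforms(value[key], sub) for key, sub in schema.items())
        )
    return False  # unknown type name in the schema
```

Calling `conforms(response["data"], schema)` with the example schema and response from the top of the page checks every field, including each entry of `line_items`.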
No templates, no training data, no config files. Send a document you've never seen before and get structured data back.
REST API with Bearer auth. OpenAPI spec at /openapi.json. Works with LangChain, CrewAI, AutoGen, or any HTTP client.
Upload up to 20 files with one schema in a single request. Perfect for processing folders of invoices, statements, or forms.
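For folders larger than one batch, the client has to group files before uploading. The sketch below shows only that grouping step, treating the 20 files mentioned above as a per-request cap (an assumption). How several files are attached to one request is not described on this page, so the upload itself is left out, and the `batched` helper name is hypothetical.

```python
from itertools import islice

MAX_FILES_PER_REQUEST = 20  # figure quoted above; treated here as a cap


def batched(paths, size=MAX_FILES_PER_REQUEST):
    """Yield successive lists of at most `size` paths."""
    iterator = iter(paths)
    # islice pulls the next `size` items; an empty list ends the loop
    while batch := list(islice(iterator, size)):
        yield batch
```

A folder of 45 invoices would be split into three groups of 20, 20, and 5 files, each sent with the same schema.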
Track pages processed, latency, and remaining quota via the /v1/usage endpoint. Build billing into your own product.
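The response format of `/v1/usage` is not shown on this page, so the sketch below instead aggregates the `meta` block that each extraction response carries (`pages_processed`, `latency_ms`, `pages_remaining`, as in the example response). The function name `summarize_usage` is hypothetical, and reading `pages_remaining` as the quota left after that request is an assumption based on the field name.

```python
def summarize_usage(responses):
    """Aggregate the `meta` blocks of extraction responses, in call order."""
    metas = [response["meta"] for response in responses]
    total_latency = sum(meta["latency_ms"] for meta in metas)
    return {
        "requests": len(metas),
        "pages_processed": sum(meta["pages_processed"] for meta in metas),
        "avg_latency_ms": total_latency / len(metas) if metas else 0.0,
        # The most recent response carries the freshest quota figure
        "pages_remaining": metas[-1]["pages_remaining"] if metas else None,
    }
```

Totals like these could feed a per-customer billing record in your own product.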
Invoices: vendor, amount, line items, dates, tax, from any format or layout.
Contracts: parties, clauses, obligations, effective dates, termination terms.
Bank statements: transactions, balances, account numbers, date ranges.
Medical records: patient info, diagnoses, medications, lab values, dates.
Tax and government forms: W-2s, 1099s, tax returns, permits, applications, or any other form.
Real estate documents: leases, appraisals, inspection reports, title documents.
Shipping documents: bills of lading, packing slips, customs declarations.
If it's a document with data, ExtractAPI can extract it. No limits on document types.
Private onboarding and controlled rollout.