Extract invoices to structured JSON

Invoices document what was sold, how much is owed, and when payment is due. Every vendor formats invoices differently, from accounting software exports to manually prepared PDFs. Sensible converts invoice data into structured JSON for accounts payable automation and three-way matching.

Why invoices defeat one-size-fits-all extraction

Vendor formats are effectively unlimited, line item tables carry taxes and discounts, and header fields sit in a different place on every invoice. Hybrid extraction handles the variation; deterministic validation catches calculation errors.

Vendor Format Diversity

QuickBooks exports, SAP invoices, handwritten bills, and custom ERP outputs each place fields differently. LLM parsing reads any layout; SenseML rules map the result to your AP schema consistently.

Line Item Table Extraction

Invoice tables span multiple pages, merge cells, wrap descriptions across lines, and mix tax codes. Sensible extracts each line item with its quantity, unit price, tax, and total intact, including when a table continues across a page break.

Calculation Validation

Do line items sum to the subtotal? Does the tax match the rate? Sensible validates these calculations and flags discrepancies with confidence scores. Your AP team reviews exceptions, not every invoice.
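
The checks described above can be sketched in a few lines of Python. This is an illustrative sketch, not Sensible's validation engine: the function name `check_invoice_math`, the dictionary key `extended_amount`, and the one-cent tolerance are all assumptions chosen for the example.

```python
from decimal import Decimal

CENT = Decimal("0.01")  # assumed tolerance: differences up to one cent pass


def check_invoice_math(line_items, subtotal, tax_total, grand_total,
                       tax_rate=None, shipping=Decimal("0")):
    """Return a list of discrepancy messages; an empty list means the math agrees."""
    problems = []

    # Line items should sum to the subtotal.
    line_sum = sum((item["extended_amount"] for item in line_items), Decimal("0"))
    if abs(line_sum - subtotal) > CENT:
        problems.append(f"subtotal: line items sum to {line_sum}, invoice says {subtotal}")

    # Tax should match the rate, when the rate is known.
    if tax_rate is not None:
        expected_tax = (subtotal * tax_rate).quantize(CENT)
        if abs(expected_tax - tax_total) > CENT:
            problems.append(f"tax: expected {expected_tax}, invoice says {tax_total}")

    # Grand total should equal subtotal plus tax plus shipping.
    expected_total = subtotal + tax_total + shipping
    if abs(expected_total - grand_total) > CENT:
        problems.append(f"grand total: expected {expected_total}, invoice says {grand_total}")

    return problems
```

Using `Decimal` rather than `float` keeps currency arithmetic exact, so a flagged discrepancy reflects the invoice and not binary rounding.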

Fields we extract

Default fields cover three-way matching. Configure additional fields for your AP automation workflow.

Header

Invoice number, invoice date, due date, PO number, vendor name, vendor address, bill-to name, bill-to address, payment terms

Line items

Description, quantity, unit price, unit of measure, discount, tax amount, extended amount, SKU/part number

Totals

Subtotal, discount total, tax total, shipping/freight, grand total, amount paid, balance due, currency


{ /* SenseML: invoice extraction */
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            // Invoice number
            "id": "invoice_number",
            "description": "invoice number, invoice #, invoice no"
          },
          {
            // Invoice date
            "id": "invoice_date",
            "description": "invoice date, date of invoice, bill date",
            "type": { "id": "date" }
          },
          {
            // Grand total amount due
            "id": "total_amount",
            "description": "total amount due, grand total, total, amount due",
            "type": { "id": "currency" }
          },
          {
            // Vendor name
            "id": "vendor_name",
            "description": "vendor name, from, billed by, seller name"
          }
          // Additional fields for line items, PO number, due date, tax, etc.
        ]
      }
    }
  ]
}
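
Downstream AP code usually wants plain values keyed by the field IDs in a config like the one above. The Python sketch below flattens an extraction result into such a dictionary. It assumes each extracted field arrives as an object with a `value` key and that a field that was not found arrives as null; that shape, the helper name `flatten_fields`, and the sample values are assumptions for illustration, so check the actual API response before relying on them.

```python
def flatten_fields(parsed_document):
    """Map each field ID to its bare value, or None when the field was not found."""
    flat = {}
    for field_id, field in parsed_document.items():
        # Assumed shape: {"type": ..., "value": ...} per field, or None if missing.
        flat[field_id] = field.get("value") if isinstance(field, dict) else None
    return flat


# Made-up sample result using the field IDs from the config above.
sample = {
    "invoice_number": {"type": "string", "value": "INV-1042"},
    "invoice_date": {"type": "date", "value": "2024-03-01T00:00:00.000Z"},
    "total_amount": {"type": "currency", "value": 1280.5, "unit": "$"},
    "vendor_name": None,
}
flat = flatten_fields(sample)
```
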
Freight Invoice

Shipping and transportation invoice with line haul, fuel surcharge, and accessorials.

Utility Invoice

Monthly utility bill formatted as an invoice with usage and rate detail.

Construction Invoice

Progress billing invoice for construction projects with retention and change orders.

Medical Invoice

Healthcare provider invoice with procedure codes and insurance adjustments.

Standard Invoice

Common invoice format from accounting software like QuickBooks or Xero.

Supported invoice formats

Sensible processes invoices from any vendor, any accounting system, any country. New formats can be configured in hours. The extraction logic is explicit in SenseML, not hidden in prompt tuning.

By source

QuickBooks, Xero, NetSuite, SAP, Oracle, FreshBooks, Wave, custom/manual invoices

By type

Standard invoices, credit memos, debit notes, proforma invoices, recurring invoices, construction progress billing

Common Questions

Answers about vendor format support, line item extraction, and calculation validation.

Can Sensible help with PO matching?

Sensible extracts PO reference numbers from invoices. You can match these against purchase orders processed through Sensible to verify quantities, pricing, and terms programmatically.
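
As a rough sketch of what programmatic matching can look like once both documents are extracted, the Python below compares invoice lines against purchase order lines. It assumes both sides carry a `sku`, `quantity`, and `unit_price` per line and that SKUs are unique within a PO; the function name `match_invoice_to_po` and these key names are hypothetical, not part of Sensible's API.

```python
from decimal import Decimal


def match_invoice_to_po(invoice_lines, po_lines, price_tolerance=Decimal("0.00")):
    """Return (sku, reason) pairs for invoice lines that disagree with the PO."""
    # Index PO lines by SKU (assumes one line per SKU on the purchase order).
    po_by_sku = {line["sku"]: line for line in po_lines}
    exceptions = []

    for line in invoice_lines:
        po_line = po_by_sku.get(line["sku"])
        if po_line is None:
            exceptions.append((line["sku"], "not on purchase order"))
            continue
        if line["quantity"] > po_line["quantity"]:
            exceptions.append((line["sku"], "quantity exceeds PO"))
        if abs(line["unit_price"] - po_line["unit_price"]) > price_tolerance:
            exceptions.append((line["sku"], "unit price differs from PO"))

    return exceptions
```

An empty result means every invoice line was ordered, at or under the ordered quantity, at the agreed price. A full three-way match would repeat the quantity check against the goods receipt.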

Does Sensible validate invoice calculations?

Yes. Validation rules cross-check that line item totals sum to the subtotal, that tax calculations are correct, and that the grand total matches the subtotal plus tax and other charges. Discrepancies are flagged automatically.

How does Sensible extract line items from invoices?

Sensible extracts each line item with description, quantity, unit price, and total. It handles multi-page tables, subtotals, tax amounts, and grand totals across any invoice layout.

Can Sensible handle invoices from any vendor?

Yes. Sensible processes invoices regardless of format or vendor. Common fields like invoice number, date, due date, vendor name, and totals are extracted from any layout.

Do you support webhooks?

Yes. Sensible sends extraction results to your webhook endpoint when processing completes. You can also poll the API for status.

Does Sensible support human review?

Yes. Sensible flags extractions with low confidence for human review. You can configure review thresholds and workflows.
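
On the consuming side, threshold-based routing can be sketched as below. The example assumes each field comes with a numeric confidence between 0 and 1, which is an assumption about the data you hold rather than a description of Sensible's output; the function name `fields_needing_review` and the 0.9 default are likewise illustrative.

```python
def fields_needing_review(confidences, threshold=0.9, required=()):
    """Return sorted field names that fall below the threshold or are missing."""
    # Flag fields whose (assumed numeric) confidence is under the threshold.
    flagged = {name for name, score in confidences.items() if score < threshold}
    # Flag required fields that were not extracted at all.
    flagged.update(name for name in required if name not in confidences)
    return sorted(flagged)
```

An invoice with an empty result can go straight through; anything else is queued with the flagged field names so a reviewer checks only those values.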

What security certifications does Sensible have?

Sensible is SOC 2 Type II certified and HIPAA compliant. Data is encrypted in transit and at rest.

How long is document data retained?

Document data is stored indefinitely by default. Custom retention policies are available and can be configured for same-day deletion if needed.

Is there a free trial?

Yes. Sensible offers a 14-day free trial on the Growth plan. No credit card required to start.

How is pricing structured?

Sensible uses per-document pricing for predictable costs. No token-based billing or usage surprises. Volume discounts are available for higher throughput.

How do I integrate with Sensible?

Sensible provides REST APIs and SDKs for Python and Node.js. Most integrations take a few hours. Webhooks, Zapier, and direct API calls are all supported.

What file formats does Sensible support?

Sensible processes PDFs (native or scanned), Microsoft Word (DOC, DOCX), spreadsheets (XLSX, XLS, CSV), single-page images (JPEG, PNG), multi-page images (TIFF), and email bodies with attachments.

How accurate is the extraction?

Accuracy depends on document quality and configuration. Most production deployments achieve 95%+ accuracy with proper validation rules and confidence signals.

How fast is document processing?

Processing speed depends on document size, page count, OCR requirements, and which extraction methods are used. Simple single-page documents process in seconds. Larger or more complex documents that use LLM-based extraction take longer.