Extract claims forms to structured JSON

Claims forms such as the CMS-1500, UB-04, and ADA Dental Claim Form pack hundreds of fields into dense gridded layouts for healthcare reimbursement. A single wrong character can mean a denied claim. Use Sensible to get structured claims data for adjudication, compliance, and analytics.

Why claims forms challenge extraction tools

Dense field grids, tiny print, and strict positional requirements make claims forms demanding.

Dense Field Grid Parsing

Thirty-three numbered boxes on a single CMS-1500 page. Sub-fields, checkboxes, and multi-line entries crammed into tiny cells. Sensible maps each box position precisely and anchors extraction to the form's physical layout.

Checkbox and Code Accuracy

ICD-10 codes, CPT codes, and NPI numbers are short sequences where one wrong character changes the clinical meaning or points to the wrong provider. Validation rules check format compliance on every code field, catching OCR misreads before they propagate.
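As a rough illustration of what format checks on code fields can look like, here is a minimal Python sketch. It is not Sensible's validation engine: the function names are hypothetical, the ICD-10-CM and CPT/HCPCS regular expressions only approximate the code shapes (they do not check that a code exists in the code set), and the NPI check assumes the standard Luhn check digit computed over the NPI with the 80840 prefix.

```python
import re

# Approximate ICD-10-CM shape: letter, digit, alphanumeric, optional dot,
# then up to four more alphanumeric characters.
ICD10_RE = re.compile(r"[A-Z][0-9][0-9A-Z](\.?[0-9A-Z]{1,4})?")
# Approximate CPT/HCPCS shape: five digits, four digits plus a letter,
# or a letter plus four digits.
PROCEDURE_RE = re.compile(r"[0-9]{4}[0-9A-Z]|[A-Z][0-9]{4}")


def is_valid_icd10(code: str) -> bool:
    """Check the shape of an ICD-10-CM code (format only)."""
    return ICD10_RE.fullmatch(code) is not None


def is_valid_procedure_code(code: str) -> bool:
    """Check the shape of a CPT or HCPCS Level II code (format only)."""
    return PROCEDURE_RE.fullmatch(code) is not None


def is_valid_npi(npi: str) -> bool:
    """Check that an NPI is 10 digits and passes the Luhn check with prefix 80840."""
    if re.fullmatch(r"[0-9]{10}", npi) is None:
        return False
    digits = [int(c) for c in "80840" + npi]
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

A check like this catches common OCR confusions, for example a leading letter O read as the digit 0 in a diagnosis code, or a single misread digit in an NPI.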

Form Type Detection

CMS-1500 for professional claims. UB-04 for institutional. ADA for dental. Each form has its own box numbering and field semantics. Sensible detects the form type and applies the correct extraction configuration automatically.
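To make the idea of form type detection concrete, here is a minimal Python sketch that classifies a page from its OCR text. It is an assumption-laden illustration, not how Sensible detects forms: the `FINGERPRINTS` table, the phrases in it, and `detect_form_type` are all hypothetical, and a production classifier would need to tolerate OCR noise.

```python
from typing import Dict, List, Optional

# Hypothetical phrases that tend to appear on each form type.
FINGERPRINTS: Dict[str, List[str]] = {
    "cms_1500": ["HEALTH INSURANCE CLAIM FORM", "1500"],
    "ub_04": ["UB-04", "CMS-1450"],
    "ada_dental": ["ADA", "DENTAL CLAIM FORM"],
}


def detect_form_type(page_text: str) -> Optional[str]:
    """Return the first form type whose phrases all appear in the text, else None."""
    upper = page_text.upper()
    for form_type, phrases in FINGERPRINTS.items():
        if all(phrase in upper for phrase in phrases):
            return form_type
    return None
```

Once a type is detected, the matching extraction configuration can be selected by that key.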

Fields we extract

Box-level extraction maps to your claims adjudication schema. Add custom validation rules as needed.

Patient and provider

Patient name, DOB, member ID, group number, provider name, NPI, tax ID, referring provider, facility name/NPI

Diagnosis and procedure

ICD-10 codes (up to 12 on the CMS-1500), CPT/HCPCS codes, modifiers, place of service, dates of service, units, charges

Billing and authorization

Total charges, amount paid, balance due, prior authorization number, assignment of benefits, signature, employer information


{
  /* SenseML: CMS-1500 claims form extraction */
  "fields": [
    {
      "method": {
        "id": "queryGroup",
        "queries": [
          {
            // Patient name (Box 2)
            "id": "patient_name",
            "description": "patient name, patient's name, box 2"
          },
          {
            // Primary diagnosis ICD-10 code (Box 21)
            "id": "primary_diagnosis",
            "description": "primary diagnosis code, ICD-10, diagnosis 1, box 21",
            "type": {
              "id": "custom",
              // Third character may be a letter (for example C4A.0)
              "pattern": "[A-Z][0-9][0-9A-Z]\\.?[0-9A-Z]{0,4}"
            }
          },
          {
            // Total charges (Box 28)
            "id": "total_charges",
            "description": "total charge, total charges, box 28",
            "type": { "id": "currency" }
          },
          {
            // Provider NPI (Box 33a)
            "id": "billing_npi",
            "description": "billing provider NPI, NPI number, box 33a",
            "type": {
              "id": "custom",
              "pattern": "[0-9]{10}"
            }
          }
          // Additional fields for CPT codes, service dates, modifiers, etc.
        ]
      }
    }
  ]
}
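
As a sketch of how output from a config like the one above might be mapped to an adjudication schema, the Python below converts extracted fields into a typed record. It assumes the extraction result is a dictionary keyed by field ID, where each entry is either null or an object with a `value` key; verify that shape against the API response you actually receive. `ClaimHeader`, `field_value`, and `to_claim_header` are hypothetical names.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Dict, Optional


@dataclass
class ClaimHeader:
    patient_name: Optional[str]
    primary_diagnosis: Optional[str]
    total_charges: Optional[Decimal]
    billing_npi: Optional[str]


def field_value(parsed: Dict[str, Any], field_id: str) -> Any:
    """Return the field's value, or None if the field is missing or null."""
    field = parsed.get(field_id)
    return None if field is None else field.get("value")


def to_claim_header(parsed: Dict[str, Any]) -> ClaimHeader:
    """Map extracted fields (assumed shape) to the adjudication record."""
    charges = field_value(parsed, "total_charges")
    return ClaimHeader(
        patient_name=field_value(parsed, "patient_name"),
        primary_diagnosis=field_value(parsed, "primary_diagnosis"),
        # Convert through str so float artifacts do not leak into Decimal.
        total_charges=None if charges is None else Decimal(str(charges)),
        billing_npi=field_value(parsed, "billing_npi"),
    )
```

Keeping missing fields as `None` rather than empty strings makes it easier for downstream rules to tell "not found" apart from "found but blank".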
ADA Dental Claim Form

American Dental Association standard claim form for dental services.

UB-04

Institutional healthcare claim form used by hospitals and inpatient facilities.

CMS-1500

Standard professional healthcare claim form used by physicians and outpatient facilities.

Supported claims form types

Pre-built configurations cover CMS-1500 and UB-04. New claims form types can be configured in hours. Hybrid extraction handles both the structured box fields and any free-text clinical sections.

Healthcare

CMS-1500 (professional), UB-04/CMS-1450 (institutional), ADA Dental Claim Form, pharmacy claims (NCPDP)

Insurance

ACORD claims forms, state-specific first report of injury, workers comp forms, auto injury claims


Common Questions

Answers about CMS-1500 and UB-04 extraction, code validation, and form type detection.

What service line details does Sensible extract?

Sensible captures date of service, place of service, CPT/HCPCS code, modifiers, diagnosis pointers, charges, units, and rendering provider NPI for each service line.
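
One check that service line data makes possible is reconciling line charges against the claim total. The Python sketch below is an illustration under assumptions: `ServiceLine` and `lines_match_total` are hypothetical names, and it assumes each line's `charges` is already the full charge for that line, so units are not multiplied in.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class ServiceLine:
    date_of_service: str
    place_of_service: str
    procedure_code: str          # CPT/HCPCS
    modifiers: List[str]
    diagnosis_pointers: str
    charges: Decimal             # assumed to be the full charge for the line
    units: int
    rendering_npi: str


def lines_match_total(lines: List[ServiceLine], total_charges: Decimal) -> bool:
    """Return True when the line charges add up to the claim's total charges."""
    line_sum = sum((line.charges for line in lines), Decimal("0"))
    return line_sum == total_charges
```

A mismatch does not say which value is wrong, but it is a cheap signal that a charge or the total was misread.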

What diagnosis information does Sensible extract from claim forms?

Sensible extracts ICD-10 diagnosis codes, diagnosis pointers, date of onset, and related cause indicators. For CMS-1500, all 12 diagnosis code fields are captured.

Does Sensible capture provider information from claim forms?

Yes. Sensible extracts billing provider name, NPI, tax ID, address, rendering provider, referring provider, and facility information from both CMS-1500 and UB-04 forms.

Does Sensible support both CMS-1500 and UB-04 claim forms?

Yes. Sensible has pre-built configurations for CMS-1500 (professional claims) and UB-04 (institutional claims). Each form's unique field layout is mapped precisely.

Do you support webhooks?

Yes. Sensible sends extraction results to your webhook endpoint when processing completes. You can also poll the API for status.
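
For the polling option, a client typically retries with a growing delay until processing reaches a final state. The Python sketch below shows that pattern in a generic form. It makes no claims about Sensible's API: `fetch_status` is a hypothetical callable you would implement with your HTTP client, and the status strings in `TERMINAL_STATUSES` are placeholders to replace with the values the API actually returns.

```python
import time
from typing import Any, Callable, Dict

# Placeholder status names; replace with the values your API returns.
TERMINAL_STATUSES = {"COMPLETE", "FAILED"}


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    max_attempts: int = 8,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status with exponential backoff until a terminal status appears."""
    delay = initial_delay
    for attempt in range(max_attempts):
        result = fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")
```

Webhooks avoid this loop entirely, so polling is usually a fallback for environments that cannot expose a public endpoint.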

Does Sensible support human review?

Yes. Sensible flags extractions with low confidence for human review. You can configure review thresholds and workflows.
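
To show how a review threshold can work on the consuming side, here is a small Python sketch that picks out fields for a human reviewer. Everything in it is hypothetical: it assumes each field carries a numeric `confidence` between 0 and 1, which may not match the confidence signals your extraction output provides, and `fields_needing_review` is an illustrative name.

```python
from typing import Any, Dict, List


def fields_needing_review(
    fields: Dict[str, Dict[str, Any]], threshold: float = 0.9
) -> List[str]:
    """Return IDs of fields that are empty or below the confidence threshold."""
    flagged = []
    for field_id, field in fields.items():
        missing = field.get("value") is None
        # Treat an absent confidence score as zero so it is always reviewed.
        low_confidence = field.get("confidence", 0.0) < threshold
        if missing or low_confidence:
            flagged.append(field_id)
    return sorted(flagged)
```

A claim with an empty result list can go straight to adjudication, while any flagged field routes the document to a review queue.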

What security certifications does Sensible have?

Sensible is SOC 2 Type II certified and HIPAA compliant. Data is encrypted in transit and at rest.

How long is document data retained?

Document data is stored indefinitely by default. Custom retention policies are available and can be configured for same-day deletion if needed.

Is there a free trial?

Yes. Sensible offers a 14-day free trial on the Growth plan. No credit card required to start.

How is pricing structured?

Sensible uses per-document pricing for predictable costs. No token-based billing or usage surprises. Volume discounts are available for higher throughput.

How do I integrate with Sensible?

Sensible provides REST APIs and SDKs for Python and Node.js. Most integrations take a few hours. Webhooks, Zapier, and direct API calls are all supported.

What file formats does Sensible support?

Sensible processes PDFs (native or scanned), Microsoft Word (DOC, DOCX), spreadsheets (XLSX, XLS, CSV), single-page images (JPEG, PNG), multi-page images (TIFF), and email bodies with attachments.

How accurate is the extraction?

Accuracy depends on document quality and configuration. Most production deployments achieve 95%+ accuracy with proper validation rules and confidence signals.

How fast is document processing?

Processing speed depends on document size, page count, OCR requirements, and which extraction methods are used. Simple single-page documents process in seconds. Larger or more complex documents that use LLM-based extraction take longer.