PDFs are no longer where data goes to die

Updated on October 10, 2023 (5 min read)

We believe documents should be as accessible to software as they are to people, and that's why we started Sensible.

The challenge in making documents accessible to software is that documents are really just containers for diverse data. A single document might contain tables, paragraphs, boxes, labels, and images. As a result, the best practice for extracting structured data from documents is to use an ensemble of methods.
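To make the idea of an ensemble concrete, here is a minimal Python sketch. It is not Sensible's API or query language: the sample lines, the field names, and the helper functions (`same_line_label`, `next_line_label`, `pattern`, `first_match`) are all hypothetical, invented for illustration. It assumes the document has already been reduced to lines of text by OCR or a PDF text layer, and it shows several simple methods being tried in turn for each field.

```python
import re
from typing import Callable, Optional

# Hypothetical document text, as it might come back from OCR or a PDF text layer.
LINES = [
    "ACME Insurance Co.",
    "Policy Number: AB-123456",
    "Effective Date",
    "01/15/2023",
    "Total premium due $1,250.00",
]

# An extractor takes the document's lines and returns a value, or None if it finds nothing.
Extractor = Callable[[list[str]], Optional[str]]


def same_line_label(label: str) -> Extractor:
    """Match 'Label: value' where the value sits on the same line as the label."""
    def extract(lines: list[str]) -> Optional[str]:
        prefix = label.lower() + ":"
        for line in lines:
            if line.lower().startswith(prefix):
                return line.split(":", 1)[1].strip() or None
        return None
    return extract


def next_line_label(label: str) -> Extractor:
    """Match a label that stands alone on a line, with the value on the next line."""
    def extract(lines: list[str]) -> Optional[str]:
        for i, line in enumerate(lines[:-1]):
            if line.strip().lower() == label.lower():
                return lines[i + 1].strip() or None
        return None
    return extract


def pattern(regex: str) -> Extractor:
    """Return the first capture group of the first line that matches the regex."""
    compiled = re.compile(regex)
    def extract(lines: list[str]) -> Optional[str]:
        for line in lines:
            match = compiled.search(line)
            if match:
                return match.group(1)
        return None
    return extract


def first_match(*extractors: Extractor) -> Extractor:
    """Combine methods: try each in order and keep the first non-empty answer."""
    def extract(lines: list[str]) -> Optional[str]:
        for extractor in extractors:
            value = extractor(lines)
            if value is not None:
                return value
        return None
    return extract


# Each field gets the combination of methods that suits how it appears on the page.
FIELDS = {
    "policy_number": first_match(
        same_line_label("Policy Number"), pattern(r"\b([A-Z]{2}-\d{6})\b")
    ),
    "effective_date": first_match(
        same_line_label("Effective Date"), next_line_label("Effective Date")
    ),
    "premium": pattern(r"\$([\d,]+\.\d{2})"),
}


def extract_fields(lines: list[str]) -> dict[str, Optional[str]]:
    return {name: extractor(lines) for name, extractor in FIELDS.items()}


result = extract_fields(LINES)
print(result)
```

Even in this toy version, no single method covers all three fields, which is the reason real documents tend to need a combination of approaches.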

For software developers, this creates a tremendous amount of work even when building on top of existing OCR APIs or text extracted directly from PDFs. Many companies have spent months of engineering effort to integrate external documents into their workflows.

Operations leaders too often need people power to unlock the data stored in documents. We know the pain of watching headcount grow linearly with servicing volume, whether for customer onboarding, data entry, compliance, or discovery.

In both cases, this effort is not part of the core value that the company is creating, but rather a technical and operational hurdle the company must clear to create value elsewhere.

With Sensible, developers can turn PDFs and document images into structured data in a single afternoon. In turn, this allows operations leaders to focus on high-skill, high-ROI work rather than routine document management.

We accomplish this by giving developers a wide range of data extraction primitives in a powerful, configurable domain-specific language. These primitives put machine learning and natural language processing techniques at developers' fingertips, offering transparency and control while staying far more concise and less brittle than rule-based methods.

More broadly, Sensible is creating a service for developers to quickly map unstructured and imperfectly structured data to schemas. Documents are the most impactful initial application of this technology, but the same core challenge is present when working with audio, websites, third-party APIs, and other sources of data over which the developer does not have full control.
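As a rough illustration of what mapping loosely structured data to a schema can involve, the Python sketch below takes raw strings, such as an extraction step might produce, and coerces them into a typed record. The `Policy` schema, its field names, the `to_policy` function, and the assumed US-style date format are hypothetical choices made for this example. They do not describe Sensible's implementation.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Optional


@dataclass
class Policy:
    """Hypothetical target schema: the shape downstream software expects."""
    policy_number: str
    effective_date: date
    premium: Optional[Decimal]


def to_policy(raw: dict) -> Policy:
    """Coerce raw extracted strings into the schema, rejecting records that cannot fit."""
    number = (raw.get("policy_number") or "").strip()
    if not number:
        raise ValueError("policy_number is required")

    date_text = (raw.get("effective_date") or "").strip()
    if not date_text:
        raise ValueError("effective_date is required")
    # Assumes MM/DD/YYYY dates; strptime raises ValueError on anything else.
    effective = datetime.strptime(date_text, "%m/%d/%Y").date()

    # The premium is optional; strip currency formatting before parsing.
    premium_text = (raw.get("premium") or "").strip()
    premium = Decimal(premium_text.lstrip("$").replace(",", "")) if premium_text else None

    return Policy(policy_number=number, effective_date=effective, premium=premium)


policy = to_policy(
    {"policy_number": " AB-123456 ", "effective_date": "01/15/2023", "premium": "$1,250.00"}
)
print(policy)
```

The same pattern applies whether the raw values come from a PDF, a web page, or a third-party API: the schema stays fixed while the messy input is normalized or rejected at the boundary.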

Sensible is the only service you need to connect documents to software, and our API is live in private beta. We're working with companies in the logistics, insurance, real estate, and legal domains to make the documents they handle every day accessible to software. Reach out to us at hello@sensible.so if you'd like to do the same.


Ming Lu
Co-Founder, Sensible
Ming started her software career at Intercom, where she worked in a variety of roles across analytics, engineering, and product on their Growth and Data teams. Before Sensible, Ming was the Head of Product at Lattice as the company grew from 0 to 20MM+ ARR.

Turn documents into structured data

Stop relying on manual data entry. With Sensible, you claim back valuable time, your ops team will thank you, and you can deliver a superior user experience. It’s a win-win.