How two companies tackled utility and telecom bill processing with contrasting approaches

Updated on January 27, 2026

Author: Frances Elliott

For businesses managing vendor relationships and operational costs, utility and telecom bills are critical data sources. Whether it's verifying that charges align with contracts or building analytics dashboards for clients, extracting structured data from these documents at scale is essential.

Two companies recently partnered with Sensible to automate their bill processing workflows. Both handle highly structured, complex bill documents. Both chose deterministic, layout-based extraction over LLM approaches. But their business needs are a study in contrast. One processes thousands of energy and telecom bills across 2,000+ carriers, prioritizing schema precision for auditing. The other processes telecom documents that can stretch to tens of thousands of pages, prioritizing granular data extraction for analytics. Their stories illustrate how the same document type can demand vastly different solutions.

Company A: Expense management at carrier scale

Company A specializes in telecom, utility, and technology vendor management. Their core service involves helping businesses audit charges against contracts, automate payment allocations, and ensure they're not overpaying. Every month, they process approximately 20,000 bills from an enormous variety of sources.

The challenge: 2,000+ carriers and a brittle system

Company A had built an internal solution in C++ to parse bills. It worked…until it didn't. The system was brittle. When something broke, it was "all hands on deck" for days to fix it. Building a new template for a new carrier format could take up to two weeks. With a backlog of formats to onboard and over 2,000 carriers to support, the maintenance burden had become untenable.

"It's one thing if there's two templates causing issues," their team explained. "It's another thing if there's a hundred."

They'd talked to virtually every vendor in the market. What they needed was a solution that could handle massive format variation while delivering data in a precise, predictable schema—and a partner who could share the implementation workload.

The solution: Managed services with complex conditional logic

Company A opted for Sensible's managed services model, establishing a true 50/50 partnership with weekly check-ins. The Sensible team builds and maintains extraction configurations while Company A's team provides detailed specifications for each new format.

The implementation required sophisticated conditional logic to handle the complexity of their schema requirements. For energy bills, the output schema changes depending on whether there's a meter with readings, a single meter, or multiple meters. The team uses conditional logic throughout: if there's a meter, extract data one way; if there's no meter, handle it differently.
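
To make the branching concrete, here is a minimal Python sketch of choosing an output shape by meter count. It is illustrative only: the `Meter` class, the `build_energy_output` function, and field names such as `bill_type` are hypothetical, and the sketch does not reflect Company A's actual schema or Sensible's configuration syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Meter:
    meter_id: str
    readings: list = field(default_factory=list)


def build_energy_output(account, total, meters):
    """Shape the output record differently depending on how many meters were found."""
    base = {"account": account, "total": total}
    if not meters:
        # No meter on the bill: emit a record with no meter fields at all.
        return {**base, "bill_type": "unmetered"}
    if len(meters) == 1:
        # Single meter: flatten its fields into the top-level record.
        meter = meters[0]
        return {
            **base,
            "bill_type": "single_meter",
            "meter_id": meter.meter_id,
            "readings": meter.readings,
        }
    # Multiple meters: nest one entry per meter.
    return {
        **base,
        "bill_type": "multi_meter",
        "meters": [{"meter_id": m.meter_id, "readings": m.readings} for m in meters],
    }
```

The point of the sketch is that downstream systems can rely on `bill_type` to know which fields to expect, rather than probing for optional keys.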

One particularly interesting challenge involved spatial discontinuity. Charges appear in repeating sections, and those sections sometimes sit in a completely different location on the page from the meters they belong to. Using transformation logic, the Sensible team built configurations that correctly associate charges with their parent meters even when the two are not spatially nested in the document.
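
One way to picture this kind of association is a post-extraction join. The Python sketch below assumes, hypothetically, that each extracted charge carries a `meter_id` that also appears on its meter; the article does not say how the real configurations link the two, so treat `attach_charges` and its field names as illustration rather than the actual transformation logic.

```python
from collections import defaultdict


def attach_charges(meters, charges):
    """Nest charges under their parent meter using a shared meter ID."""
    by_meter = defaultdict(list)
    for charge in charges:
        by_meter[charge["meter_id"]].append(charge)

    nested = []
    for meter in meters:
        # pop() so that whatever remains afterwards is known to be unmatched.
        own = by_meter.pop(meter["meter_id"], [])
        nested.append({
            **meter,
            # Drop the join key from each charge once it is nested.
            "charges": [{k: v for k, v in c.items() if k != "meter_id"} for c in own],
        })

    # Charges that reference a meter we never extracted are returned for review.
    orphans = [c for group in by_meter.values() for c in group]
    return nested, orphans
```

Returning the unmatched charges separately, instead of silently dropping them, keeps the result auditable when a page layout changes.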

Why layout-based extraction rather than LLM-based extraction? The complexity of the sections and the strictness of the schema requirements made it the clear choice. As their implementation lead put it: "LLMs aren’t great at extracting long lists of repeating data. It's quicker and more reliable to do it with layout."

The results: From weeks to minutes

Within three months of launching their first templates, Company A's operations team reported feeling the lift. Templates that once took two weeks to build now come online far faster. Bills that required manual intervention now process automatically. Most importantly, the data arrives in exactly the schema their downstream systems expect.

Clarifying schema requirements was a key part of the partnership model. Understanding what data to pull, how to pull it, and what edge cases to handle took the longest to get right, because the mapping between Company A's required output schema and the document content was rarely obvious. But once that foundation was established, the team hit a sustainable rhythm. Company A can now scale their extraction capacity without the engineering burden that strained their internal system.

Company B: Analytics at document scale

Company B is a technology consultancy that offers, among other services, telecom expense management analytics. They parse bills and build dashboards that help their clients understand spending patterns, such as where costs are increasing, which accounts are driving charges, and how usage breaks down across hundreds of phone lines.

The challenge: Documents measured in tens of thousands of pages

Company B's challenge wasn't format variety; it was sheer document size. A single telecom bill from a large enterprise client might contain 20,000 pages documenting hundreds of phone numbers, each with its own call logs, invoices, and international usage details.

Their internal solution couldn't handle documents at this scale. They evaluated multiple vendors, but none could solve the large document problem. They needed a partner willing to engineer custom solutions and commit to a long-term relationship. Switching document processing infrastructure is painful, and they wanted to make this decision once.

The solution: Custom engineering for massive documents

Company B's implementation required custom engineering from the Sensible team. The approach: intelligent document splitting. Rather than processing a 30,000-page document as a single unit, the system identifies logical split points—typically section headers—and processes chunks in parallel before unifying the results.

This divide-and-conquer strategy works because telecom bills, despite their length, have consistent internal structure. Headers indicate new sections; within each section, the data follows predictable patterns. If a header changes unexpectedly, the Sensible system flags it, but that's rare.
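
The split, process, and unify pattern can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the `== Name ==` header pattern, the `KNOWN_SECTIONS` set, and the stand-in `process_chunk` function are hypothetical and are not Sensible's implementation.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical header pattern and section names; real bills need carrier-specific rules.
HEADER = re.compile(r"^== (?P<name>.+) ==$")
KNOWN_SECTIONS = {"Summary", "Call Log", "International Usage"}


def split_at_headers(pages):
    """Start a new chunk at each page whose first line is a section header."""
    chunks = []
    for page in pages:
        first_line = page.split("\n", 1)[0]
        match = HEADER.match(first_line)
        if match or not chunks:
            chunks.append({"section": match.group("name") if match else None, "pages": []})
        chunks[-1]["pages"].append(page)
    return chunks


def process_chunk(chunk):
    """Stand-in for real extraction: count the non-header lines in a chunk."""
    lines = [
        line
        for page in chunk["pages"]
        for line in page.split("\n")
        if line and not HEADER.match(line)
    ]
    return {
        "section": chunk["section"],
        "rows": len(lines),
        # Flag sections whose header is not one we expect.
        "flagged": chunk["section"] not in KNOWN_SECTIONS,
    }


def process_document(pages, workers=4):
    """Split the document, process chunks concurrently, and return results in order."""
    chunks = split_at_headers(pages)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order, so results follow document order.
        return list(pool.map(process_chunk, chunks))
```

A thread pool keeps the sketch short. For CPU-bound parsing, a process pool or a job queue would likely be a better fit.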

Another key feature: layout linearization. Some bill pages use a "snake" layout where content flows in multiple columns. The Sensible team built preprocessing that converts these complex layouts into single-column formats, ensuring sections break correctly and data associates with the right account.
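
As a rough illustration of linearization, the Python sketch below reorders positioned text lines so that each column is read top to bottom before moving to the next. It assumes equal-width columns and a known column count, which is a simplification; the `Line` class and `linearize` function are hypothetical names, and real preprocessing would need to detect column boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    text: str
    x: float  # left edge, in page units
    y: float  # top edge; smaller values are higher on the page


def linearize(lines, page_width, columns=2):
    """Reorder lines so each column is read top to bottom, left column first."""
    column_width = page_width / columns

    def column_of(line):
        # Clamp so a line starting at the right edge stays in the last column.
        return min(int(line.x // column_width), columns - 1)

    return sorted(lines, key=lambda line: (column_of(line), line.y))
```

Once lines are in single-column order, a section header in the second column no longer appears to interrupt content from the first.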

The extraction requirements themselves are straightforward but extensive. Company B wants everything: every phone number, every call log, every charge detail. Where Company A focuses on high-level auditing (do the line items add up to the total?), Company B needs granular data to power analytics dashboards that answer questions like "How much did international calls increase this quarter?"
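
The auditing question in parentheses above (do the line items add up to the total?) reduces to a small check. The sketch below is a generic example, not Company A's audit logic: `line_items_match_total` is a hypothetical helper, and it assumes amounts arrive as decimal strings so that `Decimal` can avoid floating-point rounding.

```python
from decimal import Decimal


def line_items_match_total(line_items, stated_total, tolerance=Decimal("0.00")):
    """Return (ok, difference) comparing summed line items to the stated total."""
    computed = sum((Decimal(amount) for amount in line_items), Decimal("0"))
    difference = computed - Decimal(stated_total)
    return abs(difference) <= tolerance, difference
```

Returning the difference alongside the pass/fail flag lets a reviewer see how far off a bill is without recomputing the sum.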

The results: A scalable foundation for growth

Company B now processes documents that would have been impossible with their previous system. The engineering investment paid off: once the splitting and linearization infrastructure was in place, adding new carrier formats became relatively simple. The document structure is consistent; it's just a matter of building configurations for each carrier's specific layout.

The relationship operates through biweekly meetings and dedicated Slack channels, with a Sensible engineer in direct contact with Company B's engineering lead. It's the kind of deep technical partnership that made Company B comfortable going all in on this decision.

Two paths to the same destination

These implementations reveal how the same document category—utility and telecom bills—can demand fundamentally different approaches:

Company A optimized for format breadth and schema precision. With 2,000+ carriers and strict downstream requirements, they needed configurations that handle massive variation while delivering predictable, auditable output. Managed services let them scale without rebuilding their engineering team.

Company B optimized for document depth and data granularity. With documents stretching to tens of thousands of pages, they needed custom engineering to make extraction physically possible. Once that infrastructure existed, the extraction itself was relatively straightforward.

Both chose deterministic, layout-based extraction over LLM approaches—but for different reasons:

Company A: Schema requirements were too strict and sections too complex for LLMs to reliably interpret. The precision of layout-based methods guaranteed the exact output format their systems required.

Company B: Documents were too long and too dense. Telecom bills pack enormous amounts of small text into compact layouts. Layout-based extraction handles this complexity more reliably than LLMs, which can struggle with visual density.

Key takeaways for bill processing automation

If you're evaluating solutions for utility or telecom bill extraction:

Assess your primary constraint: Is it format variety (hundreds of carriers) or document scale (thousands of pages)? The answer shapes your entire implementation strategy.
Consider schema complexity early: If downstream systems require precise, conditional schemas—meters with readings handled differently than meters without—build that logic into your extraction design from the start.
Plan for partnership: Both companies emphasized that interpretation took the longest. Understanding what to extract, how to handle edge cases, and what the output should look like requires close collaboration between your team and your extraction partner.

Evaluate LLM vs. deterministic honestly: LLMs work well for short documents with clear field labels. Dense, complex bills with strict schema requirements often benefit from deterministic approaches that guarantee consistent output.

Book a demo to discuss your bill processing requirements, or explore our managed services to see how we can handle template creation and maintenance for you.
