Utility Bill OCR for AP, ESG and every team that needs the data.

Extract consumption, reading types, time-of-use tariffs, full charge breakdowns and multi-supply records, typed and structured. Any provider, any format, in under 10 seconds.

Start for free now Try it now

500 free pages included

Sample US utility bill from Liberty Power, fictional data

{
  "document_type": "utility_bill",
  "extracted_at": "2023-10-12T18:45:00Z",
  "provider": "Liberty Power",
  "service_type": "residential_electricity",
  "provider_address": "1200 Energy Way, Dallas, TX 75201",
  "customer": {
    "name": "Maria S. Ramirez",
    "service_address": "456 Oak Ave, Austin, TX 78704"
  },
  "account_number": "1234 5678 9012",
  "statement_date": "2023-10-12",
  "due_date": "2023-11-05",
  "period": {
    "from": "2023-09-10",
    "to": "2023-10-10",
    "days": 30
  },
  "consumption_kwh": 927,
  "billing_summary": {
    "previous_balance": 110.20,
    "payments_received": -110.20,
    "current_charges": 145.67,
    "total_amount_due": 145.67
  },
  "currency": "USD",
  "co2eq_kg": 180.4
}
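As a rough illustration of consuming output shaped like the sample above, the sketch below parses a subset of it and runs two sanity checks: that the balance lines add up to the amount due, and that the stated number of days matches the period dates. The function names are hypothetical and this is not an official client; it only assumes the field names shown in the sample.

```python
import json
from datetime import date
from decimal import Decimal

# Subset of the sample output above, embedded as text for the example.
SAMPLE = """
{
  "period": {"from": "2023-09-10", "to": "2023-10-10", "days": 30},
  "consumption_kwh": 927,
  "billing_summary": {
    "previous_balance": 110.20,
    "payments_received": -110.20,
    "current_charges": 145.67,
    "total_amount_due": 145.67
  },
  "currency": "USD"
}
"""


def load_bill(text):
    """Parse the JSON, reading decimals as Decimal so money is not rounded."""
    return json.loads(text, parse_float=Decimal)


def balance_is_consistent(bill):
    """Previous balance + payments + current charges should equal the total due."""
    s = bill["billing_summary"]
    expected = s["previous_balance"] + s["payments_received"] + s["current_charges"]
    return expected == s["total_amount_due"]


def period_days_match(bill):
    """The stated day count should equal the gap between the period dates."""
    p = bill["period"]
    span = date.fromisoformat(p["to"]) - date.fromisoformat(p["from"])
    return span.days == p["days"]


bill = load_bill(SAMPLE)
print(balance_is_consistent(bill), period_days_match(bill))
```

Reading amounts as `Decimal` is a design choice for the example: binary floats can make an exact equality check on currency values fail for reasons unrelated to the bill.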


If your document has data, we can extract it.

Whatever team you're on, we already extract the fields you need from the same utility bill. Read the technical deep dive →

Greenfield Energy Business electricity supply

Invoice no. UB-2026-05-1873

Customer ACME Ltd. Company No. 12345678

Supply address 123 Liverpool Street London EC2M 7PY · United Kingdom

Billing period 01 May → 31 May 2026

Issue date 02 Jun 2026

Consumption detail

Concept	Period	Quantity	Unit price	Amount
Energy consumption	01–31 May 2026	412 kWh	£0.1512 / kWh	£62.30
Standing charge	31 days	31 days	£0.5968 / day	£18.50
Climate Change Levy	01–31 May 2026	412 kWh	£0.0109 / kWh	£4.50
VAT (20%)	—	—	—	£17.05

Last 12 months (kWh) Reading: estimated

Total due £102.35

MPAN 1900-0001-1234-5678

Tariff Business Fixed 12m

Carbon footprint 53.6 kg CO₂e

Due date 25 Jun 2026

Identity platforms and fintechs verify name, address, issue date and provider. The real challenge is coverage across the long tail of small municipal and co-op providers that generic OCR misses.
Installers size systems from kWh consumption history. The extraction has to flag estimated readings explicitly, or proposals get built on non-representative figures.
Energy teams ingest electricity, gas and water bills across portfolios. A typical 200-site × 15-utility footprint produces thousands of documents per month, in every layout and format imaginable.
Scope 2 reporting needs measured consumption, not estimated. Extraction must surface the reading type per period and supply point, or the carbon report is unreliable.
Property managers process bills across hundreds of units. Different account formats, billing cycles and meter conventions per utility make manual reconciliation unviable.
Finance teams post to ERPs and audit charges against contracted tariff terms, which requires the full hierarchical breakdown, not just the total amount due.
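The point about estimated readings can be sketched in a few lines. The example below separates measured periods from everything else before totalling consumption, using the four `reading_type` values named in the FAQ on this page. The record layout and the March and April figures are made up for illustration; only the 412 kWh estimated reading for May comes from the demo bill.

```python
from enum import Enum


class ReadingType(str, Enum):
    MEASURED = "measured"
    ESTIMATED = "estimated"
    SELF_REPORTED = "self_reported"
    CALCULATED = "calculated"


def split_by_reliability(periods):
    """Return (measured, flagged): only measured readings go in the first list."""
    measured, flagged = [], []
    for p in periods:
        if ReadingType(p["reading_type"]) is ReadingType.MEASURED:
            measured.append(p)
        else:
            flagged.append(p)
    return measured, flagged


# Hypothetical per-period records; March and April values are invented.
periods = [
    {"period": "2026-03", "consumption_kwh": 398, "reading_type": "measured"},
    {"period": "2026-04", "consumption_kwh": 405, "reading_type": "measured"},
    {"period": "2026-05", "consumption_kwh": 412, "reading_type": "estimated"},
]

measured, flagged = split_by_reliability(periods)
measured_kwh = sum(p["consumption_kwh"] for p in measured)
print(measured_kwh, [p["period"] for p in flagged])
```

A Scope 2 report or a solar proposal built this way can state how much of the total rests on measured data and which periods still need a real reading.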


Start processing yours now

Invofox vs. generic OCR and LLM pipelines.

Invoice parsers, form extractors, and LLM prompts produce partial output on utility documents. Here's what changes when the pipeline is built for them. Read the full breakdown →

Purpose-built

Invofox

Trained on utility bills end to end.

Utility-specific model Yes
Reading types (measured / estimated)
Multi-supply PDF splitting Built-in
Custom schema fields Config only
Multi-supply JSON structure Supply points + charges nested
Time-of-use tariff periods
Feedback loop Automatic
Roadmap stability Stable
Maintenance burden Low

Generic

Generic OCR & LLMs

Invoice parsers, form extractors, or a prompt sent to an LLM.

Utility-specific model No, generic invoice/form layer
Reading types (measured / estimated)
Multi-supply PDF splitting Custom build / extra processor
Custom schema fields Retraining or app code
Multi-supply JSON structure Flat JSON per page
Time-of-use tariff periods
Feedback loop Manual retraining
Roadmap stability Mixed, some processors retiring
Maintenance burden Medium to high

Hierarchical, not flat. Built for the bill's real structure.

Utility bills aren't flat. One PDF can carry multiple supply points, each with its own period, consumption, reading type and charge breakdown. Our schema mirrors that.

utility_bill_schema.json

Document 4 fields
1. invoice_number UB-2026-05-1873 string
2. provider Iberdrola string
3. issue_date 2026-06-02 date
4. due_date 2026-06-25 date
supply_points [ ] 6 fields
1. cups ES00220001… string
2. service_address Av. Diagonal 1… string
3. period 2026-05-01 → 31 object
4. consumption_kwh 412 number
5. reading_type "estimated" enum
6. tariff 2.0TD string
charges_breakdown [ ] 5 fields
1. concept "energy" enum
2. qty 412 kWh number
3. unit_price 0.1268 €/kWh number
4. amount 52.30 number
5. tariff_band "P1 peak" enum
Payment & ESG 5 fields
1. amount_due 87.45 number
2. currency "EUR" string
3. iban_last4 "1234" string
4. payment_method "direct_debit" enum
5. co2eq_kg 53.6 number
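One way to picture the hierarchy listed above is as nested typed records. The sketch below uses Python dataclasses with the field names from the schema view. It assumes that `charges_breakdown` nests under each supply point and that `period` uses `from`/`to` keys as in the first sample on this page; neither detail is confirmed by the schema view itself. The Payment & ESG group is left out for brevity, and truncated values are kept exactly as displayed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Charge:
    concept: str
    qty: float
    unit_price: float
    amount: float
    tariff_band: str


@dataclass
class SupplyPoint:
    cups: str
    service_address: str
    period: dict
    consumption_kwh: float
    reading_type: str
    tariff: str
    charges_breakdown: List[Charge] = field(default_factory=list)


@dataclass
class UtilityBill:
    invoice_number: str
    provider: str
    issue_date: str
    due_date: str
    supply_points: List[SupplyPoint] = field(default_factory=list)


def parse_bill(raw: dict) -> UtilityBill:
    """Build the nested records from a plain dict (hypothetical helper)."""
    points = []
    for sp in raw.get("supply_points", []):
        charges = [Charge(**c) for c in sp.get("charges_breakdown", [])]
        rest = {k: v for k, v in sp.items() if k != "charges_breakdown"}
        points.append(SupplyPoint(charges_breakdown=charges, **rest))
    header = {k: v for k, v in raw.items() if k != "supply_points"}
    return UtilityBill(supply_points=points, **header)


RAW = {
    "invoice_number": "UB-2026-05-1873",
    "provider": "Iberdrola",
    "issue_date": "2026-06-02",
    "due_date": "2026-06-25",
    "supply_points": [{
        "cups": "ES00220001…",             # truncated in the schema view
        "service_address": "Av. Diagonal 1…",  # truncated in the schema view
        "period": {"from": "2026-05-01", "to": "2026-05-31"},  # assumed keys
        "consumption_kwh": 412,
        "reading_type": "estimated",
        "tariff": "2.0TD",
        "charges_breakdown": [{
            "concept": "energy", "qty": 412, "unit_price": 0.1268,
            "amount": 52.30, "tariff_band": "P1 peak",
        }],
    }],
}

bill = parse_bill(RAW)
print(bill.supply_points[0].charges_breakdown[0].concept)
```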

From PDF to typed data, every step is utility-aware.

Every layer of the pipeline (splitting, classification, extraction, validation and review) is designed for what utility bills actually are: hierarchical, multi-supply, multi-tariff documents.

Multi-Supply Identification

A single bill can carry electricity and gas on the same page. Each supply point is identified and extracted as a separate record with its own period, consumption, meter reference and charge breakdown. Never collapsed into a single flat object.
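To show what "one record per supply" can look like downstream, here is a minimal sketch that turns a document with two supply points into two rows, each carrying the document-level fields alongside its own consumption. The `service_type` and `meter_reference` keys and the gas figures are hypothetical and chosen only for the example.

```python
def rows_per_supply(document):
    """Emit one row per supply point, copying the document-level fields into each."""
    header = {k: v for k, v in document.items() if k != "supply_points"}
    rows = []
    for supply in document["supply_points"]:
        row = dict(header)   # document-level fields
        row.update(supply)   # supply-specific fields stay separate per row
        rows.append(row)
    return rows


# Illustrative document: the gas supply and both meter references are invented.
document = {
    "invoice_number": "UB-2026-05-1873",
    "provider": "Greenfield Energy",
    "supply_points": [
        {"service_type": "electricity", "meter_reference": "E-001",
         "consumption_kwh": 412, "reading_type": "estimated"},
        {"service_type": "gas", "meter_reference": "G-001",
         "consumption_kwh": 1050, "reading_type": "measured"},
    ],
}

rows = rows_per_supply(document)
for r in rows:
    print(r["invoice_number"], r["service_type"], r["consumption_kwh"])
```

Because each row keeps its own period, reading type and consumption, electricity and gas figures are never summed or overwritten by accident.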

Utility-Aware Field Extraction

Consumption figures, tariff periods, supply-point identifiers and regulatory charge breakdowns are fields that don't exist in a standard invoice model. Generic extractors return amount and due date. This returns the data your pipeline actually needs.

Document Classification

Every document is classified before extraction: utility bill or not, and which type: electricity, gas, water or telecom. Mixed-document pipelines get a reliable gate. Anything that isn't a utility bill is flagged upfront, not silently misprocessed.

Schema & Arithmetic Validation

Every record is validated against your schema rules and the bill's own arithmetic before delivery. Line items are checked against the total. Low-confidence fields are flagged, not silently passed through to your ERP.
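The arithmetic check described here can be illustrated with the Greenfield Energy demo bill from earlier on this page: the four line amounts should add up to the stated total of £102.35. The sketch below also shows a simple confidence filter. The function names, the one-penny tolerance, the 0.80 threshold and the confidence scores are all assumptions for the example, not the product's actual rules.

```python
from decimal import Decimal


def check_total(line_amounts, stated_total, tolerance=Decimal("0.01")):
    """Return (ok, computed): ok is True when the lines sum to the stated total."""
    computed = sum(line_amounts, Decimal("0"))
    return abs(computed - stated_total) <= tolerance, computed


def low_confidence_fields(confidences, threshold=0.80):
    """List the field names whose confidence score falls below the threshold."""
    return sorted(name for name, score in confidences.items() if score < threshold)


# Line amounts from the demo bill: energy, standing charge, levy, VAT.
lines = [Decimal("62.30"), Decimal("18.50"), Decimal("4.50"), Decimal("17.05")]
ok, computed = check_total(lines, Decimal("102.35"))

# Hypothetical confidence scores for three extracted fields.
flagged = low_confidence_fields(
    {"amount_due": 0.99, "reading_type": 0.62, "due_date": 0.97}
)
print(ok, computed, flagged)
```

A record that fails either check would be routed to review rather than posted, which is the behaviour the paragraph above describes.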

Automatic Feedback Loop

Operator corrections feed directly back into the model. Accuracy on new providers and layouts improves automatically, with no manual retraining and no engineering work. Most customers reach their target threshold within days of going live.

Frequently asked questions.

~/invofox / faq.json

// questions 11

{
  "question": "Will it work with my regional or municipal utility provider?",
  "answer": "Yes. The model is trained on genuine layout diversity across dozens of issuers, including small municipal providers and rural co-ops. Accuracy on a brand-new issuer starts meaningfully higher than a model trained on a curated subset, and any issuer-specific gaps close within days through the automatic feedback loop."
}

Coverage longtail.json
{
  "question": "Can you identify multiple supplies on the same document?",
  "answer": "Yes. When a single bill contains several supply points (electricity and gas together, multiple meters on the same page, or a consolidated portfolio bill) we detect each supply automatically and produce one structured record per supply with its own period, consumption, meter reference and charges. The output is never a flat object that conflates data across meters or fuel types."
}

Multi-supply multisupply.json
{
  "question": "How do you distinguish estimated vs measured readings?",
  "answer": "Every consumption figure is extracted with its `reading_type` field: `measured`, `estimated`, `self_reported` or `calculated`. This distinction is critical: an estimated reading is not equivalent to a measured one for Scope 2 carbon reporting, solar system sizing or bill audits. Generic OCR returns the number without its reliability context, which makes downstream data unsafe."
}

Reading type reading.json
{
  "question": "Does it support time-of-use (TOU) tariffs and peak/off-peak periods?",
  "answer": "Yes. TOU electricity contracts split consumption into 2, 3 or more tariff bands (peak, off-peak, shoulder), each with a different unit rate and a separately measured consumption figure. Each band is extracted as a structured row, not collapsed into a total. Gas tiered rates and water excess-use surcharges are handled with the same structure."
}

Tariffs tou.json
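A small sketch of why per-band rows matter: with each band kept as its own row, both the total consumption and the cost per band can be recomputed and audited. The band names follow the answer above; the quantities and unit prices are invented for the example, and the key names are assumptions.

```python
from decimal import Decimal

# Illustrative TOU rows: quantities and prices are made up.
bands = [
    {"tariff_band": "peak", "qty_kwh": Decimal("150"), "unit_price": Decimal("0.20")},
    {"tariff_band": "shoulder", "qty_kwh": Decimal("100"), "unit_price": Decimal("0.15")},
    {"tariff_band": "off_peak", "qty_kwh": Decimal("162"), "unit_price": Decimal("0.10")},
]


def summarize(rows):
    """Return total kWh, total energy cost, and the cost of each band."""
    per_band = {r["tariff_band"]: r["qty_kwh"] * r["unit_price"] for r in rows}
    total_kwh = sum((r["qty_kwh"] for r in rows), Decimal("0"))
    total_amount = sum(per_band.values(), Decimal("0"))
    return total_kwh, total_amount, per_band


total_kwh, total_amount, per_band = summarize(bands)
print(total_kwh, total_amount, per_band["peak"])
```

If the bands had been collapsed into a single 412 kWh figure, the peak share of the cost could not be recovered from the extracted data.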
{
  "question": "Can I configure my own schema fields?",
  "answer": "Yes. Fields, validation rules and confidence thresholds are fully configurable per customer. ESG reports, AP automation, audit and anomaly detection each need different fields, and you can set up the schema that matches your workflow without engineering on our side."
}

Schema schema.json
{
  "question": "How long until accuracy is production-ready for new issuers?",
  "answer": "Day 1 is strong on common fields (provider, dates, amount due, account number). For issuer-specific quirks, the automatic feedback loop closes the gap in days, not weeks: operator corrections on low-confidence fields feed back as training signal and accuracy on that issuer improves over the next batches."
}

Production ramp.json
{
  "question": "How does the API integration work?",
  "answer": "It is a REST API with webhook delivery for async results. A typical integration (document upload, webhook receiver, schema configuration) takes less than a day for an engineer familiar with the stack. The full reference and code samples are in the API documentation, and a sandbox environment is available for testing before going live."
}

API api.json
{
  "question": "How is data security handled?",
  "answer": "SOC 2 Type II, ISO 27001 and GDPR compliant. Documents are encrypted in transit (TLS 1.3) and at rest (AES-256). Zero-retention mode is available for regulated workflows: documents and extracted data are purged on delivery, and nothing persists on our infrastructure. Full compliance reports and certificates are available on the Trust Center."
}

Security security.json
{
  "question": "What languages and countries are supported?",
  "answer": "Multilingual and locale-aware out of the box. Date formats, currencies, tax IDs, decimal separators, account number formats and regulatory charge labels work across major regions (US, EU, UK, LATAM). New countries are typically supported with no model changes, only schema adjustments for local fields."
}

Languages lang.json
{
  "question": "What makes utility bill OCR different from invoice OCR?",
  "answer": "Utility bills look like invoices on the surface but carry domain-specific complexity: consumption measurements, reading types (measured/estimated/calculated), tariff periods (TOU bands), supply-point identifiers and regulatory charges. Dropping a utility bill into an invoice extractor produces partial output at best and silent errors at worst."
}

Concept different.json
{
  "question": "What's the pricing model?",
  "answer": "Usage-based and predictable. No per-template fees, no setup costs. Pricing scales linearly with volume and enterprise tiers are available for high-volume teams. The 500 free pages cover an initial test run. See the pricing page for the full breakdown."
}

Pricing pricing.json

Want the full technical deep dive? Read the blog post

Drop one utility bill. Get every field that matters.

Reading types, multi-supply, TOU tariffs and full charge breakdowns. Typed, structured and ready to flow downstream. 500 free pages to test on your own documents, no credit card required.