
Proof, not promises.

During a proof of concept, Invofox delivers a detailed performance report showing exactly how your documents perform — across accuracy, errors, vendors, processing times, precision and real-world edge cases.


How we turn accuracy into evidence.

From the first documents processed, accuracy is never a vague promise — it's a defined, measured outcome.

  1. Sample + ground truth

    You share a small representative sample of your documents with the corresponding ground truth for evaluation.

  2. Define success upfront

    We analyze the sample and schema, split it into parts, and set targets: accuracy thresholds, confidence levels and evaluation criteria.

  3. Continuous improvement loop

    We iterate to refine the pipeline, adjust models and fine-tune the solution for maximum accuracy.

  4. Process the full volume

    We run thousands of documents and pages — not cherry-picked examples — through the production pipeline.

  5. Deliver the performance report

    A clear, visual breakdown of results you can confidently share with engineering, ops and exec stakeholders.

A report built for real decisions.

Designed to be shared internally — with technical teams, operators and executives — without additional explanation. Every report includes:

  • Overall accuracy (field-level and document-level)
  • Processing time benchmarks (P95 / P99)
  • Error distribution and failure types
  • Missing or low-confidence fields
  • Confidence thresholds and warnings
  • Page-level and document-level performance
  • Splitter and classifier performance
  • Custom analysis relevant to your use case

Sample performance report (8,432 docs, 47 vendors, Q3 2025; methodology: ground-truth comparison, static snapshot):

  • Overall accuracy: 99.4% (+0.7pp vs Q2), field-level, ground-truth validated
  • Processing time: P50 8.0s, P95 10.5s, P99 12.0s
  • Error distribution by failure type: Missing info 38%, Ambiguous 27%, Low confidence 22%, Edge case 13%
  • Performance by granularity: Pages 99.2%, Documents 99.4%, Splits 98.7%, Classify 99.1%
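
For readers unfamiliar with percentile benchmarks such as P50, P95 and P99, here is a minimal sketch of how they could be derived from per-document processing times. It uses the nearest-rank method, which is an assumption; the page does not say which percentile definition Invofox uses. The function name and the sample timings are hypothetical.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical processing times in seconds (illustrative data only).
times = [7.2, 7.9, 8.0, 8.1, 8.4, 9.0, 9.6, 10.5, 11.0, 12.0]

summary = {f"P{p}": percentile(times, p) for p in (50, 95, 99)}
print(summary)
```

P95 and P99 describe the slow tail rather than the typical document, which is why they are reported alongside the median.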

All results are calculated using a defined evaluation methodology based on ground-truth comparisons — and shared transparently.
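
As an illustration of what a ground-truth comparison can look like, the sketch below computes field-level and document-level accuracy. It assumes exact string matching per field and a simple dictionary layout for documents; both are assumptions made for the example, not Invofox's documented methodology.

```python
def field_accuracy(predictions, ground_truth):
    """Share of ground-truth fields whose extracted value matches exactly."""
    correct = total = 0
    for doc_id, expected in ground_truth.items():
        extracted = predictions.get(doc_id, {})
        for field, value in expected.items():
            total += 1
            correct += extracted.get(field) == value
    return correct / total if total else 0.0

def document_accuracy(predictions, ground_truth):
    """Share of documents in which every ground-truth field matches."""
    if not ground_truth:
        return 0.0
    perfect = sum(
        all(predictions.get(doc_id, {}).get(f) == v for f, v in expected.items())
        for doc_id, expected in ground_truth.items()
    )
    return perfect / len(ground_truth)

# Hypothetical sample: two documents, two fields each.
ground_truth = {
    "doc-1": {"invoice_number": "INV-001", "total": "120.00"},
    "doc-2": {"invoice_number": "INV-002", "total": "75.50"},
}
predictions = {
    "doc-1": {"invoice_number": "INV-001", "total": "120.00"},
    "doc-2": {"invoice_number": "INV-002", "total": "75.00"},  # wrong total
}

print(field_accuracy(predictions, ground_truth))     # 3 of 4 fields match
print(document_accuracy(predictions, ground_truth))  # 1 of 2 documents fully correct
```

The two numbers diverge because a single wrong field fails the whole document, so document-level accuracy is never higher than field-level accuracy when every document has the same number of fields.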

Understand performance at the level that matters.

Instead of a single average number, Invofox breaks results down in ways that reflect real production complexity — so your team focuses optimization where it has the greatest impact.

By document type

Invoices, contracts, tax forms (W-9), bills of lading, mortgage and loan applications, credit notes, delivery notes…

  • Invoices: 99.6%
  • Contracts: 97.2%
  • W-9 forms: 99.9%
  • Bills of lading: 98.4%
  • Credit notes: 99.1%

By document source

Identify which layouts represent the most volume or variability — vendors for invoices, jurisdictions for standard forms, etc.

  • Vendor A · NL: 99.4%
  • Vendor B · US-CA: 98.1%
  • Vendor C · DE: 99.6%
  • Vendor D · ES: 97.8%
  • Vendor E · UK: 99.2%
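
Both breakdowns above amount to grouping evaluation results by an attribute and computing accuracy per group. A minimal sketch, assuming each evaluated field is stored as a row with hypothetical `doc_type`, `source` and `correct` keys:

```python
from collections import defaultdict

def accuracy_by(results, key):
    """Group per-field results by `key` and return accuracy per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for row in results:
        bucket = counts[row[key]]
        bucket[0] += row["correct"]  # True counts as 1, False as 0
        bucket[1] += 1
    return {group: correct / total for group, (correct, total) in counts.items()}

# Hypothetical evaluation rows (illustrative data only).
results = [
    {"doc_type": "invoice", "source": "Vendor A", "correct": True},
    {"doc_type": "invoice", "source": "Vendor B", "correct": False},
    {"doc_type": "contract", "source": "Vendor A", "correct": True},
    {"doc_type": "invoice", "source": "Vendor A", "correct": True},
]

print(accuracy_by(results, "doc_type"))
print(accuracy_by(results, "source"))
```

Using one function with a configurable key keeps the per-type and per-source views consistent, since both are computed from the same rows.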

Errors aren't hidden — they're explained.

In real production, a percentage of documents will always present issues: poor image quality, missing data, corrupted files, highly inconsistent layouts. Reaching 100% automation isn't feasible regardless of the model. In high-volume deployments roughly 5–10% of documents fall into this category due to data-quality constraints alone.

5–10% flagged for review in high-volume deployments
  • Missing information 38%

    Required fields not present in the source document.

  • Ambiguous layouts 27%

    Multiple plausible interpretations for the same value.

  • Low-confidence extractions 22%

    Confidence under threshold — flagged for review.

  • True edge cases 13%

    Genuinely unusual cases the model hasn't seen yet.

Each report tells you exactly what failed, why, how often — and where action will have the biggest impact: feedback, threshold tuning, additional data or pipeline adjustments.
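
To make the flagging and failure-type breakdown concrete, here is a small sketch of a confidence-threshold check and a failure distribution. The threshold value, the field names and the failure labels are hypothetical; the page does not specify how Invofox sets thresholds or classifies failures.

```python
from collections import Counter

REVIEW_THRESHOLD = 0.90  # hypothetical value, tuned per use case

def needs_review(extraction, threshold=REVIEW_THRESHOLD):
    """Flag an extraction when the value is missing or confidence is below threshold."""
    return extraction["value"] is None or extraction["confidence"] < threshold

def failure_distribution(failures):
    """Percentage share of each failure type, most common first."""
    counts = Counter(f["type"] for f in failures)
    total = sum(counts.values())
    return {kind: round(100 * n / total) for kind, n in counts.most_common()}

# Hypothetical flagged items (illustrative data only).
failures = [
    {"type": "missing_info"},
    {"type": "missing_info"},
    {"type": "ambiguous"},
    {"type": "low_confidence"},
]

print(needs_review({"value": "INV-001", "confidence": 0.97}))
print(needs_review({"value": "INV-001", "confidence": 0.50}))
print(failure_distribution(failures))
```

Raising the threshold sends more documents to review and lowers the risk of silent errors; lowering it does the opposite, which is why threshold tuning appears among the recommended actions.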

Frequently asked questions.

What's included in a performance report?

    Field-level and document-level accuracy, P95/P99 processing benchmarks, error distribution by failure type, missing or low-confidence fields, confidence thresholds, splitter/classifier performance and any custom analysis specific to your workflow.

Still have questions? Talk to us