Numbers
1.234,56 1,234.56 Compared within tolerance ranges. Trailing-zero, separator and currency-format mismatches don't break the match.
We evaluate, normalize and validate data extraction accuracy across millions of documents — using a transparent, ground-truth benchmarking process built for real-world evals.
Powering document extraction for teams at



Accurate benchmarking starts with an accurate baseline — what we call the ground truth. It defines the correct data for every field, so we can measure extraction accuracy objectively across your pipeline. When a customer shares their labeled data, we use it as the standard reference.
1{
2 "document_number": "INV-2024-1837",
3 "issued_at": "2024-08-14",
4 "tax_base": 1452.30,
5 "vat_rate": 0.21,
6 "total": 1757.28
7} 1{
2 "document_number": "INV-2024-1873",
3 "issued_at": "2024-08-14",
4 "tax_base": 1452.40,
5 "vat_rate": 0.21,
6 "total": 1757.28
7} Document data rarely looks identical, even when it's correct. Our evaluation logic adapts to each data type — so comparisons remain fair and consistent across the board.
1.234,56 1,234.56 Compared within tolerance ranges. Trailing-zero, separator and currency-format mismatches don't break the match.
14/08/24 2024-08-14 Normalized to ISO 8601. Time-zone differences and locale formats are reconciled automatically.
— false Account for missing or unchecked states. "—" is treated as false unless the schema forces otherwise.
[A, B, C] [C, A, B] Evaluated by content, not order — unless the order is business-critical for the use case.
INV-1873 INV-1873 Exact Acme Co. ACME CO Normalized C/. Mayor 1 Calle Mayor 1 Levenshtein 0.92 Exact, normalized and similarity-based (Levenshtein) matching depending on the field type.
Most IDP vendors only report field-level accuracy. Invofox goes further by measuring document-level accuracy too — because even a single wrong field can stop an automated workflow. We compute both: per-field for granular analytics, per-document for end-to-end reliability, plus custom validation rules per use case.
Every field must be correct for a document to count. The signal a downstream workflow can actually trust.
Field-level precision and recall across millions of extracted keys. Perfect for monitoring and dashboards.
Adding or removing a field can make old benchmarks impossible to compare. Invofox tracks schema versions and normalizes changes automatically — so your accuracy results stay valid over time. When new keys appear, we flag affected documents so you keep clear visibility into your evolving data model.
document_number issued_at tax_base total document_number issued_at tax_base total currency document_number issue_date tax_base total currency We believe accuracy metrics should be verifiable, not subjective. Every eval runs in-house with consistent parameters and transparent rules. Each customer receives both summary metrics and the raw data used to calculate them — no black boxes, no hidden assumptions.
Still have questions? Talk to us
When you visit websites, they may store or retrieve data in your browser. This storage is often necessary for the basic functionality of the website and may also be used for marketing, analytics and personalization. You can disable categories that are not strictly necessary — blocking them may affect your experience.
Read the full cookie policy →Required to enable basic website functionality. Cannot be disabled.
Used to deliver advertising that is more relevant to you, limit how often you see an ad and measure the effectiveness of campaigns.
Allow the site to remember choices you make (language, region, preferences) to provide a more personal experience.
Help us understand how the site is used and where improvements are needed. Don't identify individual visitors.