
One file in.
Many documents out.

Invofox splits and classifies mixed document streams automatically. No manual sorting. No preprocessing. No fragile upload rules.

bundle.pdf
  • Mortgage application
    5 pages
  • Payslip
    2 pages
  • Tax return
    1 page
  • Bank statement
    4 pages


Messy inputs break automation.

You don't control what gets uploaded — your users do. Most workflows don't start with one perfectly labeled file. Customers send whatever they have:

  • Bundled PDFs: Multiple documents stuffed into a single file.
  • Mixed types: Invoices, payslips and BoLs arriving together.
  • Unknown layouts: Forms in shapes you didn't anticipate.
  • Wrong order: Pages out of sequence, with duplicates and orphans.

Before any extraction can happen, teams manually split, classify and route — slowing workflows and introducing errors.

Automatic separation. Intelligent identification.

Combining our splitter and classifier, Invofox helps you separate, identify and route documents — even when everything arrives bundled. No manual sorting or rigid upload rules.

How teams use our splitter + classifier in production.

One uploaded file. Many embedded sub-documents. Fully automated processing.

  1. Ingest files

    Upload single PDFs or batched files containing multiple documents.

  2. Detect document boundaries

    Splitter analyzes page structure and identifies where each document begins and ends, even when layouts are unknown or inconsistent.

  3. Split the file

    Once boundaries are detected, the file is split into independent PDFs, each of which is then processed separately.

  4. Classify each document

    Classifier automatically identifies document types based on content, layout and target schema.

  5. Route to the right workflow

    Each document is processed using the appropriate extraction and validation logic.
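
The boundary-detection and splitting steps above can be pictured with a minimal sketch. It assumes the detector emits one boolean per page marking whether that page starts a new document. The flag list and the `split_ranges` helper are hypothetical illustrations, not Invofox's API or its actual detection method.

```python
from typing import List, Tuple


def split_ranges(starts_new_doc: List[bool]) -> List[Tuple[int, int]]:
    """Turn per-page boundary flags into inclusive, 1-based page ranges.

    Hypothetical helper: starts_new_doc[i] is True when page i + 1 begins
    a new document. The first page always opens a document.
    """
    ranges: List[Tuple[int, int]] = []
    start = 1
    for page, is_start in enumerate(starts_new_doc, start=1):
        if is_start and page > 1:
            ranges.append((start, page - 1))  # close the previous document
            start = page
    if starts_new_doc:
        ranges.append((start, len(starts_new_doc)))  # close the last document
    return ranges


# A 12-page bundle with made-up boundaries at pages 1, 6, 8 and 9.
flags = [p in (1, 6, 8, 9) for p in range(1, 13)]
print(split_ranges(flags))  # [(1, 5), (6, 7), (8, 8), (9, 12)]
```

Each resulting range would then be written out as its own PDF and handed to the classifier.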

A single upload, many document types.

In real workflows, one uploaded file may contain a mortgage application, multiple payslips, an invoice and a BoL. Splitter separates each document. Classifier identifies the type. Every document gets routed to the right extraction workflow.

splitter.run('bundle.pdf')
// input: one file, 12 pages
// output: 4 documents detected
  • Mortgage application
    5 pages
  • Payslip × 2
    2 pages
  • Tax return
    1 page
  • Bank statement
    4 pages
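
A caller consuming a result like this might sanity-check that the split accounts for every input page. The sketch below is one way to do that. The `DetectedDocument` structure and `covers_all_pages` helper are assumptions made for illustration. Only the type names and page counts come from the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DetectedDocument:
    doc_type: str  # label assigned by the classifier
    pages: int     # number of pages in the split-out PDF


def covers_all_pages(docs: List[DetectedDocument], input_pages: int) -> bool:
    """Return True when the detected documents account for every input page."""
    return sum(d.pages for d in docs) == input_pages


# Values from the example above; the data structure itself is hypothetical.
detected = [
    DetectedDocument("Mortgage application", 5),
    DetectedDocument("Payslip", 2),
    DetectedDocument("Tax return", 1),
    DetectedDocument("Bank statement", 4),
]
print(len(detected), covers_all_pages(detected, input_pages=12))  # 4 True
```

Here 5 + 2 + 1 + 4 = 12, so no page of the bundle is dropped or counted twice.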

Split. Classify. Extract. Route.

Four pieces of the same pipeline, designed to work together end to end.

  • 01 Splitter

    Automatically separates multi-document files into individual documents, including unstructured or inconsistently formatted ones.

  • 02 Classifier

    Identifies document types independently, eliminating manual labeling and fragile upload rules.

  • 03 Extraction

    Extracts required data from each document using the right logic for its type.

  • 04 Routing

    Routes extracted data to the correct downstream workflow without changing how users upload.
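
The routing piece can be sketched as a lookup from document type to handler. Everything here is hypothetical: the handler names, the `ROUTES` table, and the fallback to manual review for unrecognized types are assumptions for illustration, not documented Invofox behavior.

```python
from typing import Callable, Dict


def extract_payslip(path: str) -> str:
    return f"payslip workflow <- {path}"


def extract_invoice(path: str) -> str:
    return f"invoice workflow <- {path}"


def manual_review(path: str) -> str:
    return f"manual review <- {path}"


# Hypothetical routing table: classifier label -> workflow handler.
ROUTES: Dict[str, Callable[[str], str]] = {
    "payslip": extract_payslip,
    "invoice": extract_invoice,
}


def route(doc_type: str, path: str) -> str:
    """Send a document to its handler; unknown types go to manual review."""
    handler = ROUTES.get(doc_type.lower(), manual_review)
    return handler(path)


print(route("Payslip", "doc_2.pdf"))  # payslip workflow <- doc_2.pdf
```

Keeping the table separate from the handlers means a new document type needs only one new entry, and the upload path stays untouched.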

Frequently asked questions.

Question: What file types can Splitter handle?

Answer: Single or multi-document PDFs, including bundles with mixed types, varying page orientations and inconsistent quality. Splitter analyzes page structure to detect boundaries even when layouts are unknown.

Still have questions? Talk to us