One file in.
Many documents out.

Invofox splits and classifies mixed document streams automatically. No manual sorting. No preprocessing. No fragile upload rules.

Start for free Read the docs

bundle.pdf: Mortgage application (5 pages), Payslip (2 pages), Tax return (1 page), Bank statement (4 pages)

Messy inputs break automation.

You don't control what gets uploaded — your users do. Most workflows don't start with one perfectly labeled file. Customers send whatever they have:

Bundled PDFs: multiple documents stuffed into a single file.
Mixed types: invoices, payslips and bills of lading (BoLs) arriving together.
Unknown layouts: forms in shapes you didn't anticipate.
Wrong order: pages out of sequence, with duplicates and orphans.

Before any extraction can happen, teams manually split, classify and route — slowing workflows and introducing errors.

Automatic separation. Intelligent identification.

Combining our splitter and classifier, Invofox helps you separate, identify and route documents — even when everything arrives bundled. No manual sorting or rigid upload rules.

How teams use our splitter + classifier in production.

One uploaded file. Many embedded sub-documents. Fully automated processing.

Step 01
Ingest files

Upload single PDFs or batched files containing multiple documents.
Step 02
Detect document boundaries

Splitter analyzes page structure and identifies where each document begins and ends — even when layouts are unknown or inconsistent.
Step 03
Split the file

Once boundaries are detected, the file is split into independent PDFs, each of which is then processed separately.
Step 04
Classify each document

Classifier automatically identifies document types based on content, layout and target schema.
Step 05
Route to the right workflow

Each document is processed using the appropriate extraction and validation logic.
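
The five steps above can be sketched as a small pipeline. This is a minimal illustration and not Invofox's API: `detect_boundaries`, `classify`, `workflow`, and the workflow names are hypothetical stand-ins, and each page is represented by a plain type label so that the example runs on its own without any models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Document:
    pages: list[str]            # stand-in for real PDF pages
    doc_type: str = "unknown"


def detect_boundaries(pages: list[str]) -> list[int]:
    """Step 02 (stand-in): a page starts a new document when its label changes."""
    return [i for i, page in enumerate(pages) if i == 0 or page != pages[i - 1]]


def split(pages: list[str], starts: list[int]) -> list[Document]:
    """Step 03: cut the page list at each detected start index."""
    ends = starts[1:] + [len(pages)]
    return [Document(pages[s:e]) for s, e in zip(starts, ends)]


def classify(doc: Document) -> str:
    """Step 04 (stand-in): read the type off the first page label."""
    return doc.pages[0]


def run_pipeline(pages, handlers, fallback):
    """Steps 01-05: ingest pages, split, classify, then route each document."""
    routed = []
    for doc in split(pages, detect_boundaries(pages)):
        doc.doc_type = classify(doc)
        handler = handlers.get(doc.doc_type, fallback)
        routed.append(handler(doc))
    return routed


def workflow(name: str) -> Callable[[Document], str]:
    """Hypothetical downstream workflow that just reports what it received."""
    return lambda doc: f"{name}: {doc.doc_type} ({len(doc.pages)} pages)"


# A 12-page bundle shaped like the example on this page.
bundle = (["mortgage_application"] * 5 + ["payslip"] * 2
          + ["tax_return"] + ["bank_statement"] * 4)
handlers = {
    "mortgage_application": workflow("lending"),
    "payslip": workflow("income"),
    "tax_return": workflow("income"),
    "bank_statement": workflow("banking"),
}
routed = run_pipeline(bundle, handlers, fallback=workflow("manual_review"))
for line in routed:
    print(line)
```

Keeping each step as a separate function mirrors the idea that the stages can be called together or on their own; unknown types fall through to a manual-review handler instead of failing.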

A single upload, many document types.

In real workflows, one uploaded file may contain a mortgage application, multiple payslips, an invoice and a BoL. Splitter separates each document. Classifier identifies the type. Every document gets routed to the right extraction workflow.

splitter.run('bundle.pdf')
// input: one file, 12 pages
// output: 4 documents detected

Mortgage application: 5 pages
Payslip × 2: 2 pages
Tax return: 1 page
Bank statement: 4 pages

Split. Classify. Extract. Route.

Four pieces of the same pipeline, designed to work together end to end.

01
Splitter

Automatically separates multi-document files into individual documents — even unstructured, inconsistently formatted ones.
02
Classifier

Identifies document types independently, eliminating manual labeling and fragile upload rules.
03
Extraction

Extracts required data from each document using the right logic for its type.
04
Routing

Routes extracted data to the correct downstream workflow — without changing how users upload.

Frequently asked questions.

~/invofox / faq.json

// questions 6

{
  "question": "What file types can Splitter handle?",
  "answer": "Single or multi-document PDFs, including bundles with mixed types, varying page orientations and inconsistent quality. Splitter analyzes page structure to detect boundaries even when layouts are unknown."
}

{
  "question": "How are document boundaries detected?",
  "answer": "Splitter uses AI models trained on real-world bundles to analyze visual layout, header patterns and content shifts between pages. No per-template configuration required."
}

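One way to picture the output of boundary detection is a per-page decision ("does a new document start here?") that is then folded into page ranges. The sketch below assumes such per-page flags already exist; the flag list and the `to_page_ranges` helper are hypothetical and say nothing about how Invofox's models work internally.

```python
def to_page_ranges(starts_new_document: list[bool]) -> list[tuple[int, int]]:
    """Fold per-page 'new document starts here' flags into 1-based inclusive ranges."""
    ranges = []
    start = None
    for page_number, is_start in enumerate(starts_new_document, start=1):
        # The first page always opens a document, even if its flag is False.
        if is_start or start is None:
            if start is not None:
                ranges.append((start, page_number - 1))
            start = page_number
    if start is not None:
        ranges.append((start, len(starts_new_document)))
    return ranges


# Flags for a 12-page bundle: documents of 5, 2, 1 and 4 pages.
flags = [True, False, False, False, False,
         True, False,
         True,
         True, False, False, False]
ranges = to_page_ranges(flags)
print(ranges)  # [(1, 5), (6, 7), (8, 8), (9, 12)]
```

Because every page falls into exactly one range, the ranges are contiguous and cover the whole file, which makes a simple sanity check possible before splitting.
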
{
  "question": "Do we need to train the Classifier?",
  "answer": "Out of the box, Classifier recognizes the most common business document types. For custom document types, we offer per-tenant fine-tuning with a small sample set."
}

{
  "question": "How is the result returned?",
  "answer": "Both Splitter and Classifier return their results via the same asynchronous API: document IDs, page ranges, types and confidence scores per document, plus webhooks when processing completes."
}

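To make the shape of such a result concrete, here is a sketch of consuming it on the receiving side. The payload below is invented for illustration: the field names (`documents`, `id`, `type`, `pages`, `confidence`), the sample values, and the 0.8 review threshold are all assumptions, not Invofox's documented schema. Check the API docs for the real format.

```python
import json
from dataclasses import dataclass

# Hypothetical payload: every field name and value here is made up for illustration.
SAMPLE_PAYLOAD = """
{
  "documents": [
    {"id": "doc_1", "type": "mortgage_application", "pages": [1, 5], "confidence": 0.98},
    {"id": "doc_2", "type": "payslip", "pages": [6, 7], "confidence": 0.95},
    {"id": "doc_3", "type": "tax_return", "pages": [8, 8], "confidence": 0.62},
    {"id": "doc_4", "type": "bank_statement", "pages": [9, 12], "confidence": 0.91}
  ]
}
"""


@dataclass(frozen=True)
class DetectedDocument:
    doc_id: str
    doc_type: str
    first_page: int
    last_page: int
    confidence: float


def parse_result(raw: str) -> list[DetectedDocument]:
    """Turn the (assumed) JSON payload into typed records."""
    docs = []
    for item in json.loads(raw)["documents"]:
        first_page, last_page = item["pages"]
        docs.append(DetectedDocument(item["id"], item["type"],
                                     first_page, last_page,
                                     float(item["confidence"])))
    return docs


def needs_review(doc: DetectedDocument, threshold: float = 0.8) -> bool:
    """Flag low-confidence documents for a human check (threshold is arbitrary)."""
    return doc.confidence < threshold


documents = parse_result(SAMPLE_PAYLOAD)
flagged = [d.doc_id for d in documents if needs_review(d)]
print(flagged)  # ['doc_3']
```

In a webhook-driven setup, a function like `parse_result` would typically run inside the handler that receives the completion callback, with flagged documents sent to a review queue rather than straight to extraction.
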
{
  "question": "Can we use Splitter without Classifier (or vice versa)?",
  "answer": "Yes. Both work standalone or together. Most teams use Splitter + Classifier + Extraction as one pipeline, but you can call any subset independently."
}

{
  "question": "How does it scale on large bundles?",
  "answer": "Splitter handles bundles with hundreds of pages. The pipeline is fully async, so your system gets the splitting result via webhook regardless of bundle size."
}

Still have questions? Talk to us

Automate document routing at scale.

Stop fixing documents before you can process them.

Start for free Book a demo
