Document Parsing: GPT-4o vs Claude 3.5 vs Invofox API

Table of contents

Using GPT-4o (ChatGPT) API
Using Claude 3.5 Sonnet API
Using the Invofox API
Results & comparison

Using GPT-4o (ChatGPT) API

OpenAI’s GPT-4o can extract structured information when prompted correctly, but unlike Invofox, it cannot directly read PDF files. The text must first be pulled out of the document, using a text-extraction library such as pdfplumber for digital PDFs or an OCR tool such as Tesseract for scanned ones, and then sent to GPT-4o in an API prompt.

You will need an OpenAI API key. Create a .env file with this convention:

OPENAI_API_KEY=your_api_key
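
If the key is missing, the failure usually surfaces later as an authentication error from the API. A minimal sketch of a fail-fast check is shown here; `require_env` is a hypothetical helper (not part of python-dotenv), and it assumes `load_dotenv()` has already been called so the .env values are in the process environment.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail with a clear message.

    Hypothetical helper for illustration; not provided by any library.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value
```

In the scripts below you could then write `require_env("OPENAI_API_KEY")` in place of `os.getenv("OPENAI_API_KEY")`.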

Step 1: Set up your Python environment

Create a virtual environment to isolate dependencies:

# macOS / Linux:
python3 -m venv env
source env/bin/activate

# Windows:
python -m venv env
.\env\Scripts\activate

Step 2: Install required packages

Install the necessary libraries:

pip install pdfplumber openai python-dotenv

After installing, generate a requirements.txt file:

pip freeze > requirements.txt

Add a .gitignore that excludes the virtual environment (env/) and the .env file, so neither the dependencies nor your API key are committed.

Step 3: Extract text and parse with GPT-4o

Here’s the complete code in openai-main.py:

import pdfplumber
import openai
from dotenv import load_dotenv
import os

load_dotenv()
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text + "\n"
    return text

def parse_invoice_with_openai(invoice_text):
    prompt = (
        "Extract the following fields from this invoice text and return as a JSON object:\n"
        "- Invoice Number\n"
        "- Invoice Date\n"
        "- Due Date\n"
        "- Invoice Status (e.g. unpaid/paid)\n"
        "- Sender Name and Email\n"
        "- Recipient Name and Email\n"
        "- Items (description, quantity, rate)\n"
        "- Total Amount\n"
        "- Memo\n\n"
        "Invoice Text:\n"
        f"{invoice_text}"
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        max_tokens=500,
        temperature=0,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    pdf_path = "invoice_sample.pdf"
    invoice_text = extract_text_from_pdf(pdf_path)
    parsed_data = parse_invoice_with_openai(invoice_text)
    print(parsed_data)

The extract_text_from_pdf function uses pdfplumber to read each page and concatenate text. The parse_invoice_with_openai function sends the text to GPT-4o asking for JSON output.
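
Because the same field list is sent to both GPT-4o and Claude in this article, it can help to build the prompt from a single definition. The sketch below is one possible way to do that; `FIELDS` and `build_prompt` are names introduced here for illustration, and the resulting string is intended to match the prompt used in the scripts.

```python
# Field list copied from the prompt used in this article.
FIELDS = [
    "Invoice Number",
    "Invoice Date",
    "Due Date",
    "Invoice Status (e.g. unpaid/paid)",
    "Sender Name and Email",
    "Recipient Name and Email",
    "Items (description, quantity, rate)",
    "Total Amount",
    "Memo",
]


def build_prompt(invoice_text: str, fields=FIELDS) -> str:
    """Assemble the extraction prompt from a list of field names."""
    bullet_list = "\n".join(f"- {field}" for field in fields)
    return (
        "Extract the following fields from this invoice text "
        "and return as a JSON object:\n"
        f"{bullet_list}\n\n"
        "Invoice Text:\n"
        f"{invoice_text}"
    )
```

Keeping the fields in one list means a change to the requested schema is made once and reaches every model you compare.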

Step 4: Output

Running python openai-main.py produces JSON output with extracted fields:

{
  "Invoice Number": "2-7-25",
  "Invoice Date": "July 2, 2025",
  "Due Date": "Upon receipt",
  "Invoice Status": "UNPAID",
  "Sender Name and Email": {
    "Name": "Anmol Baranwal",
    "Email": "[email protected]"
  },
  "Recipient Name and Email": {
    "Name": "Anmol Baranwal",
    "Email": "[email protected]"
  },
  "Line Items": [
    { "Description": "Testing", "Quantity": 1, "Rate": "$50.00", "Total": "$50.00" },
    { "Description": "Development", "Quantity": 1, "Rate": "$100.00", "Total": "$100.00" },
    { "Description": "Blog", "Quantity": 1, "Rate": "$50.00", "Total": "$50.00" }
  ],
  "Subtotal": "$200.00",
  "Total Amount": "$200.00",
  "Memo or Notes": "Thank you! This is a sample invoice for testing document parsing with AI models."
}

Pros:

Easy to try and flexible
GPT-4o handles reasoning and structured data extraction well
Can correctly identify invoice fields and calculate totals

Cons:

Requires prompt engineering and output verification
JSON can be malformed or miss fields (hallucinations possible)
No built-in validation or confidence scores
Outputs vary by prompt style
Sending all text in prompts can be costly for large documents
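
Since the reply is plain text, a small parsing step can catch malformed JSON and missing fields before the data goes anywhere else. This is a minimal sketch, assuming the model may wrap its answer in a Markdown code fence; `parse_model_json` and `REQUIRED_KEYS` are hypothetical names, and the required keys are an illustrative subset taken from the sample output above.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker
REQUIRED_KEYS = {"Invoice Number", "Invoice Date", "Total Amount"}  # illustrative subset


def parse_model_json(reply: str) -> dict:
    """Parse a model reply as a JSON object, tolerating a surrounding code fence."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag)...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ...and everything from the closing fence onward.
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

This only checks that fields are present; it cannot tell whether a value was hallucinated, so spot checks against the source document are still needed.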

Cost estimate for 1–2 page invoice extraction: $0.005–$0.015, depending on prompt detail. Response time: 1–30 seconds, subject to load spikes.

Using Claude 3.5 Sonnet API

Anthropic’s Claude 3.5 Sonnet can parse structured data from text when prompted correctly. Like GPT-4o, it cannot read PDF files directly via API, so text must be extracted first.

You will need an Anthropic API key:

ANTHROPIC_API_KEY=your_api_key

Step 1: Set up environment and install packages

Create a virtual environment:

# macOS / Linux:
python3 -m venv env
source env/bin/activate

# Windows:
python -m venv env
.\env\Scripts\activate

Install required libraries:

pip install pdfplumber anthropic python-dotenv

Then export the dependencies to requirements.txt and add a .gitignore, as in the GPT-4o setup.

Step 2: Extract text and parse with Claude 3.5 Sonnet

Create anthropic-main.py:

import pdfplumber
import anthropic
import os
from dotenv import load_dotenv

load_dotenv()
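api_key = os.getenv("ANTHROPIC_API_KEY")
# Fail early if the key is missing, and avoid printing any part of it.
if not api_key:
    raise RuntimeError("ANTHROPIC_API_KEY is not set; add it to .env")
client = anthropic.Anthropic(api_key=api_key)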

def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            # Skip pages with no extractable text (e.g. scanned images).
            if page_text:
                text += page_text + "\n"
    return text

def parse_invoice_with_claude(invoice_text):
    prompt = (
        "Extract the following fields from this invoice text and return as a JSON object:\n"
        "- Invoice Number\n"
        "- Invoice Date\n"
        "- Due Date\n"
        "- Invoice Status (e.g. unpaid/paid)\n"
        "- Sender Name and Email\n"
        "- Recipient Name and Email\n"
        "- Items (description, quantity, rate)\n"
        "- Total Amount\n"
        "- Memo\n\n"
        "Invoice Text:\n"
        f"{invoice_text}"
    )

    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=500,
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )

    return response.content[0].text

if __name__ == "__main__":
    pdf_path = "invoice_sample.pdf"
    invoice_text = extract_text_from_pdf(pdf_path)
    parsed_data = parse_invoice_with_claude(invoice_text)
    print(parsed_data)

Step 3: Output

Running python anthropic-main.py produces:

{
  "invoiceNumber": "2-7-25",
  "invoiceDate": "July 2, 2025",
  "dueDate": "Upon receipt",
  "invoiceStatus": "UNPAID",
  "senderName": "Anmol Baranwal",
  "senderEmail": "[email protected]",
  "recipientName": "Anmol Baranwal",
  "recipientEmail": "[email protected]",
  "items": [
    { "description": "Testing", "quantity": 1, "rate": 50.00 },
    { "description": "Development", "quantity": 1, "rate": 100.00 },
    { "description": "Blog", "quantity": 1, "rate": 50.00 }
  ],
  "totalAmount": 200.00,
  "memo": "Thank you! This is a sample invoice for testing document parsing with AI models."
}

Pros:

Claude 3.5 is strong at understanding long text and formatting it cleanly
Can handle text and images in prompts
Sometimes handles unusual or long documents better than GPT-4o

Cons:

Requires prompt engineering like GPT
Can miss fields or hallucinate values
Returns raw JSON text without validation
Must manually extract PDF text first
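
One inexpensive check for missed or hallucinated values is to reconcile the line items against the reported total. The sketch below assumes the key names from the Claude output shown above (`items`, `quantity`, `rate`, `totalAmount`) and an invoice without tax or discounts; `line_items_total` and `total_matches` are hypothetical helper names.

```python
from decimal import Decimal


def line_items_total(items) -> Decimal:
    """Sum quantity * rate over all line items using exact decimal arithmetic."""
    return sum(
        (Decimal(str(item["quantity"])) * Decimal(str(item["rate"])) for item in items),
        Decimal("0"),
    )


def total_matches(parsed: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Return True when the reported total agrees with the line items.

    Assumes no tax, shipping, or discount lines; adjust for real invoices.
    """
    expected = line_items_total(parsed["items"])
    reported = Decimal(str(parsed["totalAmount"]))
    return abs(expected - reported) <= tolerance
```

For the sample invoice, the three items sum to 200, which agrees with the reported `totalAmount`.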

Cost estimate: $0.006–$0.018 per document. Response time: 200–300ms best case, up to 10+ seconds for larger prompts.

Using the Invofox API

Invofox is a Y Combinator-backed startup built specifically for document parsing. It uses specialized models tuned for invoices and other documents, unlike general-purpose LLMs.

Key features include:

Splitter: Automatically separates multiple documents in a single PDF (e.g., mixed invoices), grouping pages into logical sub-documents for better extraction.

Classifier: Pretrained AI model that detects document types (invoice, receipt, etc.) so each document is processed using the correct schema.

Advanced AI models with proprietary algorithms verify and autocomplete data.

Step 1: Sign up for the dashboard

Create an Invofox account, then generate an API key from the dashboard.

Step 2: Creating the request in Postman

Once you have your API key, use Postman to send documents to Invofox’s /uploads endpoint.

1. Create a new request

Open Postman and create a collection with a new request
Set method to POST
Request endpoint: https://api.invofox.com/v1/ingest/uploads

2. Set the headers

Add these headers:

accept: application/json
x-api-key: your_invofox_api_key

Do not manually set Content-Type; Postman handles it automatically with form-data.

3. Add the body (form-data)

Switch to Body tab, select form-data and add:

key: files, type: file, value: upload your invoice PDF
key: info, type: text, value:

{
  "type": "6840c4511cbcc77119347248",
  "data": {
    "companyActsLike": "issuer"
  }
}

The data field is optional for custom metadata or parsing instructions.

4. Send the request

Clicking “Send” returns a response with details:

{
  "accountId": "683edb9d7ded4695232c4979",
  "environmentId": "683edb9d7ded4695232c497b",
  "importId": "68662d83c3a0849a86a6aa30",
  "files": [
    {
      "id": "68662d83c3a0849a86a6aa33",
      "filename": "invoice_sample.pdf",
      "documentId": "68662d83c3a0849a86a6aa34"
    }
  ]
}

importId: Batch ID for tracking multiple uploads
documentId: ID of the parsed document
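
The sample response above lists a documentId for each uploaded file, so the IDs can be collected from it directly. This is a small sketch based only on that sample; whether documentId is always present at upload time is an assumption, and the Python script later in this article reads the IDs from the imports endpoint instead. `document_ids_from_upload` is a hypothetical helper name.

```python
def document_ids_from_upload(upload_result: dict) -> list:
    """Collect documentId values from an upload response shaped like the sample."""
    return [
        file_entry["documentId"]
        for file_entry in upload_result.get("files", [])
        if file_entry.get("documentId")  # skip entries without an ID yet
    ]
```

An empty list means no IDs were returned yet, in which case looking them up through the import is the fallback.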

Step 3: Get parsed document

Make a GET request to https://api.invofox.com/documents/{documentId}, replacing {documentId} with the value from the upload response, and send these headers:

accept: application/json
x-api-key: your_invofox_api_key

The response includes the original document image and extracted data with many additional fields compared to GPT or Claude outputs.

Python code using Invofox API

Install the requests library:

pip install requests

Create invofox-main.py:

import requests
import os
import json
from dotenv import load_dotenv
import time

load_dotenv()

API_BASE = "https://api.invofox.com"
API_KEY = os.getenv("INVOFOX_API_KEY")
PDF_PATH = "invoice_sample.pdf"

headers = {"accept": "application/json", "x-api-key": API_KEY}

with open(PDF_PATH, "rb") as f:
    files = {"files": f}
    info = {"type": "6840c4511cbcc77119347248", "data": {"companyActsLike": "issuer"}}
    data = {"info": json.dumps(info)}
    resp_upload = requests.post(
        f"{API_BASE}/v1/ingest/uploads", headers=headers, files=files, data=data
    )

resp_upload.raise_for_status()  # fail early on HTTP errors such as an invalid API key
upload_result = resp_upload.json()
print("Upload response:", upload_result)

import_id = upload_result.get("importId")
if not import_id:
    raise ValueError("Import ID not found in upload response.")

# wait a moment for processing
time.sleep(2)

resp_import = requests.get(f"{API_BASE}/v1/ingest/imports/{import_id}", headers=headers)
import_info = resp_import.json()
print("Import info:", import_info)

files_info = import_info.get("files", [])
if not files_info or not files_info[0].get("documentIds"):
    raise ValueError("Document IDs not found in import info.")

document_id = files_info[0]["documentIds"][0]
print("Document ID:", document_id)

# wait for parsing to finish; a fixed delay keeps the example simple
time.sleep(20)

resp_get = requests.get(f"{API_BASE}/documents/{document_id}", headers=headers)
parsed_doc = resp_get.json()
print("Parsed Document Data:")
print(json.dumps(parsed_doc, indent=2))

The three main API endpoints used:

POST /v1/ingest/uploads → Uploads a PDF with metadata, returns importId
GET /v1/ingest/imports/{importId} → Retrieves documentIds from importId
GET /documents/{documentId} → Retrieves fully parsed invoice data

Running python invofox-main.py returns the parsed document with all fields correctly extracted and validated.
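
The fixed time.sleep delays in the script are a simplification: too short and the data is not ready, too long and every run waits needlessly. A generic polling loop is one alternative. The sketch below is not part of the Invofox API; `poll_until` is a hypothetical helper, and the caller supplies both the fetch function and the readiness check.

```python
import time


def poll_until(fetch, is_ready, timeout=60.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_ready(result) is true or timeout expires.

    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)
```

For example, the first wait could become a poll on the imports endpoint with `is_ready=lambda info: bool(info.get("files") and info["files"][0].get("documentIds"))`, which mirrors the check the script already performs. The article does not show which field signals that a document has finished parsing, so the readiness check for the final GET would need to come from the Invofox documentation.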

Results & comparison

API call methods:

GPT-4o/Claude → Send text with prompt
Invofox → Use API or upload file (image/PDF) in bulk

Setup:

GPT/Claude → Requires prompt engineering code
Invofox → Minimal code, no prompt needed

Validation:

GPT/Claude → Manual verification required
Invofox → Built-in validation and confidence scores

Performance:

GPT/Claude → Limited by token/window size
Invofox → Handles multi-page docs via backend OCR and AI

Key findings:

ChatGPT (GPT-4o): Good at parsing known fields if prompted clearly. You get JSON but must parse/clean it. Errors occur if prompts are unclear.
Claude 3.5 (Sonnet): Similar to GPT-4o. Handles invoice fields well, sometimes better at recognizing unfamiliar terms. Still requires prompt massaging.
Invofox API: Returns fully parsed invoice JSON out-of-the-box. All fields correctly extracted and validated with exactly the needed schema and no extra coding.
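
The two sample outputs above name the same fields differently ("Invoice Number" versus "invoiceNumber"), which is part of the cleanup work mentioned for the LLM approaches. A minimal sketch of one way to normalize key casing is shown here; `to_snake_case` and `normalize_keys` are hypothetical helpers, not something either API provides.

```python
import re


def to_snake_case(key: str) -> str:
    """Convert 'Invoice Number' or 'invoiceNumber' to 'invoice_number'."""
    # Insert a space at each camelCase boundary (lowercase/digit then uppercase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", key)
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^A-Za-z0-9]+", "_", spaced).strip("_").lower()


def normalize_keys(value):
    """Recursively apply to_snake_case to every dictionary key."""
    if isinstance(value, dict):
        return {to_snake_case(k): normalize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_keys(v) for v in value]
    return value
```

This handles casing only. Genuinely different names, such as "Line Items" versus "items" or "Memo or Notes" versus "memo", would still need an explicit alias map.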

Comparison table

Feature	Invofox	GPT-4o	Claude 3.5 Sonnet
No-code setup	✅	❌ (code/prompt required)	❌ (code/prompt required)
Direct PDF/image upload	✅	❌ (text-only)	❌ (text-only)
Custom document type support	✅ (custom IDs)	✅ (via prompt)	✅ (via prompt)
Fixed JSON schema	✅	❌ (varies by prompt)	❌ (varies by prompt)
Consistent field naming	✅	❌ (“Line Items” variants)	❌
Built-in validation	✅	❌	❌
Confidence scores	✅	❌	❌
Auto-completion	✅	❌	❌
Human review optional	✅	❌	❌
Workflow orchestration	✅ (API/hooks/dashboard)	❌	❌
Multi-language OCR	✅	❌ (needs external OCR)	❌ (needs external OCR)
Multi-page document support	✅	❌	❌
Processing speed	✅ (fast, <5s)	✅ (fastest)	⚠️ (slower)

Cost & execution time benchmarks

Tool	Cost Structure	Typical Cost (per doc)	Typical Execution Time	Notes
Invofox	Per document, usage-based, no fixed fees	Custom / Not public. (Free trial, then custom)	<5s per document	Built for production, price via sales
GPT-4o	$2.50/million input tokens, $10/million output	$0.005–$0.015 (1,000–2,000 tokens)	1–30s per request, can vary by load	Price = (input + output tokens) × pricing
Claude 3.5 Sonnet	$3/million input tokens, $15/million output	$0.006–$0.018 (1,000–2,000 tokens)	~200–300ms best-case, up to 10s+	API is fast, but batch/huge prompts increase time
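
The note "Price = (input + output tokens) × pricing" is shorthand: input and output tokens are billed at different rates, so each count is multiplied by its own rate. A small sketch of that arithmetic follows, using the per-million prices from the table; the dictionary keys and the 1,500 input / 500 output token split in the usage note are assumptions for illustration.

```python
# USD per million tokens, taken from the table above.
PRICES_PER_MILLION = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    rates = PRICES_PER_MILLION[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000
```

With an assumed 1,500 input tokens and 500 output tokens, this gives about $0.00875 for GPT-4o and $0.012 for Claude 3.5 Sonnet, both inside the ranges listed in the table.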

Also consider the ongoing cost of upgrading language models. When a new version is released, teams often need to benchmark it, retest prompts, adjust schemas, and modify parsing logic. These hidden maintenance costs add up. With Invofox, that work is not required on your side.

Bottom line

For quick experiments or one-off tasks, GPT-4o (ChatGPT API) or Claude 3.5 Sonnet work reasonably well once you craft a suitable prompt, and both produce structured JSON output competently.

However, for reliable production-grade parsing of invoices or receipts, the Invofox API is superior. It’s specifically built for documents using advanced proprietary models and continual feedback loops.

Complete code is available in the GitHub Repository.
