Comparison diagram between GPT-4o, Claude Sonnet 3.5 and Invofox API for document parsing

Document Parsing Using GPT-4o API vs Claude 3.5 Sonnet API vs Invofox API (with Code Samples)

By Anmol Baranwal · 8 min read
Using GPT-4o (ChatGPT) API

OpenAI’s GPT-4o can extract structured information when prompted correctly, but unlike Invofox, it cannot directly read PDF files. Text must first be extracted, using a library like pdfplumber for PDFs that have a text layer or an OCR tool like Tesseract for scanned pages, and then sent to GPT-4o in the API prompt.

You will need an OpenAI API key. Create a .env file with this convention:

OPENAI_API_KEY=your_api_key

Step 1: Set up your Python environment

Create a virtual environment to isolate dependencies:

# macOS / Linux:
python3 -m venv env
source env/bin/activate

# Windows:
python -m venv env
.\env\Scripts\activate

Step 2: Install required packages

Install the necessary libraries:

pip install pdfplumber openai python-dotenv

After installing, generate a requirements.txt file:

pip freeze > requirements.txt

Add a .gitignore that lists env/ and .env so that neither the virtual environment nor your API key gets committed.

Step 3: Extract text and parse with GPT-4o

Here’s the complete code in openai-main.py:

import pdfplumber
import openai
from dotenv import load_dotenv
import os

load_dotenv()
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text + "\n"
    return text

def parse_invoice_with_openai(invoice_text):
    prompt = (
        "Extract the following fields from this invoice text and return as a JSON object:\n"
        "- Invoice Number\n"
        "- Invoice Date\n"
        "- Due Date\n"
        "- Invoice Status (e.g. unpaid/paid)\n"
        "- Sender Name and Email\n"
        "- Recipient Name and Email\n"
        "- Items (description, quantity, rate)\n"
        "- Total Amount\n"
        "- Memo\n\n"
        "Invoice Text:\n"
        f"{invoice_text}"
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        max_tokens=500,
        temperature=0,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    pdf_path = "invoice_sample.pdf"
    invoice_text = extract_text_from_pdf(pdf_path)
    parsed_data = parse_invoice_with_openai(invoice_text)
    print(parsed_data)

The extract_text_from_pdf function uses pdfplumber to read each page and concatenate text. The parse_invoice_with_openai function sends the text to GPT-4o asking for JSON output.
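
One way to reduce malformed replies is to ask the API for JSON explicitly. The sketch below is not part of the original script: it only assembles the request arguments, adding `response_format={"type": "json_object"}`, a Chat Completions option that models such as gpt-4o accept. The helper name `build_json_request` and the shortened prompt are hypothetical. JSON mode is meant to produce syntactically valid JSON (barring truncation at `max_tokens`); it does not guarantee that every requested field is present.

```python
def build_json_request(invoice_text, model="gpt-4o"):
    """Return keyword arguments for client.chat.completions.create().

    Hypothetical helper: it only builds the arguments, so it can be
    inspected without making a network call.
    """
    prompt = (
        "Extract the invoice fields and return them as a JSON object.\n\n"
        f"Invoice Text:\n{invoice_text}"
    )
    return {
        "model": model,
        "messages": [
            # JSON mode expects the word "JSON" to appear in the messages.
            {"role": "system", "content": "You reply only with valid JSON."},
            {"role": "user", "content": prompt},
        ],
        "response_format": {"type": "json_object"},
        "max_tokens": 500,
        "temperature": 0,
    }


request_kwargs = build_json_request("Invoice Number: 2-7-25")
print(request_kwargs["response_format"])
# With the client from the script above:
# response = client.chat.completions.create(**request_kwargs)
```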

Step 4: Output

Running python openai-main.py produces JSON output with extracted fields:

{
  "Invoice Number": "2-7-25",
  "Invoice Date": "July 2, 2025",
  "Due Date": "Upon receipt",
  "Invoice Status": "UNPAID",
  "Sender Name and Email": {
    "Name": "Anmol Baranwal",
    "Email": "hi@anmolbaranwal.com"
  },
  "Recipient Name and Email": {
    "Name": "Anmol Baranwal",
    "Email": "anmolbaranwal09@gmail.com"
  },
  "Line Items": [
    { "Description": "Testing", "Quantity": 1, "Rate": "$50.00", "Total": "$50.00" },
    { "Description": "Development", "Quantity": 1, "Rate": "$100.00", "Total": "$100.00" },
    { "Description": "Blog", "Quantity": 1, "Rate": "$50.00", "Total": "$50.00" }
  ],
  "Subtotal": "$200.00",
  "Total Amount": "$200.00",
  "Memo or Notes": "Thank you! This is a sample invoice for testing document parsing with AI models."
}

Pros:

  • Easy to try and flexible
  • GPT-4o is strong at reasoning and structured data extraction
  • Can correctly identify invoice fields and calculate totals

Cons:

  • Requires prompt engineering and output verification
  • JSON can be malformed or miss fields (hallucinations possible)
  • No built-in validation or confidence scores
  • Outputs vary by prompt style
  • Sending all text in prompts can be costly for large documents
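
Because the reply is plain text that may be malformed or incomplete, it helps to parse it defensively before using it. The following is a minimal sketch, assuming the model sometimes wraps its JSON in a Markdown code fence; the function name `parse_model_json` and the choice of required keys are illustrative, not part of the original script.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker
REQUIRED_KEYS = {"Invoice Number", "Invoice Date", "Total Amount"}  # assumed subset


def parse_model_json(raw_text, required_keys=REQUIRED_KEYS):
    """Parse a model reply into a dict and list any required keys it lacks."""
    text = raw_text.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level.")
    missing = sorted(set(required_keys) - set(data))
    return data, missing


reply = FENCE + 'json\n{"Invoice Number": "2-7-25", "Total Amount": "$200.00"}\n' + FENCE
data, missing = parse_model_json(reply)
print(data["Invoice Number"], missing)  # 2-7-25 ['Invoice Date']
```

In the script above, `parse_model_json(parsed_data)` could replace the final `print`, so that a missing field or a decoding error is caught before the data reaches downstream code.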

Cost estimate for a 1–2 page invoice: $0.005–$0.015, depending on prompt detail. Response time: 1–30 seconds, depending on load.

Using Claude 3.5 Sonnet API

Anthropic’s Claude 3.5 Sonnet can parse structured data from text when prompted correctly. Like GPT-4o, it cannot read PDF files directly via API, so text must be extracted first.

You will need an Anthropic API key:

ANTHROPIC_API_KEY=your_api_key

Step 1: Set up environment and install packages

Create a virtual environment:

# macOS / Linux:
python3 -m venv env
source env/bin/activate

# Windows:
python -m venv env
.\env\Scripts\activate

Install required libraries:

pip install pdfplumber anthropic python-dotenv

Then export the dependencies to requirements.txt and add a .gitignore, as in the GPT-4o setup.

Step 2: Extract text and parse with Claude 3.5 Sonnet

Create anthropic-main.py:

import pdfplumber
import anthropic
import os
from dotenv import load_dotenv

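load_dotenv()
api_key = os.getenv("ANTHROPIC_API_KEY")
if not api_key:
    # Fail early with a clear message instead of printing part of the key.
    raise RuntimeError("ANTHROPIC_API_KEY is not set; add it to your .env file.")
client = anthropic.Anthropic(api_key=api_key)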

def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            # extract_text() can return None or "" for pages without text
            if page_text:
                text += page_text + "\n"
    return text

def parse_invoice_with_claude(invoice_text):
    prompt = (
        "Extract the following fields from this invoice text and return as a JSON object:\n"
        "- Invoice Number\n"
        "- Invoice Date\n"
        "- Due Date\n"
        "- Invoice Status (e.g. unpaid/paid)\n"
        "- Sender Name and Email\n"
        "- Recipient Name and Email\n"
        "- Items (description, quantity, rate)\n"
        "- Total Amount\n"
        "- Memo\n\n"
        "Invoice Text:\n"
        f"{invoice_text}"
    )

    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=500,
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )

    return response.content[0].text

if __name__ == "__main__":
    pdf_path = "invoice_sample.pdf"
    invoice_text = extract_text_from_pdf(pdf_path)
    parsed_data = parse_invoice_with_claude(invoice_text)
    print(parsed_data)

Step 3: Output

Running python anthropic-main.py produces:

{
  "invoiceNumber": "2-7-25",
  "invoiceDate": "July 2, 2025",
  "dueDate": "Upon receipt",
  "invoiceStatus": "UNPAID",
  "senderName": "Anmol Baranwal",
  "senderEmail": "hi@anmolbaranwal.com",
  "recipientName": "Anmol Baranwal",
  "recipientEmail": "anmolbaranwal09@gmail.com",
  "items": [
    { "description": "Testing", "quantity": 1, "rate": 50.00 },
    { "description": "Development", "quantity": 1, "rate": 100.00 },
    { "description": "Blog", "quantity": 1, "rate": 50.00 }
  ],
  "totalAmount": 200.00,
  "memo": "Thank you! This is a sample invoice for testing document parsing with AI models."
}
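
Given the same prompt, GPT-4o returned keys such as "Invoice Number" and "Line Items", while Claude returned "invoiceNumber" and "items". A small normalization step makes the two outputs easier to compare. This is a sketch under the assumption that snake_case is the target style; the helper names are hypothetical.

```python
import re


def to_snake_case(key):
    """Convert 'Invoice Number' or 'invoiceNumber' to 'invoice_number'."""
    # Insert a space before each capital that follows a lowercase letter or digit.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", key)
    # Collapse runs of non-alphanumeric characters into single underscores.
    return re.sub(r"[^A-Za-z0-9]+", "_", spaced).strip("_").lower()


def normalize_keys(value):
    """Recursively rename dict keys in parsed JSON (dicts, lists, scalars)."""
    if isinstance(value, dict):
        return {to_snake_case(k): normalize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


gpt_style = {"Invoice Number": "2-7-25", "Line Items": [{"Description": "Blog"}]}
claude_style = {"invoiceNumber": "2-7-25", "items": [{"description": "Blog"}]}
print(normalize_keys(gpt_style))
print(normalize_keys(claude_style))
```

Normalizing case does not reconcile different words: "line_items" and "items" still differ after this step, so synonyms would need an explicit mapping.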

Pros:

  • Claude 3.5 Sonnet is strong at understanding long text and formatting it cleanly
  • Can handle text and images in prompts
  • Sometimes handles unusual or long documents better than GPT-4o

Cons:

  • Requires prompt engineering like GPT
  • Can miss fields or hallucinate values
  • Returns raw JSON text without validation
  • Must manually extract PDF text first
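
Since the reply arrives without validation, one inexpensive sanity check is to confirm that the line items add up to the stated total. The sketch below assumes the key names from the Claude output shown above (`items`, `quantity`, `rate`, `totalAmount`); the function name is hypothetical, and the check ignores taxes and discounts.

```python
from decimal import Decimal


def items_match_total(invoice):
    """Return True when the sum of quantity * rate equals totalAmount."""
    computed = sum(
        (
            Decimal(str(item["quantity"])) * Decimal(str(item["rate"]))
            for item in invoice["items"]
        ),
        Decimal("0"),
    )
    return computed == Decimal(str(invoice["totalAmount"]))


invoice = {
    "items": [
        {"description": "Testing", "quantity": 1, "rate": 50.00},
        {"description": "Development", "quantity": 1, "rate": 100.00},
        {"description": "Blog", "quantity": 1, "rate": 50.00},
    ],
    "totalAmount": 200.00,
}
print(items_match_total(invoice))  # True: 50 + 100 + 50 = 200
```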

Cost estimate: $0.006–$0.018 per document. Response time: 200–300ms best case, up to 10+ seconds for larger prompts.

Using the Invofox API

Invofox is a Y Combinator-backed startup built specifically for document parsing. It uses specialized models tuned for invoices and other documents, unlike general-purpose LLMs.

Key features include:

Splitter: Automatically separates multiple documents in a single PDF (e.g., mixed invoices), grouping pages into logical sub-documents for better extraction.

Classifier: Pretrained AI model that detects document types (invoice, receipt, etc.) so each document is processed using the correct schema.

Advanced AI models with proprietary algorithms verify and autocomplete data.

Step 1: Sign up for the dashboard

Create an account on the Invofox dashboard and generate an API key.

Step 2: Creating the request in Postman

Once you have your API key, use Postman to send documents to Invofox’s /v1/ingest/uploads endpoint.

1. Create a new request

  • Open Postman and create a collection with a new request
  • Set method to POST
  • Request endpoint: https://api.invofox.com/v1/ingest/uploads

2. Set the headers

Add these headers:

  • accept: application/json
  • x-api-key: your_invofox_api_key

Do not manually set Content-Type; Postman handles it automatically with form-data.

3. Add the body (form-data)

Switch to Body tab, select form-data and add:

  • key: files, type: file, value: upload your invoice PDF
  • key: info, type: text, value:
{
  "type": "6840c4511cbcc77119347248",
  "data": {
    "companyActsLike": "issuer"
  }
}

The data field is optional; use it for custom metadata or parsing instructions.

4. Send the request

Clicking “Send” returns a response with details:

{
  "accountId": "683edb9d7ded4695232c4979",
  "environmentId": "683edb9d7ded4695232c497b",
  "importId": "68662d83c3a0849a86a6aa30",
  "files": [
    {
      "id": "68662d83c3a0849a86a6aa33",
      "filename": "invoice_sample.pdf",
      "documentId": "68662d83c3a0849a86a6aa34"
    }
  ]
}

  • importId: Batch ID for tracking multiple uploads
  • documentId: ID of the parsed document

Step 3: Get parsed document

Make a GET request to https://api.invofox.com/documents/{documentId} with headers:

  • accept: application/json
  • x-api-key: your_invofox_api_key

The response includes the original document image and extracted data with many additional fields compared to GPT or Claude outputs.

Python code using Invofox API

Install the requests library:

pip install requests

Create invofox-main.py:

import requests
import os
import json
from dotenv import load_dotenv
import time

load_dotenv()

API_BASE = "https://api.invofox.com"
API_KEY = os.getenv("INVOFOX_API_KEY")
PDF_PATH = "invoice_sample.pdf"

headers = {"accept": "application/json", "x-api-key": API_KEY}

with open(PDF_PATH, "rb") as f:
    files = {"files": f}
    info = {"type": "6840c4511cbcc77119347248", "data": {"companyActsLike": "issuer"}}
    data = {"info": json.dumps(info)}
    resp_upload = requests.post(
        f"{API_BASE}/v1/ingest/uploads", headers=headers, files=files, data=data
    )

resp_upload.raise_for_status()  # stop early on an HTTP error status
upload_result = resp_upload.json()
print("Upload response:", upload_result)

import_id = upload_result.get("importId")
if not import_id:
    raise ValueError("Import ID not found in upload response.")

# wait a moment for processing
time.sleep(2)

resp_import = requests.get(f"{API_BASE}/v1/ingest/imports/{import_id}", headers=headers)
import_info = resp_import.json()
print("Import info:", import_info)

files_info = import_info.get("files", [])
if not files_info or not files_info[0].get("documentIds"):
    raise ValueError("Document IDs not found in import info.")

document_id = files_info[0]["documentIds"][0]
print("Document ID:", document_id)

time.sleep(20)

resp_get = requests.get(f"{API_BASE}/documents/{document_id}", headers=headers)
parsed_doc = resp_get.json()
print("Parsed Document Data:")
print(json.dumps(parsed_doc, indent=2))

The three main API endpoints used:

  1. POST /v1/ingest/uploads → Uploads a PDF with metadata, returns importId
  2. GET /v1/ingest/imports/{importId} → Retrieves documentIds from importId
  3. GET /documents/{documentId} → Retrieves fully parsed invoice data
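
The script waits with fixed `time.sleep(2)` and `time.sleep(20)` calls, which can be too short under load and too long otherwise. A polling loop is one alternative. The sketch below uses a hypothetical helper, `poll_until`, and treats a non-empty `documentIds` list as the readiness signal, mirroring the check already in the script; whether Invofox exposes a status field or webhooks better suited to this is not assumed here. The responses are simulated so the example runs without network access.

```python
import time


def poll_until(fetch, is_ready, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Call fetch() until is_ready(result) is true or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if time.monotonic() + interval > deadline:
            raise TimeoutError("Resource was not ready before the timeout.")
        sleep(interval)


def has_document_ids(import_info):
    """Readiness test mirroring the script: the first file lists a document ID."""
    files_info = import_info.get("files", [])
    return bool(files_info and files_info[0].get("documentIds"))


# Simulated import responses: empty at first, then populated.
fake_responses = iter([
    {"files": []},
    {"files": [{"documentIds": []}]},
    {"files": [{"documentIds": ["68662d83c3a0849a86a6aa34"]}]},
])
ready = poll_until(lambda: next(fake_responses), has_document_ids, interval=0.01)
print(ready["files"][0]["documentIds"][0])
```

In invofox-main.py, `fetch` could be `lambda: requests.get(f"{API_BASE}/v1/ingest/imports/{import_id}", headers=headers).json()`, replacing the first fixed sleep.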

Running python invofox-main.py returns the parsed document with all fields correctly extracted and validated.

Results & comparison

API call methods:

  • GPT-4o/Claude → Send text with prompt
  • Invofox → Use API or upload file (image/PDF) in bulk

Setup:

  • GPT/Claude → Requires prompt engineering code
  • Invofox → Minimal code, no prompt needed

Validation:

  • GPT/Claude → Manual verification required
  • Invofox → Built-in validation and confidence scores

Performance:

  • GPT/Claude → Limited by token/window size
  • Invofox → Handles multi-page docs via backend OCR and AI

Key findings:

  • ChatGPT (GPT-4o): Good at parsing known fields if prompted clearly. You get JSON but must parse/clean it. Errors occur if prompts are unclear.
  • Claude 3.5 Sonnet: Similar to GPT-4o. Handles invoice fields well and is sometimes better at recognizing unfamiliar terms. Still requires prompt tuning.
  • Invofox API: Returns fully parsed invoice JSON out-of-the-box. All fields correctly extracted and validated with exactly the needed schema and no extra coding.

Comparison table

| Feature | Invofox | GPT-4o | Claude 3.5 Sonnet |
|---|---|---|---|
| No-code setup | | ❌ (code/prompt required) | ❌ (code/prompt required) |
| Direct PDF/image upload | ✅ | ❌ (text-only) | ❌ (text-only) |
| Custom document type support | ✅ (custom IDs) | ✅ (via prompt) | ✅ (via prompt) |
| Fixed JSON schema | ✅ | ❌ (varies by prompt) | ❌ (varies by prompt) |
| Consistent field naming | | ❌ (“Line Items” variants) | |
| Built-in validation | ✅ | ❌ | ❌ |
| Confidence scores | ✅ | ❌ | ❌ |
| Auto-completion | ✅ | | |
| Human review optional | | | |
| Workflow orchestration | ✅ (API/hooks/dashboard) | | |
| Multi-language OCR | | ❌ (needs external OCR) | ❌ (needs external OCR) |
| Multi-page document support | ✅ | | |
| Processing speed | ✅ (fast, <5s) | ✅ (fastest) | ⚠️ (slower) |

Cost & execution time benchmarks

| Tool | Cost Structure | Typical Cost (per doc) | Typical Execution Time | Notes |
|---|---|---|---|---|
| Invofox | Per document, usage-based, no fixed fees | Custom / not public (free trial, then custom) | <5s per document | Built for production, price via sales |
| GPT-4o | $2.50/million input tokens, $10/million output tokens | $0.005–$0.015 (1,000–2,000 tokens) | 1–30s per request, can vary by load | Price = input tokens × input rate + output tokens × output rate |
| Claude 3.5 Sonnet | $3/million input tokens, $15/million output tokens | $0.006–$0.018 (1,000–2,000 tokens) | ~200–300ms best case, up to 10s+ | API is fast, but batch/huge prompts increase time |
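
The per-document figures follow from token counts and the per-million-token prices. As a rough worked example, the sketch below applies the listed prices to an assumed request of 1,200 input tokens and 400 output tokens; the token counts and the function name are illustrative, not measurements.

```python
PRICES_PER_MILLION = {
    # (input USD, output USD) per million tokens, as listed in the table above
    "gpt-4o": (2.50, 10.00),
    "claude-3.5-sonnet": (3.00, 15.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES_PER_MILLION:
    print(f"{model}: ${estimate_cost(model, 1200, 400):.4f}")
```

With these assumed token counts the estimates come to $0.0070 for GPT-4o and $0.0096 for Claude 3.5 Sonnet, both inside the ranges in the table.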

Also consider the ongoing cost of upgrading language models. When a new version is released, teams often need to benchmark it, retest prompts, adjust schemas, and modify parsing logic. These hidden maintenance costs are significant. With Invofox, no such requirement exists.

Bottom line

For quick experiments or one-off tasks, GPT-4o (via the OpenAI API) or Claude 3.5 Sonnet works reasonably well once you craft suitable prompts. Both produce structured JSON output competently.

However, for reliable production-grade parsing of invoices or receipts, the Invofox API is superior. It’s specifically built for documents using advanced proprietary models and continual feedback loops.

Complete code is available in the GitHub Repository.