ParserData API Examples

Production-ready examples for integrating the ParserData API to extract structured financial data from invoices, receipts, bank statements, and other financial documents.

ParserData is an Intelligent Document Processing (IDP) engine built specifically for financial workflows. It transforms PDFs and images into clean, validated, structured JSON ready for automation, analytics, and accounting systems.

🌐 Website: https://parserdata.com

📘 API Docs: https://parserdata.com/parserdata-api

What is ParserData?

ParserData extracts structured data from financial documents such as:

  • Invoices (line items, totals, taxes, dates)
  • Receipts
  • Bank statements
  • Financial reports
  • Custom financial document types

It returns normalized JSON aligned to your defined schema, not freeform OCR text.

How It Works

  • Upload a document (PDF, PNG, JPG, WebP).

  • Provide a structured prompt or schema.

  • Receive clean JSON with:

    • Typed fields
    • Consistent structure
    • Line-item arrays
    • Numeric values as numbers (not strings)

ParserData treats documents as structured data sources, not just text blobs.


Quick Start

Get an API key

Create an account at https://parserdata.com

Generate your API key from the dashboard.

Basic cURL example

curl -X POST "https://api.parserdata.com/v1/extract" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F 'prompt=Extract invoice number, invoice date, supplier name, total amount, and line items (description, quantity, unit price, net amount).' \
  -F 'options={"return_schema":false,"return_selected_fields":false}' \
  -F 'file=@invoice.pdf'

Example response

{
  "result": {
    "invoice_number": "INV-2024-001",
    "invoice_date": "2024-01-15",
    "total_amount": 1200.50,
    "line_items": [
      {
        "description": "Consulting services",
        "quantity": 2,
        "unit_price": 500.25,
        "line_total": 1000.50
      }
    ]
  }
}
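As a minimal sketch of consuming this response, the snippet below parses the example payload and checks that each line total equals quantity times unit price. It assumes the response has the shape shown above (a top-level `result` object containing `line_items`); the helper name `check_invoice` is hypothetical and not part of the ParserData API.

```python
import json
from decimal import Decimal

# Sample payload copied from the example response above.
RESPONSE_TEXT = """
{
  "result": {
    "invoice_number": "INV-2024-001",
    "invoice_date": "2024-01-15",
    "total_amount": 1200.50,
    "line_items": [
      {"description": "Consulting services", "quantity": 2,
       "unit_price": 500.25, "line_total": 1000.50}
    ]
  }
}
"""


def check_invoice(result):
    """Return a list of problems found in an extracted invoice (hypothetical helper)."""
    problems = []
    if not isinstance(result.get("total_amount"), (int, float, Decimal)):
        problems.append("total_amount is not numeric")
    for i, item in enumerate(result.get("line_items", [])):
        # Compare via Decimal so monetary arithmetic is exact.
        expected = Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"]))
        actual = Decimal(str(item["line_total"]))
        if expected != actual:
            problems.append(f"line {i}: {expected} != {actual}")
    return problems


# parse_float=Decimal keeps amounts exact instead of converting to binary floats.
payload = json.loads(RESPONSE_TEXT, parse_float=Decimal)
print(check_invoice(payload["result"]))
```

An empty list means the sample passed both checks. Field names other than those in the example response would need their own rules.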

Full structured invoice example

Here’s a more detailed structured prompt example:

Extract the following fields from the invoice:

Invoice-level:
- invoice_number
- invoice_date (YYYY-MM-DD)
- payment_due_date
- supplier_name
- customer_name
- net_amount
- tax_amount
- total_amount

Line items:
- sku
- description
- quantity
- unit_price
- line_total

Return JSON in a structured format with numeric values as numbers.
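If the field list lives in code, the prompt above can be generated rather than hand-written. The sketch below is one possible way to do that; `build_prompt` and the two field lists are illustrative names, not part of the ParserData API, and the output is simply the prompt text shown above.

```python
# Field names copied from the structured prompt above.
INVOICE_FIELDS = [
    "invoice_number",
    "invoice_date (YYYY-MM-DD)",
    "payment_due_date",
    "supplier_name",
    "customer_name",
    "net_amount",
    "tax_amount",
    "total_amount",
]
LINE_ITEM_FIELDS = ["sku", "description", "quantity", "unit_price", "line_total"]


def build_prompt(invoice_fields, line_item_fields):
    """Assemble the structured prompt text from two field lists (hypothetical helper)."""
    lines = ["Extract the following fields from the invoice:", "", "Invoice-level:"]
    lines += [f"- {name}" for name in invoice_fields]
    lines += ["", "Line items:"]
    lines += [f"- {name}" for name in line_item_fields]
    lines += ["", "Return JSON in a structured format with numeric values as numbers."]
    return "\n".join(lines)


PROMPT = build_prompt(INVOICE_FIELDS, LINE_ITEM_FIELDS)
print(PROMPT)
```

The resulting string can be passed as the `prompt` form field in any of the request examples.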

Python Example

import json
import mimetypes
import os

import requests

# Fail fast with a KeyError if the key is not set, rather than sending a request without it.
API_KEY = os.environ["PARSERDATA_API_KEY"]
URL = "https://api.parserdata.com/v1/extract"
HEADERS = {"X-API-Key": API_KEY}

PROMPT = (
    "Extract invoice number, invoice date, supplier name, total amount, and line items "
    "(description, quantity, unit price, net amount)."
)

FILE_PATH = "invoice.pdf"  # or invoice.png / invoice.jpg
# Guess the MIME type from the file extension; fall back to a generic binary type.
mime = mimetypes.guess_type(FILE_PATH)[0] or "application/octet-stream"

with open(FILE_PATH, "rb") as f:
    files = {"file": (os.path.basename(FILE_PATH), f, mime)}
    data = {
        "prompt": PROMPT,
        # "options" is sent as a JSON-encoded form field, as in the cURL example.
        "options": json.dumps({"return_schema": False, "return_selected_fields": False}),
    }
    r = requests.post(URL, headers=HEADERS, files=files, data=data, timeout=300)

# Print the JSON body first so error details are visible, then raise on 4xx/5xx.
print(json.dumps(r.json(), indent=2, ensure_ascii=False))
r.raise_for_status()

Node.js Example

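// Run as an ES module (top-level await); requires the node-fetch and form-data packages.
import fs from "fs";
import fetch from "node-fetch";
import FormData from "form-data";

const apiKey = process.env.PARSERDATA_API_KEY || "YOUR_API_KEY";
const url = "https://api.parserdata.com/v1/extract";

// Build the multipart body: prompt, JSON-encoded options, and the document itself.
const form = new FormData();
form.append("prompt", "Extract invoice number, invoice date, supplier name, total amount, and line items (description, quantity, unit price, net amount).");
form.append("options", JSON.stringify({ return_schema: false, return_selected_fields: false }));
form.append("file", fs.createReadStream("./invoice.pdf"));

const res = await fetch(url, {
  method: "POST",
  // form.getHeaders() supplies the multipart Content-Type with its boundary.
  headers: { "X-API-Key": apiKey, ...form.getHeaders() },
  body: form,
});

// Flag non-2xx statuses, then print the JSON body in either case.
if (!res.ok) console.error(`Request failed: ${res.status} ${res.statusText}`);
console.log(await res.json());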

Supported file types

  • PDF
  • PNG
  • JPG / JPEG
  • WebP
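A small pre-upload check can reject unsupported files before a request is made. This is a sketch that assumes the file extension reflects the actual format; `is_supported` is a hypothetical helper name.

```python
from pathlib import Path

# Extensions for the file types listed above.
SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg", ".webp"}


def is_supported(path):
    """Return True if the file extension matches a listed type (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


for name in ("invoice.PDF", "receipt.webp", "scan.tiff"):
    print(name, is_supported(name))
```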

Typical use cases

  • Accounts payable automation
  • Invoice reconciliation
  • ERP ingestion pipelines
  • Accounting system integrations
  • Business intelligence pipelines
  • RAG pipelines for financial documents
  • AI agent tool usage

Integration patterns

ParserData can be used in:

  • n8n workflows
  • Zapier automations
  • Power Automate flows
  • Custom backend services
  • AI agent orchestration frameworks
  • ETL pipelines
  • Serverless functions

Error handling

The API returns standard HTTP status codes.

Common cases:

Symptom | Plain-English explanation | Fix
--- | --- | ---
400 Bad Request | The request is invalid. ParserData requires a document (a file upload, a file_url, or file.content) and either a prompt or a schema. | Ensure you are sending a document (file, file_url, or file.content) and specifying what to extract using either prompt or schema.
401 Unauthorized | Your X-API-Key was rejected. | Re-copy your API key. Make sure there are no extra spaces and that you are sending it in the X-API-Key header.
402 Payment Required | Your ParserData credits are exhausted (e.g., current=0, required=1). | Upgrade your plan or top up credits at https://parserdata.com. Remember: 1 credit = 1 page.
404 Not Found | The API endpoint does not exist. | Check that you are calling https://api.parserdata.com/v1/extract
429 Too Many Requests | You are hitting the rate limit. | Wait briefly and retry. Reduce concurrency or batch size if processing many documents.
500 Internal Server Error | ParserData encountered an internal error. | Retry the request shortly. If the issue persists, contact support at support@parserdata.com.
503 Service Unavailable | The service is temporarily unavailable. | Wait and try again later.

Always log the full JSON error response for debugging.
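The retry advice for 429, 500, and 503 can be sketched as a small wrapper with exponential backoff and jitter. This is an illustrative helper, not part of the ParserData API: `call_with_retry`, its parameters, and the delay values are assumptions to tune for your workload. It accepts any zero-argument callable that returns an object with a `status_code` attribute, such as a `requests` response.

```python
import random
import time

# Statuses for which the table above suggests waiting and retrying.
RETRYABLE = {429, 500, 503}


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last response either way, so the caller can log its body.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in RETRYABLE:
            return response
        if attempt < max_attempts - 1:
            # Delay doubles each attempt (1x, 2x, 4x base_delay) plus random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    return response


# Usage sketch (assumes the requests library and a hypothetical upload() function
# that opens the file and posts it, so each retry sends a fresh file stream):
# r = call_with_retry(upload)
```

Statuses such as 400, 401, 402, and 404 are returned immediately, since repeating the same request would not change the outcome.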


Best practices

  • Always validate numeric fields in your application layer.
  • Use strict prompts that name every field and its expected format to control the output schema.
  • Normalize dates to ISO format (YYYY-MM-DD).
  • Store raw JSON alongside structured database records for traceability.
  • Implement retries with exponential backoff for production workloads.
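Date normalization from the list above can be sketched as follows. The candidate input formats are assumptions (slash-separated dates are treated as day-first), and `to_iso_date` is a hypothetical helper; adjust the format list to match the documents you process.

```python
from datetime import datetime

# Assumed input formats, tried in order. Slash dates are read as day/month/year.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def to_iso_date(value):
    """Return the date as YYYY-MM-DD, or None if no candidate format matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


for raw in ("2024-01-15", "15.01.2024", "15/01/2024", "not a date"):
    print(raw, "->", to_iso_date(raw))
```

Returning None instead of raising lets the caller decide whether an unparseable date should block the record or only be flagged for review.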

License

MIT