Document Processing

Process documents using Optical Character Recognition (OCR) to extract structured data according to predefined templates.

Template Management

Templates must be created through the web interface. You cannot create or modify templates via the API.

Premium Feature: Multi-document Processing

Sending multiple files in a single request, or sending a ZIP archive, requires a premium subscription.

Endpoint

POST /documents/parse

Authentication

Include your API key in the request header:

x-api-key: your_api_key_here
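
As a minimal sketch of the header above, the key can be read from the environment instead of being hard-coded. The helper name `auth_headers` is hypothetical, and the `PARSELYZE_API_KEY` variable name is a convention rather than something the API requires.

```python
import os


def auth_headers(env_var="PARSELYZE_API_KEY"):
    """Build the x-api-key header from an environment variable.

    The helper and its default variable name are illustrative assumptions.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early rather than sending a request that would return 401
        raise RuntimeError(f"{env_var} is not set")
    return {"x-api-key": key}
```

The returned dictionary can be passed as the `headers` argument of an HTTP client such as `requests`.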

Request Body

Send as multipart/form-data with these fields:

Field	Type	Description	Required
files	File(s)	Document(s) to analyze (PDF, PNG, JPEG, ZIP)	Yes
templateId	string (UUID)	ID of existing template	Yes
language	string	Language hint for OCR	No
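
The field rules in the table can be checked on the client before a request is sent. The sketch below is one way to do that. The function name `validate_request` and the extension set are assumptions derived from the table, not part of the API.

```python
import uuid
from pathlib import Path

# Extensions inferred from the file types listed in the table (assumption)
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".zip"}


def validate_request(file_paths, template_id, language=None):
    """Return a list of problems; an empty list means the fields look well-formed."""
    problems = []
    if not file_paths:
        problems.append("at least one file is required")
    for path in file_paths:
        if Path(path).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"unsupported file type: {path}")
    try:
        # templateId is documented as a UUID string
        uuid.UUID(template_id)
    except (ValueError, TypeError, AttributeError):
        problems.append("templateId is not a valid UUID")
    if language is not None and not language.strip():
        problems.append("language, when given, must be non-empty")
    return problems
```

A check like this only catches malformed input. The server remains the authority on whether a template exists and whether a file is accepted.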

Language (Optional)

You can specify a language hint to improve OCR accuracy for a specific language. A hint is generally not needed, but you can force a language by setting the language parameter.

Supported languages include: en (English), fr (French), es (Spanish), de (German), it (Italian), pt (Portuguese), ru (Russian), zh (Chinese), ja (Japanese), ko (Korean), ar (Arabic), and many more.

Examples:

"en" - English documents
"fr" - French documents
"zh" - Chinese documents

Processing Types

The endpoint automatically detects and handles:

Single Document: One PDF or image file
Multiple Documents: Multiple files (premium only)
ZIP Archive: ZIP containing multiple documents (premium only)
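
The three cases above can be approximated on the client to find out in advance whether a request needs a premium subscription. This is a sketch under the assumption that a ZIP archive is recognized by its file extension. The server's actual detection logic is not documented here, and both function names are hypothetical.

```python
from pathlib import Path


def processing_type(file_paths):
    """Classify a request as 'single', 'multiple', or 'zip' (client-side guess)."""
    if not file_paths:
        raise ValueError("at least one file is required")
    # Assumption: any .zip file makes the request a ZIP-archive request
    if any(Path(p).suffix.lower() == ".zip" for p in file_paths):
        return "zip"
    if len(file_paths) > 1:
        return "multiple"
    return "single"


def requires_premium(file_paths):
    """Multiple files and ZIP archives are listed above as premium only."""
    return processing_type(file_paths) != "single"
```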

Example Requests

Single Document

curl -X POST https://api.parselyze.com/documents/parse \
  -H "x-api-key: your_api_key_here" \
  -F "files=@invoice.pdf" \
  -F "templateId=<YOUR_TEMPLATE_ID>"

Single Document with Language Hint

curl -X POST https://api.parselyze.com/documents/parse \
  -H "x-api-key: your_api_key_here" \
  -F "files=@facture_francaise.pdf" \
  -F "templateId=<YOUR_TEMPLATE_ID>" \
  -F "language=fr"

Multiple Documents with Language Hint (Premium)

curl -X POST https://api.parselyze.com/documents/parse \
  -H "x-api-key: your_api_key_here" \
  -F "files=@document1.pdf" \
  -F "files=@document2.jpg" \
  -F "templateId=<YOUR_TEMPLATE_ID>" \
  -F "language=en"

JavaScript/Node.js Examples

import { Parselyze } from "parselyze";

const parselyze = new Parselyze("plz_xxxxxxxx...xxxxxx");

(async function () {
  console.log("Start parsing document...");

  const result = await parselyze.documents.parse({
    files: ["./invoice.pdf"],
    templateId: "<YOUR_TEMPLATE_ID>",
  });

  console.log("Parsing complete:", result);
})();

Python Examples

import os

import requests


def analyze_multiple_documents():
    template_id = '<YOUR_TEMPLATE_ID>'

    data = {
        'templateId': template_id,
        'language': 'en'  # Optional
    }

    # Open both files in a with-block so they are closed even if the request fails
    with open('document1.pdf', 'rb') as pdf_file, open('document2.jpg', 'rb') as jpg_file:
        # Repeat the 'files' field name once per document
        files = [
            ('files', ('document1.pdf', pdf_file, 'application/pdf')),
            ('files', ('document2.jpg', jpg_file, 'image/jpeg'))
        ]

        response = requests.post(
            "https://api.parselyze.com/documents/parse",
            headers={"x-api-key": os.getenv('PARSELYZE_API_KEY')},
            files=files,
            data=data
        )

    # Raise an exception on 4xx/5xx responses instead of parsing an error body
    response.raise_for_status()

    result = response.json()
    print("Multi-document analysis:", result)

Response Format

Single Document

{
  "result": {
    "invoice": {
      "number": "INV-2025-001",
      "date": "26/05/2025",
      "total": 1250.75,
      "vendor": "Acme Corporation"
    }
  },
  "pageCount": 1,
  "pageUsed": 1,
  "pageRemaining": 999
}

Multiple Documents

{
  "results": [
    {
      "filename": "document1.pdf",
      "result": {
        "invoice": {
          "number": "INV-2025-001",
          "total": 1250.75
        }
      }
    },
    {
      "filename": "document2.jpg", 
      "result": {
        "invoice": {
          "number": "INV-2025-002", 
          "total": 850.50
        }
      }
    }
  ],
  "totalPageCount": 3,
  "pageUsed": 3,
  "pageRemaining": 997
}
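
Because the single-document and multi-document responses have different shapes, client code may find it convenient to normalize both into one list. The sketch below relies only on the keys shown in the two examples above. The function names are hypothetical.

```python
def normalize_results(payload):
    """Return a list of {'filename', 'result'} dicts for either response shape."""
    if "results" in payload:
        # Multi-document shape: one entry per file
        return [
            {"filename": item.get("filename"), "result": item["result"]}
            for item in payload["results"]
        ]
    # Single-document shape: no filename is present in the example response
    return [{"filename": None, "result": payload["result"]}]


def page_count(payload):
    """Read 'totalPageCount' (multi) or 'pageCount' (single), whichever exists."""
    return payload.get("totalPageCount", payload.get("pageCount"))
```

Callers can then loop over the list without branching on how many files were sent.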

Template Requirements

Templates must be created through the web interface before use. You cannot create templates via API.

Supported File Types

PDF: Multi-page documents supported
Images: PNG, JPEG formats
ZIP: Archive containing multiple documents (premium only)

Page Quota

Each request consumes your monthly page quota as follows:

PDF files: 1 page per physical page in the document
Image files: 1 page per image
ZIP files: Sum of all pages in contained documents
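
The quota rules above amount to simple arithmetic, sketched below. The caller supplies the page count of each PDF, since counting PDF pages needs a separate library, and lists the contents of a ZIP archive as individual documents. The function names and the input format are assumptions made for illustration.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def pages_for(filename, pdf_pages=None):
    """Pages consumed by one document under the rules listed above."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return 1  # one page per image
    if suffix == ".pdf":
        if pdf_pages is None or pdf_pages < 1:
            raise ValueError(f"page count required for {filename}")
        return pdf_pages  # one page per physical page
    raise ValueError(f"unsupported file type: {filename}")


def estimate_quota(documents):
    """Sum pages over (filename, pdf_pages) pairs, e.g. the contents of a ZIP."""
    return sum(pages_for(name, pages) for name, pages in documents)
```

For example, a hypothetical two-page PDF sent together with one JPEG would be estimated at three pages.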

Error Responses

Code	Error	Description
400	Bad Request	Missing required fields or invalid file format
401	Unauthorized	Invalid or missing API key
403	Forbidden	Feature requires a premium subscription or is restricted in testing mode
404	Not Found	Template not found or not accessible
413	Payload Too Large	File size exceeds limit (50 MB)
429	Too Many Requests	Rate limit exceeded
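
A client can turn the status codes in the table into readable messages and decide which failures are worth retrying. The sketch below is an assumption about reasonable client behavior, not documented API behavior. In particular, treating only 429 as retryable is a design choice made here, and the function names are hypothetical.

```python
# Descriptions paraphrase the error table above
ERROR_DESCRIPTIONS = {
    400: "Missing required fields or invalid file format",
    401: "Invalid or missing API key",
    403: "Premium subscription required or testing mode restriction",
    404: "Template not found or not accessible",
    413: "File size exceeds the 50 MB limit",
    429: "Rate limit exceeded",
}


def describe_error(status_code):
    """Map a status code from the table to a short message."""
    return ERROR_DESCRIPTIONS.get(status_code, f"Unexpected status code {status_code}")


def is_retryable(status_code):
    # Assumption: only rate-limit errors can succeed on a later attempt;
    # the other codes need the request itself to change.
    return status_code == 429
```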
