Asynchronous Processing

Process documents asynchronously for large files or batch operations. The async API allows you to submit jobs and receive results via webhooks or polling.

When to Use Async Processing
  • Large documents that take longer to process
  • Batch processing of multiple documents
  • Integration with background job systems
  • Webhook-based workflows
  • Retrieving results later, or more than once

Not sure which to choose? See the Processing Overview for a detailed comparison.

How It Works

  1. Submit Job: Upload document and create async job
  2. Job Processing: Document is queued and processed in background
  3. Get Results: Receive results via webhook or poll the job status
  4. Retrieve Data: Access the processed data and download results

Endpoints

Submit Async Job

Create a new async processing job.

POST /v1/documents/parse/async

Authentication

Include your API key in the request header:

x-api-key: your_api_key_here

Request Body

Send as multipart/form-data with these fields:

Field        Type            Description                            Required
file         File            Document to analyze (PDF, PNG, JPEG)   Yes
templateId   string (UUID)   ID of an existing template             Yes
language     string          Language hint for OCR                  No
Single File Only

Async processing currently supports one file per job. For multiple files, submit separate jobs.
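Because each job carries exactly one file, batch processing comes down to looping over files and submitting one job each. A minimal sketch, where `submit` stands in for whatever function performs the POST above (its name and return shape are assumptions for illustration):

```python
from typing import Callable, Dict, List

def submit_batch(paths: List[str], submit: Callable[[str], dict]) -> Dict[str, str]:
    """Submit one async job per file and map each path to its job ID.

    `submit` is expected to POST the file to /v1/documents/parse/async
    and return the parsed JSON response (which contains "jobId").
    """
    return {path: submit(path)["jobId"] for path in paths}
```

Keeping the returned path-to-jobId mapping makes it easy to poll or reconcile webhook events per file later.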

Idempotency

Parselyze automatically handles duplicate submissions to prevent redundant processing:

  • Same File Detection: If you submit the exact same file with the same template, the API returns the existing job instead of creating a new one
  • File Comparison: Files are compared using content fingerprinting, not file names
  • Automatic Deduplication: No configuration needed; idempotency is built in

Avoid Redundant Processing

If you accidentally submit the same document twice, you'll receive the original job ID and its current status. This prevents duplicate charges and unnecessary processing.
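The exact fingerprinting algorithm isn't documented; a content hash such as SHA-256 (an assumption here, purely for illustration) captures the key property: identical bytes produce the same fingerprint no matter what the file is called.

```python
import hashlib

def content_fingerprint(data: bytes) -> str:
    # Hash the raw bytes only -- the file name plays no part,
    # mirroring "content fingerprinting, not file names".
    return hashlib.sha256(data).hexdigest()
```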

Example Request

curl -X POST https://api.parselyze.com/v1/documents/parse/async \
  -H "x-api-key: plz_xxxxxxxx...xxxxxx" \
  -F "file=@large_document.pdf" \
  -F "templateId=YOUR_TEMPLATE_ID"

Response

New Job:

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "message": "Job queued for processing",
  "createdAt": "2026-01-27T10:30:00Z"
}

Existing Job (Duplicate File):

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "message": "This document has already been submitted. Returning existing job.",
  "createdAt": "2026-01-27T10:28:15Z"
}

Idempotency

When you submit the exact same file with the same template, you always get back the original job. Use GET /v1/jobs/:jobId to get the full job details, including the result.

Status Values:

  • pending: Job is queued and waiting to be processed
  • processing: Job is currently being processed
  • completed: Job finished successfully
  • failed: Job failed after all retries
  • retrying: Job failed and will be retried
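In client code it helps to separate terminal states from in-flight ones: only completed and failed end the lifecycle, while pending, processing, and retrying all mean "check again later". A small helper along those lines:

```python
TERMINAL_STATUSES = {"completed", "failed"}
IN_FLIGHT_STATUSES = {"pending", "processing", "retrying"}

def is_terminal(status: str) -> bool:
    """True once the job will no longer change state on its own."""
    return status in TERMINAL_STATUSES
```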

Get Job Status

Retrieve the current status and result of a job.

GET /v1/jobs/:jobId

Example Request

curl -X GET https://api.parselyze.com/v1/jobs/550e8400-e29b-41d4-a716-446655440000 \
  -H "x-api-key: plz_xxxxxxxx...xxxxxx"

Response - Processing

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "fileName": "large_document.pdf",
  "templateId": "YOUR_TEMPLATE_ID",
  "result": null,
  "error": null,
  "pageCount": null,
  "attempts": 1,
  "createdAt": "2026-01-27T10:30:00Z",
  "startedAt": "2026-01-27T10:30:05Z",
  "completedAt": null
}

Response - Completed

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "fileName": "large_document.pdf",
  "templateId": "YOUR_TEMPLATE_ID",
  "result": {
    "invoice": {
      "number": "INV-2025-001",
      "date": "26/05/2025",
      "total": 1250.75,
      "vendor": "Acme Corporation"
    }
  },
  "error": null,
  "pageCount": 5,
  "attempts": 1,
  "createdAt": "2026-01-27T10:30:00Z",
  "startedAt": "2026-01-27T10:30:05Z",
  "completedAt": "2026-01-27T10:30:45Z"
}

Response - Failed

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "fileName": "large_document.pdf",
  "templateId": "YOUR_TEMPLATE_ID",
  "result": null,
  "error": "Failed to extract text from document",
  "pageCount": null,
  "attempts": 3,
  "createdAt": "2026-01-27T10:30:00Z",
  "startedAt": "2026-01-27T10:30:05Z",
  "completedAt": "2026-01-27T10:32:15Z"
}

JavaScript/Node.js Examples

Submit and Poll for Results

import { Parselyze } from "parselyze";

const parselyze = new Parselyze(process.env.PARSELYZE_API_KEY);

async function processDocumentAsync() {
  // Submit job
  const job = await parselyze.documents.parseAsync({
    file: "./large_document.pdf",
    templateId: "YOUR_TEMPLATE_ID"
  });

  console.log(`Job submitted: ${job.jobId}`);

  // Poll for completion
  let result;
  while (true) {
    result = await parselyze.jobs.get(job.jobId);

    if (result.status === "completed") {
      console.log("Job completed:", result.result);
      break;
    } else if (result.status === "failed") {
      console.error("Job failed:", result.error);
      break;
    }

    console.log(`Status: ${result.status}, waiting...`);
    await new Promise(resolve => setTimeout(resolve, 5000)); // Wait 5s
  }
}

processDocumentAsync();

Python Examples

Submit and Poll

import requests
import time
import os

API_KEY = os.getenv('PARSELYZE_API_KEY')
BASE_URL = 'https://api.parselyze.com'

def process_async():
    # Submit job
    with open('./large_document.pdf', 'rb') as file:
        files = {'file': file}
        data = {'templateId': 'YOUR_TEMPLATE_ID'}

        response = requests.post(
            f"{BASE_URL}/v1/documents/parse/async",
            headers={"x-api-key": API_KEY},
            files=files,
            data=data
        )

    job = response.json()
    job_id = job['jobId']
    print(f"Job submitted: {job_id}")

    # Poll for completion
    while True:
        response = requests.get(
            f"{BASE_URL}/v1/jobs/{job_id}",
            headers={"x-api-key": API_KEY}
        )

        result = response.json()
        status = result['status']

        if status == 'completed':
            print("Job completed:", result['result'])
            break
        elif status == 'failed':
            print("Job failed:", result['error'])
            break

        print(f"Status: {status}, waiting...")
        time.sleep(5)

process_async()

Retry Policy

Jobs automatically retry on failure with exponential backoff:

  • Max Attempts: 3 (initial attempt + 2 retries)
  • Backoff Strategy: Exponential starting at 5 seconds
    • 1st retry: after 5 seconds
    • 2nd retry: after 25 seconds (5s × 5)
  • Retry Status: Job status changes to retrying during retry attempts
  • Attempts Counter: The attempts field in the job response shows how many attempts have been made
  • Final Status: failed after all retries are exhausted

Smart Retry Logic

The system only retries on transient errors (network issues, temporary unavailability). Permanent errors (invalid file format, missing template) fail immediately without retrying.
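The delays above follow a base of 5 seconds multiplied by 5 per retry. A one-line sketch of that schedule (the formula is inferred from the two documented delays, not an official specification):

```python
def retry_delay(retry_number: int, base: float = 5.0, factor: float = 5.0) -> float:
    # retry 1 -> 5 s, retry 2 -> 25 s (5 s x 5)
    return base * factor ** (retry_number - 1)
```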

Best Practices

  1. Use Webhooks: Configure webhooks to receive automatic notifications when jobs complete - much more efficient than polling (see Webhook Guide)
  2. Leverage Idempotency: Don't worry about duplicate submissions - the API automatically returns existing jobs for identical files
  3. Handle Retries: Jobs may take longer than expected during retries; implement appropriate timeout logic
  4. Store Job IDs: Save job IDs in your database for audit trails and later retrieval
  5. Polling Interval: If not using webhooks, don't poll too frequently (recommended: every 5-10 seconds)
  6. Handle Failures: Implement error handling for failed jobs and check the error field for details
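Practices 3, 5, and 6 can be combined into a single polling helper. A sketch with the HTTP call injected as `get_job` so the loop itself stays testable; the names, defaults, and timeout value are illustrative, not part of the API:

```python
import time
from typing import Callable

def poll_until_done(get_job: Callable[[str], dict], job_id: str,
                    interval: float = 5.0, timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll GET /v1/jobs/:jobId (via get_job) until a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        job = get_job(job_id)
        if job["status"] in ("completed", "failed"):
            return job  # caller checks job["error"] on failure
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The generous default timeout leaves headroom for the retry schedule above, and injecting `sleep` lets unit tests run without real waiting.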

Error Responses

Code   Error               Description
400    Bad Request         Invalid file format or missing required fields
401    Unauthorized        Invalid or missing API key
404    Not Found           Job or template not found
413    Payload Too Large   File exceeds 50MB limit
429    Too Many Requests   Rate limit exceeded
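Of these, only 429 is transient; the others indicate a problem with the request itself, so resubmitting it unchanged won't help. A sketch of that distinction (the classification is our reading of the table above, not metadata returned by the API):

```python
RETRYABLE_CODES = {429}                  # back off, then resubmit later
PERMANENT_CODES = {400, 401, 404, 413}   # fix the request before resubmitting

def should_resubmit(status_code: int) -> bool:
    """True only for errors that can clear up on their own."""
    return status_code in RETRYABLE_CODES
```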

Rate Limits

Async job submission is subject to the same rate limits as synchronous processing:

  • Free Plan: 10 requests/minute
  • Starter Plan: 30 requests/minute
  • Pro Plan: 60 requests/minute
  • Business Plan: 60 requests/minute
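To stay under a plan's limit when submitting many jobs, space requests at least 60/limit seconds apart. A tiny helper using the per-plan figures from the list above (the lowercase plan keys are our own naming):

```python
PLAN_LIMITS = {"free": 10, "starter": 30, "pro": 60, "business": 60}  # requests/minute

def min_interval_seconds(plan: str) -> float:
    """Smallest safe gap between submissions for a given plan."""
    return 60.0 / PLAN_LIMITS[plan]
```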

See Also