Webhooks

Receive real-time notifications when async jobs complete. Webhooks provide a more efficient alternative to polling for job status.

Why Use Webhooks?
  • Real-time: Instant notifications when jobs complete
  • Efficient: No need to poll the API repeatedly
  • Scalable: Handle high volumes of async jobs easily
  • Reliable: Automatic retries with exponential backoff

How Webhooks Work

  1. Configure Webhook: Set your webhook URL and secret in your account settings
  2. Submit Job: Create async jobs as usual
  3. Receive Notification: Your endpoint receives a POST request when the job completes
  4. Verify Signature: Validate the request using HMAC-SHA256
  5. Process Result: Handle the job result in your application

Webhook Configuration

Setting Up Your Webhook

Configure your webhook URL and secret in your account dashboard:

  1. Go to Account Settings > Webhooks
  2. Enter your Webhook URL (HTTP or HTTPS)
  3. Generate or set a Webhook Secret (used for signature verification)
  4. Save your configuration

HTTPS Recommended

Use HTTPS URLs in production. Plain HTTP is accepted, but the payload and signature are then sent unencrypted.

Webhook Secret

The webhook secret is used to sign requests with HMAC-SHA256. This allows you to verify that requests actually come from Parselyze.

Best Practices:

  • Use a long, random string (32+ characters)
  • Store it securely (environment variables, secrets manager)
  • Never commit it to version control
  • Rotate periodically for security
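
One way to follow these practices is sketched below, using only the Python standard library. The helper names are hypothetical. The environment variable name PARSELYZE_WEBHOOK_SECRET is the one used in the code examples later on this page.

```python
import os
import secrets


def generate_webhook_secret(num_bytes: int = 32) -> str:
    """Return a random hex string; 32 bytes yields 64 hex characters."""
    return secrets.token_hex(num_bytes)


def load_webhook_secret(env_var: str = "PARSELYZE_WEBHOOK_SECRET") -> str:
    """Read the secret from the environment and fail loudly if it is missing."""
    secret = os.environ.get(env_var, "")
    if not secret:
        raise RuntimeError(f"{env_var} is not set")
    return secret


if __name__ == "__main__":
    # Print a fresh secret to paste into the dashboard and your secrets manager.
    print(generate_webhook_secret())
```

Failing at startup when the variable is missing avoids silently verifying signatures against an empty secret.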

Webhook Payload

When a job finishes, whether it succeeds or fails, Parselyze sends a POST request to your webhook URL.

Headers

POST /your/webhook/endpoint
Content-Type: application/json
X-Webhook-Signature: abc123def456...
User-Agent: Parselyze-Webhook/1.0

Important Headers:

  • Content-Type: Always application/json
  • X-Webhook-Signature: HMAC-SHA256 signature for verification
  • User-Agent: Parselyze-Webhook/1.0

Payload Structure

{
  "eventId": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "eventType": "document.completed",
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "result": {
    "invoice": {
      "number": "INV-2025-001",
      "date": "26/05/2025",
      "total": 1250.75,
      "vendor": "Acme Corporation"
    }
  },
  "pageCount": 5,
  "pageUsed": 5,
  "pageRemaining": 1495,
  "timestamp": "2026-01-27T10:30:45.123Z"
}

Payload Fields

Field          Type         Description
eventId        string       Unique webhook delivery attempt identifier (UUID); changes on each retry
eventType      string       Event type: document.completed or document.failed
jobId          string       Unique job identifier (UUID); use this for idempotency
status         string       Job status: completed or failed
result         object|null  Extracted data according to your template schema if successful, null if failed
error          string       Error message (only if status: "failed")
pageCount      number       Number of pages in the document (only if successful)
pageUsed       number       Number of pages used for processing this job (only if successful)
pageRemaining  number       Number of pages remaining in your quota (only if successful)
timestamp      string       ISO 8601 timestamp of when the job completed
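
The field list above can be turned into a small validation step before any business logic runs. The following is a minimal sketch, not an official schema: the WebhookEvent class and parse_event function are hypothetical names, and the checks cover only what the table states.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

REQUIRED_STRING_FIELDS = ("eventId", "eventType", "jobId", "status", "timestamp")
KNOWN_EVENT_TYPES = ("document.completed", "document.failed")


@dataclass(frozen=True)
class WebhookEvent:
    event_id: str
    event_type: str
    job_id: str
    status: str
    timestamp: str
    result: Optional[dict] = None          # null when the job failed
    error: Optional[str] = None            # present only when the job failed
    page_count: Optional[int] = None       # present only when the job succeeded
    page_used: Optional[int] = None
    page_remaining: Optional[int] = None


def parse_event(raw_body: bytes) -> WebhookEvent:
    """Parse and validate a webhook body; raise ValueError if it is malformed."""
    data: Any = json.loads(raw_body)
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    for name in REQUIRED_STRING_FIELDS:
        if not isinstance(data.get(name), str):
            raise ValueError(f"missing or non-string field: {name}")
    if data["eventType"] not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unknown eventType: {data['eventType']}")
    return WebhookEvent(
        event_id=data["eventId"],
        event_type=data["eventType"],
        job_id=data["jobId"],
        status=data["status"],
        timestamp=data["timestamp"],
        result=data.get("result"),
        error=data.get("error"),
        page_count=data.get("pageCount"),
        page_used=data.get("pageUsed"),
        page_remaining=data.get("pageRemaining"),
    )
```

Run this only after the signature check has passed, so that unauthenticated input never reaches the parser's callers.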

Example Payloads

Successful Job

{
  "eventId": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "eventType": "document.completed",
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "result": {
    "invoice": {
      "number": "INV-2025-001",
      "total": 1250.75
    }
  },
  "pageCount": 3,
  "pageUsed": 3,
  "pageRemaining": 1497,
  "timestamp": "2026-01-27T10:30:45.123Z"
}

Failed Job

{
  "eventId": "8d0f7780-8536-51f6-c938-668877662222",
  "eventType": "document.failed",
  "jobId": "660f9511-f3ac-52e5-b827-557766551111",
  "status": "failed",
  "result": null,
  "error": "Failed to extract text from document",
  "timestamp": "2026-01-27T10:32:15.456Z"
}

Signature Verification

All webhook requests include an X-Webhook-Signature header containing an HMAC-SHA256 signature. You MUST verify this signature to ensure the request is legitimate.

How to Verify

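  1. Get the raw request body exactly as received, before any JSON parsing
  2. Compute the HMAC-SHA256 of the body using your webhook secret and hex-encode the result
  3. Compare it with the X-Webhook-Signature header using a timing-safe comparison
  4. Only process the request if the signatures match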

Code Examples

Node.js (Express)

import express from 'express';
import { Parselyze } from 'parselyze';

const app = express();
app.use(express.json());

// Initialize SDK with webhook secret
const parselyze = new Parselyze(
  process.env.PARSELYZE_API_KEY,
  process.env.PARSELYZE_WEBHOOK_SECRET
);

app.post('/webhook', (req, res) => {
  const signature = req.headers['x-webhook-signature'];

  // Verify signature using parselyze SDK
  if (!parselyze.webhooks.verifySignature(req.body, signature)) {
    return res.status(401).send('Invalid signature');
  }

  // Process the event
  const { eventType, jobId, status, result, error } = req.body;

  if (eventType === 'document.completed') {
    console.log('Job completed:', jobId);
    console.log('Result:', result);
    // Store result in database, trigger next step, etc.
  } else if (eventType === 'document.failed') {
    console.error('Job failed:', jobId, error);
    // Handle error, notify user, retry, etc.
  }

  res.status(200).send('OK');
});

app.listen(3000, () => {
  console.log('Webhook server listening on port 3000');
});

Python (Flask)

import hashlib
import hmac
import os

from flask import Flask, request

app = Flask(__name__)
# Read the secret from the environment instead of hard-coding it
WEBHOOK_SECRET = os.environ["PARSELYZE_WEBHOOK_SECRET"]

@app.route('/webhook', methods=['POST'])
def handle_webhook():
    signature = request.headers.get('X-Webhook-Signature')

    if not signature:
        return 'Missing signature', 400

    # Compute the expected signature over the raw request body
    payload = request.get_data()
    expected_signature = hmac.new(
        key=WEBHOOK_SECRET.encode('utf-8'),
        msg=payload,
        digestmod=hashlib.sha256
    ).hexdigest()

    # Timing-safe comparison; bytes avoid a TypeError on non-ASCII header values
    if not hmac.compare_digest(signature.encode('utf-8'),
                               expected_signature.encode('utf-8')):
        return 'Invalid signature', 403

    # Process event
    event = request.get_json()
    print(f"Received event: {event.get('eventType')}")

    return 'OK', 200

Retry Policy

If your webhook endpoint does not respond or returns an error, Parselyze retries the delivery automatically:

  • Max Attempts: 3 (the initial delivery plus 2 retries)
  • Timeout: 10 seconds per attempt
  • Success Codes: 200-299 status codes
  • Retry Conditions: 5xx errors, 429 (Too Many Requests), network failures, timeouts
  • No Retry: 4xx errors (except 429) are not retried

Quick Recovery

The first retry happens after 30 seconds, the second after 1 minute. This gives your endpoint time to recover from transient issues while keeping the total retry window reasonable.
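
The retry rules above imply that the status code your endpoint returns decides whether a delivery is attempted again. The sketch below restates those rules as a function; delivery_outcome is a hypothetical helper for reasoning about your own responses, not part of any Parselyze SDK, and codes the policy does not mention (1xx, 3xx) are reported as unspecified.

```python
from typing import Optional


def delivery_outcome(status_code: Optional[int]) -> str:
    """Classify a delivery attempt according to the documented retry policy.

    Pass None to represent a network failure or a timeout (no response).
    """
    if status_code is None:
        return "retry"        # network failures and timeouts are retried
    if 200 <= status_code <= 299:
        return "success"      # any 2xx counts as delivered
    if status_code == 429 or 500 <= status_code <= 599:
        return "retry"        # 429 and 5xx are retried
    if 400 <= status_code <= 499:
        return "no_retry"     # other 4xx are not retried
    return "unspecified"      # 1xx/3xx are not covered by the policy


for code in (200, 403, 429, 500, None):
    print(code, "->", delivery_outcome(code))
```

Under these rules, a 400 or 403 returned for a bad signature is not retried, while a 500 from a crashing handler is, so reserve 4xx responses for requests you never want to see again.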

Respond Within 10 Seconds

Your webhook endpoint must respond within 10 seconds. For long processing tasks, acknowledge the webhook immediately (return 200) and process asynchronously.
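
A minimal sketch of the acknowledge-then-process pattern, written without a web framework so it stays self-contained. The names handle_delivery, process_event, and the in-memory queue are all hypothetical; a production setup would more likely use a durable queue, because an in-memory queue loses pending events if the process restarts. Signature verification is assumed to have happened before handle_delivery is called.

```python
import json
import queue
import threading

jobs: "queue.Queue[dict]" = queue.Queue()
processed: list = []  # stands in for a database in this sketch


def process_event(event: dict) -> None:
    """Placeholder for slow work (database writes, downstream calls, ...)."""
    processed.append(event["jobId"])


def worker() -> None:
    """Drain the queue in the background so the HTTP handler never blocks."""
    while True:
        event = jobs.get()
        try:
            process_event(event)
        except Exception as exc:  # keep the worker alive on bad events
            print(f"processing failed: {exc!r}")
        finally:
            jobs.task_done()


def handle_delivery(raw_body: bytes) -> tuple:
    """Enqueue the event and return (status, body) immediately."""
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return 400, "Invalid JSON"
    jobs.put(event)
    return 200, "OK"


threading.Thread(target=worker, daemon=True).start()
```

The handler does only parsing and an enqueue, so its response time stays far below the 10-second limit regardless of how long process_event takes.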

Best Practices

1. Verify Every Request

Always verify the X-Webhook-Signature to prevent spoofed requests. Use the SDK's webhooks.verifySignature() method for built-in timing-safe comparison.

2. Respond Quickly

Acknowledge webhooks within 10 seconds. For long processing tasks, return a 200 response immediately and handle processing in a background job.

3. Log Everything

Log every webhook event, including its eventId and jobId, so individual deliveries can be traced when debugging.

4. Monitor Failures

Set up alerts for webhook delivery failures. Check your logs regularly.

Security Considerations

  1. Use HTTPS: Strongly recommended for production to prevent man-in-the-middle attacks
  2. Verify Signatures: Never skip signature verification
  3. Keep Secret Safe: Store webhook secret securely
  4. Rate Limiting: Implement rate limiting on your webhook endpoint
  5. Input Validation: Validate the payload structure before processing
  6. IP Allowlisting: (Optional) Only accept requests from Parselyze IPs

Troubleshooting

Webhook Not Received

  1. Check webhook URL is correct and accessible
  2. Check your server logs for incoming requests
  3. Test with RequestBin or ngrok
  4. Review firewall/security group settings

Signature Verification Fails

  1. Ensure you're using the raw request body (not parsed JSON)
  2. Verify webhook secret matches exactly (no extra spaces)
  3. Check HMAC algorithm is SHA256
  4. Log both signatures to compare
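
The sketch below reproduces the two most common causes locally. It assumes the signature is the hex-encoded HMAC-SHA256 of the raw body, as in the Flask example on this page; the secret and payload are made-up test values, and sign and verify are hypothetical helper names.

```python
import hashlib
import hmac
import json


def sign(raw_body: bytes, secret: str) -> str:
    """Hex-encoded HMAC-SHA256 of the raw body."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def verify(raw_body: bytes, secret: str, signature: str) -> bool:
    """Timing-safe check of a received signature against the expected one."""
    expected = sign(raw_body, secret)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


secret = "test-secret"  # made-up value for local debugging only
raw_body = b'{"jobId":"550e8400-e29b-41d4-a716-446655440000","status":"completed"}'
signature = sign(raw_body, secret)

# Parsing and re-serializing changes whitespace, so the bytes no longer match.
reserialized = json.dumps(json.loads(raw_body)).encode("utf-8")

print("raw body:          ", verify(raw_body, secret, signature))
print("re-serialized body:", verify(reserialized, secret, signature))
print("secret with space: ", verify(raw_body, secret + " ", signature))
```

Only the first check passes. If your handler behaves like the second or third case, look for middleware that parses the body before you read it, or for stray whitespace in the stored secret.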

Timeouts

  1. Respond within 10 seconds
  2. Move heavy processing to background jobs immediately
  3. Check your server's response time and optimize slow endpoints
  4. Monitor webhook response times in your logs

Duplicate Webhooks

  1. Implement idempotency using jobId as a unique key
  2. Check database/cache before processing each webhook
  3. Duplicates are expected behavior: a retry can redeliver a job your endpoint has already handled
  4. With up to 3 delivery attempts, you may receive up to 3 webhooks for the same job, each with a different eventId
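
One way to implement the jobId check is a table with a unique key, so that recording and checking happen in a single atomic statement. This is a sketch using SQLite from the Python standard library; the table name, open_store, and claim_job are hypothetical, and any database with a unique constraint would work the same way.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the table of already-seen job IDs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_jobs (job_id TEXT PRIMARY KEY)"
    )
    return conn


def claim_job(conn: sqlite3.Connection, job_id: str) -> bool:
    """Return True the first time a jobId is seen, False for duplicates."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_jobs (job_id) VALUES (?)",
            (job_id,),
        )
    # rowcount is 1 when the row was inserted, 0 when it already existed
    return cursor.rowcount == 1


conn = open_store()
for job_id in ("job-a", "job-a", "job-b"):
    action = "process" if claim_job(conn, job_id) else "skip duplicate"
    print(job_id, "->", action)
```

Keying on jobId rather than eventId matters here, because eventId changes on every retry and would never match an earlier delivery.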