Document Processing
Process documents using Optical Character Recognition (OCR) to extract structured data according to predefined templates.
Templates must be created through the web interface. You cannot create or modify templates via the API.
Sending multiple files in a single request, or sending a ZIP archive, requires a premium subscription.
Endpoint
POST /documents/parse
Authentication
Include your API key in the request header:
x-api-key: your_api_key_here
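As a minimal sketch, the header can be built in Python from an environment variable so the key never appears in source code. The variable name PARSELYZE_API_KEY matches the Python example further down this page; the helper name build_auth_headers is hypothetical and not part of any Parselyze SDK.

```python
import os


def build_auth_headers(env_var="PARSELYZE_API_KEY"):
    """Return the request headers carrying the API key.

    The helper name and the default env var are illustrative assumptions.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early instead of sending a request that would return 401
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.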
Request Body
Send as multipart/form-data with these fields:
Field | Type | Description | Required |
---|---|---|---|
files | File(s) | Document(s) to analyze (PDF, PNG, JPEG, ZIP) | Yes |
templateId | string (UUID) | ID of existing template | Yes |
language | string | Language hint for OCR (optional) | No |
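The non-file fields in the table above could be assembled as in the following sketch. The helper build_form_fields is hypothetical, and the client-side UUID check is an added convenience, not something the API requires of callers.

```python
import uuid


def build_form_fields(template_id, language=None):
    """Return the non-file form fields for POST /documents/parse.

    Hypothetical helper: validating the UUID locally lets an obvious
    typo fail before any file is uploaded.
    """
    # uuid.UUID raises ValueError if the string is not a valid UUID
    uuid.UUID(template_id)
    fields = {"templateId": template_id}
    if language is not None:
        fields["language"] = language  # optional OCR language hint
    return fields
```

Leaving language out entirely, rather than sending an empty value, keeps the request to the required fields only.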
Language (Optional)
You can optionally specify a language hint to improve OCR accuracy for a specific language. A hint is generally not needed, but you can force a language by providing the language parameter.
Supported languages include: en (English), fr (French), es (Spanish), de (German), it (Italian), pt (Portuguese), ru (Russian), zh (Chinese), ja (Japanese), ko (Korean), ar (Arabic), and many more.
Examples:
- "en": English documents
- "fr": French documents
- "zh": Chinese documents
Processing Types
The endpoint automatically detects and handles:
- Single Document: One PDF or image file
- Multiple Documents: Multiple files (premium only)
- ZIP Archive: ZIP containing multiple documents (premium only)
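The premium rule implied by the list above can be checked locally before uploading. This is only a sketch: the helper requires_premium is hypothetical, it assumes ZIP archives are recognizable by a .zip extension, and the server remains the authority on what a subscription allows.

```python
from pathlib import Path


def requires_premium(filenames):
    """Return True if the upload is a multi-file or ZIP request.

    Mirrors the documented rule: several files, or any ZIP archive,
    need a premium subscription.
    """
    # Assumption: a ZIP archive is identified by its file extension
    has_zip = any(Path(name).suffix.lower() == ".zip" for name in filenames)
    return len(filenames) > 1 or has_zip
```

A check like this lets a client warn the user up front instead of waiting for a 403 response.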
Example Requests
Single Document
curl -X POST https://api.parselyze.com/documents/parse \
  -H "x-api-key: your_api_key_here" \
  -F "files=@invoice.pdf" \
  -F "templateId=<YOUR_TEMPLATE_ID>"
Single Document with Language Hint
curl -X POST https://api.parselyze.com/documents/parse \
  -H "x-api-key: your_api_key_here" \
  -F "files=@facture_francaise.pdf" \
  -F "templateId=<YOUR_TEMPLATE_ID>" \
  -F "language=fr"
Multiple Documents with Language Hint (Premium)
curl -X POST https://api.parselyze.com/documents/parse \
  -H "x-api-key: your_api_key_here" \
  -F "files=@document1.pdf" \
  -F "files=@document2.jpg" \
  -F "templateId=<YOUR_TEMPLATE_ID>" \
  -F "language=en"
JavaScript/Node.js Examples
import { Parselyze } from "parselyze";

const parselyze = new Parselyze("plz_xxxxxxxx...xxxxxx");

(async function () {
  console.log("Start parsing document...");
  const result = await parselyze.documents.parse({
    files: ["./invoice.pdf"],
    templateId: "<YOUR_TEMPLATE_ID>",
  });
  console.log("Parsing complete:", result);
})();
Python Examples
import os

import requests


def analyze_multiple_documents():
    template_id = '<YOUR_TEMPLATE_ID>'
    # Each entry is (form field name, (filename, file object, MIME type))
    files = [
        ('files', ('document1.pdf', open('document1.pdf', 'rb'), 'application/pdf')),
        ('files', ('document2.jpg', open('document2.jpg', 'rb'), 'image/jpeg'))
    ]
    data = {
        'templateId': template_id,
        'language': 'en'  # Optional
    }
    try:
        response = requests.post(
            "https://api.parselyze.com/documents/parse",
            headers={"x-api-key": os.getenv('PARSELYZE_API_KEY')},
            files=files,
            data=data
        )
    finally:
        # Close files even if the request raises
        for _, (_, file_obj, _) in files:
            file_obj.close()
    result = response.json()
    print("Multi-document analysis:", result)
Response Format
Single Document
{
  "result": {
    "invoice": {
      "number": "INV-2025-001",
      "date": "26/05/2025",
      "total": 1250.75,
      "vendor": "Acme Corporation"
    }
  },
  "pageCount": 1,
  "pageUsed": 1,
  "pageRemaining": 999
}
Multiple Documents
{
  "results": [
    {
      "filename": "document1.pdf",
      "result": {
        "invoice": {
          "number": "INV-2025-001",
          "total": 1250.75
        }
      }
    },
    {
      "filename": "document2.jpg",
      "result": {
        "invoice": {
          "number": "INV-2025-002",
          "total": 850.50
        }
      }
    }
  ],
  "totalPageCount": 3,
  "pageUsed": 3,
  "pageRemaining": 997
}
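Because the single-document and multi-document responses have different shapes, client code may want to normalize them. The sketch below assumes exactly the two shapes shown above (a top-level result, or a results list whose items carry filename and result); the helper iter_results is hypothetical.

```python
def iter_results(response_body, single_filename=None):
    """Yield (filename, result) pairs from either response shape.

    Assumes a top-level "result" for one document, or a "results"
    list whose items have "filename" and "result" keys.
    """
    if "results" in response_body:
        for item in response_body["results"]:
            yield item["filename"], item["result"]
    else:
        # The single-document shape has no filename, so the caller supplies one
        yield single_filename, response_body["result"]
```

With this in place, downstream code can loop over the pairs without branching on how many files were sent.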
Template Requirements
Templates must be created through the web interface before use. You cannot create templates via the API.
Supported File Types
- PDF: Multi-page documents supported
- Images: PNG, JPEG formats
- ZIP: Archive containing multiple documents (premium only)
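A client can filter out unsupported files before uploading them. In this sketch, the mapping from the formats listed above to file extensions is an assumption (the list names formats, not extensions), and is_supported_file is a hypothetical helper.

```python
from pathlib import Path

# Assumed extension mapping for the formats listed above
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".zip"}


def is_supported_file(filename):
    """Return True if the file extension matches a listed format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

An extension check only catches obvious mistakes; the API still validates the actual file content and may return 400 for an invalid format.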
Page Quota
Your monthly page quota is consumed based on:
- PDF files: 1 page per physical page in the document
- Image files: 1 page per image
- ZIP files: Sum of all pages in contained documents
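The counting rules above can be expressed as a small estimator. This is a sketch under stated assumptions: pages_consumed is a hypothetical helper, the caller already knows each PDF's page count, and the contents of a ZIP are passed as individual entries.

```python
def pages_consumed(files):
    """Estimate quota usage for a list of (kind, page_count) tuples.

    kind is "pdf" or "image". page_count is ignored for images because
    each image counts as one page. For a ZIP, list its contained
    documents as separate entries.
    """
    total = 0
    for kind, page_count in files:
        if kind == "image":
            total += 1
        elif kind == "pdf":
            total += page_count
        else:
            raise ValueError(f"Unknown kind: {kind!r}")
    return total
```

For example, a two-page PDF plus one image would be estimated at 3 pages. The pageUsed and pageRemaining fields in the response remain the authoritative figures.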
Error Responses
Code | Error | Description |
---|---|---|
400 | Bad Request | Missing required fields or invalid file format |
401 | Unauthorized | Invalid or missing API key |
403 | Forbidden | Feature requires a premium subscription or is restricted in testing mode |
404 | Not Found | Template not found or not accessible |
413 | Payload Too Large | File size exceeds limit (50MB) |
429 | Too Many Requests | Rate limit exceeded |
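One way to act on these status codes in client code is a lookup that also flags which errors are worth retrying. The sketch below summarizes the table above; treating only 429 as retryable is an assumption of this example, not documented API behavior, and describe_error is a hypothetical helper.

```python
# Descriptions summarize the error table above
ERROR_DESCRIPTIONS = {
    400: "Missing required fields or invalid file format",
    401: "Invalid or missing API key",
    403: "Premium subscription required or testing mode restriction",
    404: "Template not found or not accessible",
    413: "File size exceeds limit (50MB)",
    429: "Rate limit exceeded",
}


def describe_error(status_code):
    """Return (description, retryable) for an HTTP status code.

    Assumption: only a rate-limit error (429) is worth retrying,
    since the other codes need a change to the request itself.
    """
    description = ERROR_DESCRIPTIONS.get(status_code, "Unexpected error")
    return description, status_code == 429
```

A caller might wait and retry when the second value is true, and surface the description to the user otherwise.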
See Also
- Template Creation Tutorial: Learn how to create templates
- Template Testing: Test templates in the web interface
- Template Overview: Understanding template structure
- Authentication: API key management
- Error Handling: Comprehensive error handling guide