Home > Docs > Async and Progress
The API client supports asynchronous document processing with task-based workflows, multiple progress tracking modes, and webhook integration.
| Consideration | Sync | Async |
|---|---|---|
| Response time | Blocks until complete | Returns task ID immediately |
| Large documents | May timeout | Handles any size |
| Progress tracking | Not available | WebSocket, HTTP polling, or hybrid |
| Batch processing | Sequential | Parallel with queue management |
| Result retrieval | Inline in response | Separate call (JSON or ZIP) |
Use async when:
- Documents are large or processing takes more than a few seconds
- You need progress updates
- You want to submit multiple documents and process them in parallel
Created by convertFileAsync, convertSourceAsync, or convertFileAsyncToZip:
const task = await client.convertFileAsync({
files: buffer,
filename: "document.pdf",
to_formats: ["md"],
});
console.log(task.taskId); // unique task identifier
console.log(task.status); // "pending" | "started" | "success" | "failure"
console.log(task.position); // queue positiontask.on("progress", (status: TaskStatusResponse) => {
console.log(status.task_status);
console.log(status.task_position);
console.log(status.task_meta);
});
task.on("complete", (result: ConvertDocumentResponse) => {
console.log(result.document.md_content);
});
task.on("error", (error: ProcessingError) => {
console.error(error.message);
});const finalStatus = await task.waitForCompletion();
// finalStatus.task_status === "success"const status = await task.poll();
// Check status.task_status and decide whether to poll againFor tasks submitted with target: { kind: "inbody" }:
const task = await client.convertSourceAsync({
sources: [{ kind: "http", url: "https://example.com/doc.pdf" }],
options: { to_formats: ["md"] },
target: { kind: "inbody" },
});
await task.waitForCompletion();
const result = await client.getTaskResult(task.taskId);
console.log(result.document.md_content);For tasks submitted with the default ZIP target:
const task = await client.convertFileAsync({
files: buffer,
filename: "document.pdf",
to_formats: ["md", "json"],
});
await task.waitForCompletion();
const zip = await client.getTaskResultFile(task.taskId);
if (zip.success && zip.fileStream) {
zip.fileStream.pipe(createWriteStream("./result.zip"));
}const task = await client.convertFileAsyncToZip({
files: buffer,
filename: "document.pdf",
to_formats: ["md"],
});Long-polls the task status endpoint:
const status = await client.pollTaskStatus(task.taskId, 100);
// waitSeconds controls the long-polling timeout (default from config)Returns TaskStatusResponse:
interface TaskStatusResponse {
task_id: string;
task_status: "pending" | "started" | "success" | "failure";
task_position?: number;
task_meta?: {
total_documents?: number;
processed_documents?: number;
[key: string]: unknown;
};
}const client = new Docling({
api: { baseUrl: "http://localhost:5001" },
waitSeconds: 100, // Long-polling wait time (seconds)
pollingRetries: 5, // Max retries for polling failures
});The SDK supports three progress tracking modes, configured at the client or per-call level.
Periodically polls the task status endpoint:
const client = new Docling({
api: { baseUrl: "http://localhost:5001" },
progress: {
method: "http",
httpPollInterval: 1000, // poll every 1 second
onProgress: (update) => {
console.log(update.stage, update.percentage);
},
},
});Real-time updates through a persistent WebSocket connection:
const client = new Docling({
api: { baseUrl: "http://localhost:5001" },
progress: {
method: "websocket",
websocketTimeout: 5000,
onProgress: (update) => {
console.log(update.stage, update.percentage);
},
},
});Tries WebSocket first, falls back to HTTP polling:
const client = new Docling({
api: { baseUrl: "http://localhost:5001" },
progress: {
method: "hybrid",
websocketTimeout: 5000,
httpPollInterval: 1000,
onProgress: (update) => {
console.log(update.stage, update.percentage);
},
onComplete: (result) => {
console.log("Processing complete");
},
onError: (error) => {
console.error("Processing failed:", error.message);
},
},
});Override the client-level progress config for a specific call:
const result = await client.convert(buffer, "doc.pdf", { to_formats: ["md"] }, {
method: "hybrid",
onProgress: (update) => console.log(update),
});interface ProgressUpdate {
stage: string;
percentage?: number;
message?: string;
taskId?: string;
position?: number;
status?: string;
timestamp: number;
source?: "websocket" | "http";
memoryUsage?: NodeJS.MemoryUsage;
uploadedBytes?: number;
totalBytes?: number;
bytesPerSecond?: number;
}The DoclingWebSocketClient can be used directly for custom integrations:
import { DoclingWebSocketClient } from "docling-sdk";
const ws = new DoclingWebSocketClient({
baseUrl: "http://localhost:5001",
timeout: 5000,
reconnectAttempts: 3,
reconnectDelay: 2000,
heartbeatInterval: 30000,
});| Event | Data | Description |
|---|---|---|
connecting |
string |
Connecting to server |
connected |
string |
Connection established |
disconnected |
{ code, reason } |
Connection closed |
error |
Error |
Connection error |
taskUpdate |
TaskStatusResponse |
Task status changed |
taskComplete |
TaskStatusResponse |
Task completed |
taskFailed |
TaskStatusResponse |
Task failed |
taskStarted |
TaskStatusResponse |
Task started processing |
progress |
progress data | Progress update |
reconnecting |
{ attempt, delay } |
Attempting reconnection |
reconnectFailed |
{ attempt, error } |
Reconnection failed |
| State | Description |
|---|---|
connecting |
WebSocket handshake in progress |
connected |
Connected and receiving updates |
disconnected |
Not connected |
error |
Connection error occurred |
The AsyncTaskManager handles task submission, polling, and event emission:
import { AsyncTaskManager } from "docling-sdk";
const taskManager = client.getTaskManager();
taskManager.on("status", ({ status, taskId }) => {
console.log(`Task ${taskId}: ${status}`);
});
taskManager.on("success", ({ taskId }) => {
console.log(`Task ${taskId} completed`);
});
taskManager.on("failure", ({ error, taskId }) => {
console.error(`Task ${taskId} failed: ${error}`);
});
taskManager.on("timeout", ({ taskId }) => {
console.warn(`Task ${taskId} timed out`);
});interface TaskOptions {
timeout?: number; // Max wait time in ms
pollInterval?: number; // Polling interval in ms
maxPolls?: number; // Max poll attempts
waitSeconds?: number; // Long-polling wait (seconds)
pollingRetries?: number; // Max retries for poll failures
}You can implement webhook notifications after task completion:
const task = await client.convertFileAsync({
files: buffer,
filename: "document.pdf",
to_formats: ["md"],
});
await task.waitForCompletion();
// Notify your webhook endpoint
await fetch(process.env.WEBHOOK_URL, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
task_id: task.taskId,
status: "success",
}),
});- API Client -- async methods on the API client
- Streaming -- streaming as an alternative to async
- Chunking -- async chunking tasks
- Configuration -- progress config options