Document Classifier

Classifies documents into user-defined categories using AI.

Overview

The Document Classifier node uses AI to analyze a document and assign it to one of your predefined categories. It returns the chosen category and a confidence score. In Output Per Category mode, it can route items to different outputs based on the classification result.

Use it to:

Sort incoming invoices, contracts, and receipts by document type
Route documents to different processing pipelines based on content
Triage support documents by topic or department
Categorize scanned mail before further extraction

Parameters

Parameter	Description	Required
File	File to classify (supports expressions)	Yes
Page Range	Pages to analyze (e.g. "1-3")	No
Categories	List of category names and optional descriptions	Yes (min 2)
Instructions	Additional instructions for the classifier	No
Output Mode	Single Output or Output Per Category	Yes

File

The file to classify. Typically comes from a trigger or file operation node:

{{$item.data.file}}

Supports PDFs, images, and other document types.

Page Range

Limit classification to specific pages. Useful for large documents where the relevant content is on certain pages:

Example	Description
`1`	First page only
`1-3`	Pages 1 through 3
`1,3,5`	Specific pages
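
The three forms in the table can be read as a small grammar. The following is a minimal sketch of how such a string might expand into page numbers; `parse_page_range` is a hypothetical helper for illustration, not part of the node, and accepting mixed forms such as `1-2,5` is an assumption.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a page-range string such as "1-3" or "1,3,5" into page numbers."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            # A span like "1-3" covers both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.append(page)
    return pages


print(parse_page_range("1-3"))
```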

Categories

Define at least two categories. Each category has the following fields:

Field	Description	Required
Name	Category name (used in output and routing)	Yes
Description	Helps the AI understand what belongs in this category	No

Example:

Name	Description
Invoice	Bills and payment requests with line items and totals
Contract	Legal agreements, terms of service, NDAs
Receipt	Proof of payment, transaction confirmations
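
As a rough illustration, the category list above could be modeled as data and checked against the two documented rules (a name is required, and at least two categories must be defined). The `Category` class and `validate_categories` function are hypothetical names used only for this sketch; the node's internal representation is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    name: str              # required; used in output and routing
    description: str = ""  # optional; helps the AI decide what belongs here


def validate_categories(categories: list[Category]) -> None:
    """Raise ValueError if the list breaks one of the documented rules."""
    if len(categories) < 2:
        raise ValueError("at least two categories are required")
    for category in categories:
        if not category.name.strip():
            raise ValueError("every category needs a name")


CATEGORIES = [
    Category("Invoice", "Bills and payment requests with line items and totals"),
    Category("Contract", "Legal agreements, terms of service, NDAs"),
    Category("Receipt", "Proof of payment, transaction confirmations"),
]
validate_categories(CATEGORIES)  # passes silently for the example list
```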

Instructions

Optional text giving the AI additional context for classification. Supports expressions and multiline input.

Examples:

"Focus on the document header to determine the type"
"If the document contains both an invoice and a receipt, classify it as an invoice"
"Documents in Spanish should still be classified using the English category names"

Output Mode

Mode	Description
Single Output	All classified items go to the `main` output
Output Per Category	Each category gets its own output, and each item is routed to the output matching its category

Settings

Setting	Description
Execution Mode	`Once per item` (default) or `Once`
Output Mode	How results are output when Execution Mode is `Once`
Batch Size	Items to process concurrently (default 5)
Stop on Error	Stop workflow on failure

Output

Each classified item contains:

{
  "category": "Invoice",
  "confidence": 95
}

Access in expressions:

Category: {{$item.data.category}}
Confidence: {{$item.data.confidence}}
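
If the result is consumed outside the workflow (for example, by a script that receives the item as JSON), it might be read as in the sketch below. The `Classification` class and `parse_classification` function are hypothetical, and the range check assumes the 0 to 100 confidence scale mentioned under Tips.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    category: str
    confidence: int  # assumed to be on the 0 to 100 scale


def parse_classification(raw: str) -> Classification:
    """Parse one output item and sanity-check the confidence range."""
    data = json.loads(raw)
    result = Classification(category=data["category"], confidence=data["confidence"])
    if not 0 <= result.confidence <= 100:
        raise ValueError(f"confidence out of range: {result.confidence}")
    return result


item = parse_classification('{"category": "Invoice", "confidence": 95}')
print(f"{item.category} ({item.confidence})")
```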

Output Per Category Mode

When Output Mode is set to Output Per Category, the node creates one output per category. Each item is routed to the output matching its classification result. The AI is constrained to only return one of your defined category names.

For example, with categories Invoice, Contract, and Receipt:

An item classified as "Invoice" goes to the Invoice output
An item classified as "Contract" goes to the Contract output
An item classified as "Receipt" goes to the Receipt output
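
Conceptually, the routing amounts to grouping items by their `category` field. The following is only a sketch of that idea in plain Python, not the node's implementation; `route_by_category` is a hypothetical name, and raising an error for an unknown category is an assumption based on the constraint described above.

```python
def route_by_category(
    items: list[dict], category_names: list[str]
) -> dict[str, list[dict]]:
    """Group classified items into one list per defined category."""
    # One (possibly empty) output per category, in the order defined.
    outputs: dict[str, list[dict]] = {name: [] for name in category_names}
    for item in items:
        category = item["category"]
        if category not in outputs:
            # Should not occur if results are limited to the defined names.
            raise ValueError(f"unknown category: {category!r}")
        outputs[category].append(item)
    return outputs


items = [
    {"category": "Invoice", "confidence": 95},
    {"category": "Receipt", "confidence": 88},
    {"category": "Invoice", "confidence": 72},
]
outputs = route_by_category(items, ["Invoice", "Contract", "Receipt"])
print({name: len(routed) for name, routed in outputs.items()})
```

A category with no matching items still has an output in this sketch; it is simply empty.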

Examples

Classify and Route Documents

Process different document types with specialized pipelines:

                                                ┌─ Invoice ──→ [Data Extractor (invoices)]
[Google Drive Trigger] → [Document Classifier] ─┼─ Contract ─→ [Copy File (contracts folder)]
                                                └─ Receipt ──→ [Data Extractor (receipts)]

Set Output Mode to Output Per Category
Define categories: Invoice, Contract, Receipt
Connect each output to the appropriate downstream node

Classify Then Filter by Confidence

Only process high-confidence classifications:

[Google Drive Trigger] → [Document Classifier] → [Filter (confidence > 80)] → [Insert Rows]

Set Output Mode to Single Output
Add a Filter node checking {{$item.data.confidence}} Greater Than 80
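
The Filter condition is a strict comparison, so an item with a confidence of exactly 80 is dropped. The sketch below shows the equivalent logic in plain Python; `keep_high_confidence` is a hypothetical name and the sample items are made up for illustration.

```python
def keep_high_confidence(items: list[dict], threshold: int = 80) -> list[dict]:
    """Keep only items whose confidence is strictly greater than the threshold."""
    return [item for item in items if item["confidence"] > threshold]


classified = [
    {"category": "Invoice", "confidence": 95},
    {"category": "Receipt", "confidence": 80},   # dropped: not greater than 80
    {"category": "Contract", "confidence": 64},  # dropped
]
print(keep_high_confidence(classified))
```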

Triage Incoming Mail Attachments

[Lido Mailbox Trigger] → [Edit Item (extract attachment)] → [Document Classifier] → [Switch]

Extract the attachment file from the email
Classify the attachment
Use Switch or Output Per Category to route by document type

Classify with Custom Instructions

[OneDrive Trigger] → [Document Classifier] → [Insert Rows]

Set Instructions to guide classification:

These are medical documents. Classify based on the document header.
If a document contains both a lab report and a prescription, classify it as a lab report.

Tips

Add descriptions to categories for better accuracy
Use the Page Range parameter for large documents where the first page is sufficient for classification
The confidence score ranges from 0 to 100
Output Per Category mode is useful for building branching pipelines without a separate If/Switch node
Classification is a long-running operation processed on heavy executor pods
Connect the error output to handle classification failures gracefully
