Data Extractor

Extracts structured data from files using AI-powered processing.

Overview

The Data Extractor node uses Lido's AI extraction capabilities to pull structured data from documents. It can process PDFs, images, and other file types to extract tabular data.

Use it to:

Extract invoice line items
Parse financial statements
Convert document tables to structured data
Process scanned documents

Parameters

Parameter	Description	Required
Worksheet Name	Worksheet with extraction configuration	Yes
Source Type	Type of source to extract from (`File` or `Email`)	Yes
File	File to extract data from (when Source Type is File)	Conditional
Email	Email to extract data from (when Source Type is Email)	Conditional
Populate Worksheet	Write extracted data to the worksheet	No
Response Format	Output format (Array or Objects)	Yes
Split Rows as Items	Create separate items per row	No
Include Headers	Include column headers in array format	No
Lido Spreadsheet URL	Override the default spreadsheet	No

Worksheet Name

Select the worksheet that contains your extraction configuration. This defines the columns and structure of the extracted data.

Source Type

Choose whether to extract data from a File or an Email. Defaults to File.

File — Extract from a document (PDF, image, spreadsheet, etc.)
Email — Extract from an email message (including its attachments)

File

The file to extract data from. Visible when Source Type is File. Typically comes from a trigger or file operation node:

{{$item.data.file}}

Email

The email to extract data from. Visible when Source Type is Email. Typically comes from a Lido Mailbox Trigger or Outlook Trigger:

{{$item.data.email}}

Response Format

Format	Description
Array	Returns data as a 2D array of values
Objects	Returns data as an array of objects with column names as keys
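
To make the difference between the two formats concrete, here is a small Python sketch that converts an Array-style result into the Objects-style shape. It is illustrative only and is not Lido code: the function name `rows_to_objects` is hypothetical, and it assumes the first row of the 2D array holds the column headers, which may not match the exact layout Lido returns.

```python
from typing import Any


def rows_to_objects(table: list[list[Any]]) -> list[dict[str, Any]]:
    """Convert a 2D array (assumed: header row first) into a list of objects."""
    if not table:
        return []
    # Assumption: the first row contains the column names.
    headers, *rows = table
    return [dict(zip(headers, row)) for row in rows]


# Sample data mirroring the Array format with headers included.
array_format = [
    ["Product", "Quantity", "Price"],
    ["Widget A", 10, 25.0],
    ["Widget B", 5, 50.0],
]

# Same data keyed by column name, as in the Objects format.
objects_format = rows_to_objects(array_format)
```

With Objects, downstream expressions can refer to a field by column name instead of by position, which is why it tends to be the more convenient choice.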

Split Rows as Items

When enabled, each extracted row becomes a separate workflow item. When disabled, all rows are returned in a single item.
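
As a rough mental model, splitting turns one combined result into a list of per-row items. The Python sketch below imitates that behavior on the output shape described in the Output section; `split_rows` is a hypothetical helper written for illustration, not part of Lido.

```python
from typing import Any


def split_rows(output: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one item per extracted row, copying each row object."""
    return [dict(row) for row in output["data"]]


# Combined result, as returned when Split Rows as Items is disabled.
combined = {
    "data": [
        {"Product": "Widget A", "Quantity": 10, "Price": 25.0},
        {"Product": "Widget B", "Quantity": 5, "Price": 50.0},
    ],
    "columns": ["Product", "Quantity", "Price"],
}

# One item per row, as when Split Rows as Items is enabled.
items = split_rows(combined)
```

Each element of `items` corresponds to one workflow item, so downstream nodes run once per row rather than once per document.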

Output

The output contains a data array with extracted rows and a columns array listing the column names:

{
  "data": [
    {
      "Product": "Widget A",
      "Quantity": 10,
      "Price": 25.0
    },
    {
      "Product": "Widget B",
      "Quantity": 5,
      "Price": 50.0
    }
  ],
  "columns": ["Product", "Quantity", "Price"]
}

Access extracted data in expressions:

First row's product: {{$item.data.data[0].Product}}
All columns: {{$item.data.columns}}

With Split Rows as Items enabled:

Each extracted row becomes a separate item, so the data is directly accessible:

{
  "Product": "Widget A",
  "Quantity": 10,
  "Price": 25.0
}

Access fields directly: {{$item.data.Product}}

Examples

Extract Invoice Items from Files

Process incoming invoices from Google Drive:

[Google Drive Trigger] → [Data Extractor] → [Insert Rows]

Connect Google Drive Trigger watching for new PDFs
Set Source Type to File
Set Worksheet Name to your invoice extraction config
Set File: {{$item.data.file}}
Enable Split Rows as Items
Connect to Insert Rows to save extracted line items

Extract Data from Emails

Process incoming emails with the Lido Mailbox Trigger:

[Lido Mailbox Trigger] → [Data Extractor] → [Insert Rows]

Connect Lido Mailbox Trigger to receive incoming emails
Set Source Type to Email
Set Email: {{$item.data.email}}
Set Worksheet Name to your extraction config
Enable Split Rows as Items

Batch Document Processing

[Google Drive Trigger] → [Data Extractor] → [Edit Item] → [Insert Rows]

Use Edit Item to add metadata like source file name before saving.
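
The effect of the Edit Item step can be pictured as attaching an extra field to every extracted row. The Python sketch below shows the idea; the helper `add_source_file`, the field name `Source File`, and the file name `invoice-001.pdf` are all hypothetical and chosen only for illustration.

```python
from typing import Any


def add_source_file(rows: list[dict[str, Any]], file_name: str) -> list[dict[str, Any]]:
    """Return copies of the rows with a 'Source File' field added (hypothetical field name)."""
    return [{**row, "Source File": file_name} for row in rows]


extracted = [
    {"Product": "Widget A", "Quantity": 10, "Price": 25.0},
    {"Product": "Widget B", "Quantity": 5, "Price": 50.0},
]

# Hypothetical file name; in a workflow this would come from the trigger.
tagged = add_source_file(extracted, "invoice-001.pdf")
```

Tagging rows this way keeps each saved line item traceable to the document it came from.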

Extract Without Splitting

Get all data as a single item for aggregation:

Disable Split Rows as Items
Use the data array in downstream nodes
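
Keeping all rows in a single item makes it easy to compute values across the whole document. As a minimal sketch, assuming the `data` array shown in the Output section, the Python below sums quantity times price over every row; `invoice_total` is a hypothetical helper, not a Lido function.

```python
from typing import Any


def invoice_total(output: dict[str, Any]) -> float:
    """Sum Quantity * Price across all extracted rows."""
    return sum(row["Quantity"] * row["Price"] for row in output["data"])


# Unsplit result: every row is available in one item.
output = {
    "data": [
        {"Product": "Widget A", "Quantity": 10, "Price": 25.0},
        {"Product": "Widget B", "Quantity": 5, "Price": 50.0},
    ],
    "columns": ["Product", "Quantity", "Price"],
}

total = invoice_total(output)  # 10 * 25.0 + 5 * 50.0 = 500.0
```

The same calculation would be harder with Split Rows as Items enabled, because each row would arrive as its own item.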

Tips

Configure extraction templates in your Lido spreadsheet first
Test extraction settings using the Data Extractor in the UI before automating
Use Split Rows as Items when processing items individually downstream
The Objects format is easier to work with in most cases
Large documents may take longer to process
Connect the error output to handle extraction failures
