Parse various Excel file formats (.xls, .xlsx, .xlsm, .xlsb, .xltx, .xltm) and extract structured data with support for header detection, merged cells, and hierarchical spreadsheets.
The Excel Extractor plugin parses Excel files (.xls, .xlsx, .xlsm, .xlsb, .xltx, .xltm) and extracts structured data from them. When a user uploads a supported Excel file, the plugin automatically processes it, detects headers, and transforms the sheet data into a format that Flatfile can use. It offers configuration options for complex Excel files, including header detection, merged-cell processing, and cascading values in hierarchical spreadsheets. The plugin is intended for use in a server-side listener on the Flatfile platform.
import type { FlatfileListener } from '@flatfile/listener';
import { ExcelExtractor } from '@flatfile/plugin-xlsx-extractor';

// Register the extractor on the listener so uploaded Excel files are parsed.
export default function (listener: FlatfileListener) {
  listener.use(
    ExcelExtractor({
      headerDetectionOptions: {
        algorithm: 'specificRows',
        rowNumbers: [2], // 0-based index for the third row
      },
    })
  );
}
Server-Side Only: This plugin is designed to run in a server-side environment and should be used within a Flatfile listener.
Duplicate Headers: If a sheet contains duplicate column headers, the plugin automatically makes them unique by appending a suffix (e.g., ‘Name’, ‘Name_1’, ‘Name_2’).
Empty Headers: Empty header cells are renamed to ‘empty’ (e.g., ‘empty’, ‘empty_1’).
Trailing Empty Rows: The parser automatically trims any fully empty rows from the end of a sheet before processing.
Memory Limitations: The plugin has built-in handling for extremely large files that can cause memory issues: it throws a user-friendly error when a file is too large.
Row Cascading: When cascadeRowValues is enabled, empty cells are filled with the value from the cell above. The cascade resets on a completely blank row or when a new value appears in the column.
Header Cascading: When cascadeHeaderValues is enabled, empty header cells are filled with the value from the cell to the left. The cascade resets on a blank column or when a new value appears.
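The header and cascading rules above can be illustrated with a small standalone sketch. This is not the plugin's source code: the names makeHeadersUnique, cascadeRows, cascadeHeaders, and the Cell type are hypothetical, and two details are assumptions made for the example (a "blank column" is taken to mean a column that is empty in every header row, and a pre-existing header such as ‘Name_1’ colliding with a generated suffix is not handled).

```typescript
// Illustrative only: hypothetical helpers, not the plugin's implementation.
type Cell = string | null;

const isEmpty = (cell: Cell | undefined): boolean => cell == null || cell === '';

// Rename empty headers to 'empty' and suffix repeats with _1, _2, ...
function makeHeadersUnique(headers: string[]): string[] {
  const seen = new Map<string, number>();
  return headers.map((raw) => {
    const base = raw === '' ? 'empty' : raw;
    const count = seen.get(base) ?? 0;
    seen.set(base, count + 1);
    return count === 0 ? base : `${base}_${count}`;
  });
}

// Fill empty cells from the cell above; a fully blank row resets the cascade.
function cascadeRows(rows: Cell[][]): Cell[][] {
  let carry: Cell[] = [];
  return rows.map((row) => {
    if (row.every(isEmpty)) {
      carry = []; // blank row: forget everything carried so far
      return [...row];
    }
    // A non-empty cell is kept and becomes the new carried value for its column.
    const filled = row.map((cell, col) => (isEmpty(cell) ? carry[col] ?? cell : cell));
    carry = filled;
    return filled;
  });
}

// Fill empty header cells from the left; a blank column resets the cascade.
// Assumption: "blank column" means empty in every header row.
function cascadeHeaders(headerRows: Cell[][]): Cell[][] {
  const width = Math.max(0, ...headerRows.map((r) => r.length));
  const blankColumn = (col: number) => headerRows.every((r) => isEmpty(r[col]));
  return headerRows.map((row) => {
    let carry: Cell = null;
    const out: Cell[] = [];
    for (let col = 0; col < width; col++) {
      const cell = row[col] ?? null;
      if (blankColumn(col)) {
        carry = null; // blank column: reset
        out.push(cell);
      } else if (isEmpty(cell)) {
        out.push(carry ?? cell);
      } else {
        carry = cell; // new value replaces the carried one
        out.push(cell);
      }
    }
    return out;
  });
}

console.log(makeHeadersUnique(['Name', 'Name', '', 'Name', '']));
// → [ 'Name', 'Name_1', 'empty', 'Name_2', 'empty_1' ]
console.log(cascadeRows([['A', '1'], ['', '2'], ['', ''], ['', '3']]));
console.log(cascadeHeaders([['Q1', '', '', 'Q2', ''], ['Jan', 'Feb', '', 'Apr', 'May']]));
```

In the plugin itself these behaviors are switched on through the cascadeRowValues and cascadeHeaderValues options. The sketch only shows what the described rules do to a small grid of values.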