plugin-extract-html-table

A Flatfile plugin for extracting table data from HTML files

Install
npm i @flatfile/plugin-extract-html-table
Dependencies
@flatfile/util-extractor@^2.1.5, node-html-parser@^6.1.13

@flatfile/plugin-extract-html-table

This plugin extracts table data from HTML files uploaded to Flatfile. It parses each file and converts its tables into structured data, including tables with merged cells (colspan and rowspan) and nested tables.

Event Type: listener.on('file:created')

Supported File Types: .html

Features

  • Extracts table structure, including headers and cell data
  • Handles complex layouts, including nested tables up to a configurable depth (maxDepth)
  • Handles colspan and rowspan attributes (each can be turned on or off)
  • Converts extracted data into a structured format
  • Provides error handling for malformed HTML or table structures
  • Provides a debug mode for detailed logging
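To illustrate what "converts extracted data into a structured format" can mean in practice, here is a minimal sketch that turns a header row plus data rows into keyed records. The `rowsToRecords` helper and the sample data are hypothetical and are not part of the plugin's API; the plugin's internal representation may differ.

```javascript
// Hypothetical helper for illustration only; not exported by the plugin.
// Takes an array of rows (arrays of strings) where the first row holds the headers.
function rowsToRecords(rows) {
  if (rows.length === 0) return [];
  const [headers, ...body] = rows;
  return body.map((cells) => {
    const record = {};
    headers.forEach((header, i) => {
      // Missing cells become empty strings so every record has the same keys.
      record[header] = cells[i] ?? '';
    });
    return record;
  });
}

// Sample rows as they might look after the cell text has been read from a table.
const records = rowsToRecords([
  ['Name', 'Email'],
  ['Ada', 'ada@example.com'],
  ['Grace'], // short row: the Email cell is missing
]);
console.log(records);
```

Padding short rows keeps every record aligned with the header row, which is one reasonable way to cope with mildly malformed tables.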

Parameters

options - object - (optional)

  • handleColspan - boolean - (optional): Whether to process colspan attributes on cells. Default is true.
  • handleRowspan - boolean - (optional): Whether to process rowspan attributes on cells. Default is true.
  • maxDepth - number - (optional): Maximum nesting depth to which nested tables are processed. Default is 3.
  • debug - boolean - (optional): Enables detailed debug logging. Default is false.
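One common way to handle colspan and rowspan is to copy a spanning cell's value into every grid position it covers, so that each row ends up with the same number of columns. The sketch below shows that idea under stated assumptions: `expandSpans` and the `{ text, colspan, rowspan }` cell shape are hypothetical, and the plugin's actual behavior for these options may differ.

```javascript
// Hypothetical helper for illustration only; not the plugin's implementation.
// Each cell is { text, colspan?, rowspan? }, with spans defaulting to 1.
function expandSpans(rows, { handleColspan = true, handleRowspan = true } = {}) {
  const grid = [];
  rows.forEach((cells, r) => {
    grid[r] = grid[r] || [];
    let c = 0;
    for (const cell of cells) {
      // Skip columns already filled by a rowspan from an earlier row.
      while (grid[r][c] !== undefined) c++;
      const colspan = handleColspan ? cell.colspan || 1 : 1;
      // Clamp the rowspan so it never runs past the last row of the table.
      const rowspan = handleRowspan
        ? Math.min(cell.rowspan || 1, rows.length - r)
        : 1;
      for (let dr = 0; dr < rowspan; dr++) {
        grid[r + dr] = grid[r + dr] || [];
        for (let dc = 0; dc < colspan; dc++) {
          grid[r + dr][c + dc] = cell.text;
        }
      }
      c += colspan;
    }
  });
  return grid;
}

const sampleRows = [
  [{ text: 'Region', rowspan: 2 }, { text: 'Sales', colspan: 2 }],
  [{ text: 'Q1' }, { text: 'Q2' }],
  [{ text: 'EMEA' }, { text: '10' }, { text: '12' }],
];

const expanded = expandSpans(sampleRows);
const withoutColspan = expandSpans(sampleRows, { handleColspan: false });
console.log(expanded);
console.log(withoutColspan[0]);
```

With both options enabled, the sample yields three rows of three columns each. With `handleColspan: false`, the first row contains 'Sales' only once and is therefore shorter than the rows below it.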

API Calls

  • api.files.download
  • api.files.update

Usage

install

npm install @flatfile/plugin-extract-html-table

import

import { HTMLTableExtractor } from '@flatfile/plugin-extract-html-table';

listener.js

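import { FlatfileListener } from '@flatfile/listener';
import { HTMLTableExtractor } from '@flatfile/plugin-extract-html-table';

const listener = new FlatfileListener();

// Register the extractor; it responds to file:created events for .html files.
// All options are optional; the values shown here are the defaults.
listener.use(
  HTMLTableExtractor({
    handleColspan: true,
    handleRowspan: true,
    maxDepth: 3,
    debug: false
  })
);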