The Markdown Extractor plugin is designed to automatically parse Markdown files (.md) uploaded to Flatfile. Its primary purpose is to find and extract any tables present in the markdown content. For each table it discovers, the plugin creates a new Sheet within the Flatfile Space. The table’s header row is used to define the fields (columns) of the Sheet, and the subsequent rows are converted into records. This is useful for importing data that is documented or stored in Markdown files, such as in project readmes, wikis, or technical documentation.

Installation

Install the Markdown Extractor plugin using npm:

npm install @flatfile/plugin-markdown-extractor

Configuration & Parameters

The plugin accepts the following configuration options:

maxTables

  • Type: number
  • Default: Infinity
  • Description: Sets the maximum number of tables to extract from a single Markdown file. By default, it will extract all tables it finds.

errorHandling

  • Type: 'strict' | 'lenient'
  • Default: 'lenient'
  • Description: Controls how parsing errors are handled:
    • 'strict': The plugin will throw an error and stop processing if it encounters a malformed table (e.g., a data row with a different number of columns than the header)
    • 'lenient': The plugin will log a warning to the console, skip the problematic table, and continue processing the rest of the file

debug

  • Type: boolean
  • Default: false
  • Description: When set to true, enables verbose logging to the console. This is useful for troubleshooting issues with file parsing, as it shows the content being parsed, tables found, and the final normalized data.

Usage Examples

Basic Usage

import { listener } from '@flatfile/listener';
import { MarkdownExtractor } from '@flatfile/plugin-markdown-extractor';

export default function(listener) {
  listener.use(MarkdownExtractor());
}

Configuration Example

import { listener } from '@flatfile/listener';
import { MarkdownExtractor } from '@flatfile/plugin-markdown-extractor';

export default function(listener) {
  const options = {
    maxTables: 5,
    errorHandling: 'strict',
    debug: true
  };

  listener.use(MarkdownExtractor(options));
}

Direct Parser Usage

import * as fs from 'fs';
import { markdownParser } from '@flatfile/plugin-markdown-extractor';

const markdownContent = '| ID | Name |\n|----|------|\n| 1  | Test |';
const buffer = Buffer.from(markdownContent, 'utf-8');

const workbookData = markdownParser(buffer, { errorHandling: 'lenient' });

console.log(JSON.stringify(workbookData, null, 2));

API Reference

MarkdownExtractor(options?)

The main factory function used to create and configure the markdown extractor plugin for use with a Flatfile listener.

Parameters:

  • options (optional): Configuration settings object

Returns: An Extractor instance that can be passed to listener.use()

Example:

import { listener } from '@flatfile/listener';
import { MarkdownExtractor } from '@flatfile/plugin-markdown-extractor';

export default function(myListener) {
  // Use with default options
  myListener.use(MarkdownExtractor());

  // Use with custom options
  myListener.use(MarkdownExtractor({ maxTables: 1, errorHandling: 'strict' }));
}

markdownParser(buffer, options)

A low-level function that directly parses a Buffer containing markdown content into a Flatfile WorkbookCapture object.

Parameters:

  • buffer: A Node.js Buffer containing the UTF-8 encoded content of a markdown file
  • options: Configuration settings object

Returns: A WorkbookCapture object, which is a map of sheet names to sheet data

Example:

import { markdownParser } from '@flatfile/plugin-markdown-extractor';

const markdownContent = '| ID | Name |\n|----|------|\n| 1  | Test |';
const buffer = Buffer.from(markdownContent, 'utf-8');

const workbookData = markdownParser(buffer, { errorHandling: 'lenient' });
console.log(JSON.stringify(workbookData, null, 2));

Troubleshooting

Error Handling Example

This example demonstrates how the errorHandling option affects behavior when parsing a malformed table:

import { markdownParser } from '@flatfile/plugin-markdown-extractor';

// Table has a header with 2 columns but a data row with 3
const malformedContent = '| A | B |\n|---|---|\n| 1 | 2 | 3 |';
const buffer = Buffer.from(malformedContent, 'utf-8');

// 'strict' mode will throw an error
try {
  markdownParser(buffer, { errorHandling: 'strict' });
} catch (e) {
  console.error('Caught error in strict mode:', e.message);
  //-> Caught error in strict mode: Data row length does not match header row length
}

// 'lenient' mode will not throw but will log a warning and skip the table
const result = markdownParser(buffer, { errorHandling: 'lenient' });
console.log('Result in lenient mode:', result);
// (A warning is logged to the console)
//-> Result in lenient mode: {}

Debug Mode

To diagnose parsing issues, enable debug mode:

listener.use(MarkdownExtractor({ debug: true }));

Notes

Default Behavior

  • The plugin extracts all tables found in a markdown file by default (maxTables: Infinity)
  • Uses lenient error handling by default, skipping malformed tables and continuing processing
  • Debug logging is disabled by default

Special Considerations

  • This plugin is intended for use in a server-side Flatfile listener
  • It operates on files with the .md extension
  • For each table found in a markdown file, a new Sheet is created with auto-generated names (Table_1, Table_2, etc.)
  • The parser expects standard GitHub-flavored markdown table syntax
  • Highly complex or non-standard tables may not be parsed correctly

Error Handling Patterns

  • Lenient mode (default): Logs warnings for malformed tables and continues processing
  • Strict mode: Throws errors immediately when encountering malformed tables, failing the entire file processing

Troubleshooting Tips

  • Set debug: true to enable detailed logging for diagnosing parsing issues
  • Check console output for warnings about skipped tables in lenient mode
  • Ensure markdown tables follow standard GitHub-flavored markdown syntax
  • Verify that header and data rows have matching column counts