Markdown File Extractor

The Markdown Extractor plugin automatically parses Markdown files (.md) uploaded to Flatfile and extracts any tables they contain. For each table it finds, the plugin creates a new Sheet within the Flatfile Space: the table’s header row defines the Sheet’s fields (columns), and each subsequent row becomes a record. This is useful for importing data that is documented or stored in Markdown files, such as project READMEs, wikis, or technical documentation.
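To make the mapping from header row to fields and from data rows to records concrete, here is a minimal sketch in plain JavaScript. It is not the plugin's implementation: `splitRow` and `tableToSheet` are hypothetical helpers, and the sketch assumes a simple table with no escaped pipes inside cells.

```javascript
// Hypothetical helpers for illustration; these are not the plugin's API.

// Split one table line such as "| a | b |" into trimmed cell strings.
function splitRow(line) {
  return line
    .trim()
    .replace(/^\|/, '')
    .replace(/\|$/, '')
    .split('|')
    .map((cell) => cell.trim());
}

// Turn the lines of one table into { headers, records }.
// Assumes line 0 is the header row and line 1 is the delimiter row.
function tableToSheet(tableLines) {
  const headers = splitRow(tableLines[0]);
  const records = tableLines.slice(2).map((line) => {
    const cells = splitRow(line);
    const record = {};
    headers.forEach((header, i) => {
      record[header] = cells[i] ?? '';
    });
    return record;
  });
  return { headers, records };
}

const sheet = tableToSheet(['| ID | Name |', '|----|------|', '| 1  | Test |']);
console.log(sheet.headers); // [ 'ID', 'Name' ]
console.log(sheet.records); // [ { ID: '1', Name: 'Test' } ]
```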

Installation

Install the Markdown Extractor plugin using npm:

npm install @flatfile/plugin-markdown-extractor

Configuration & Parameters

The plugin accepts the following configuration options:

maxTables

Type: number
Default: Infinity
Description: Sets the maximum number of tables to extract from a single Markdown file. By default, it will extract all tables it finds.
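As a rough illustration of what a table cap means, the sketch below groups consecutive lines that start with a pipe into table blocks and keeps at most `maxTables` of them. `findTableBlocks` is a hypothetical helper, and keeping the first tables in document order is an assumption about how the limit is applied.

```javascript
// Hypothetical helper: group consecutive lines that start with "|" into table
// blocks, then keep at most `maxTables` of them in document order.
function findTableBlocks(markdown, maxTables = Infinity) {
  const blocks = [];
  let current = [];
  for (const line of markdown.split('\n')) {
    if (line.trim().startsWith('|')) {
      current.push(line.trim());
    } else if (current.length > 0) {
      blocks.push(current);
      current = [];
    }
  }
  if (current.length > 0) blocks.push(current);
  // slice(0, Infinity) returns every block, which matches the default.
  return blocks.slice(0, maxTables);
}

const doc = [
  '# Notes',
  '| A |', '|---|', '| 1 |',
  '',
  '| B |', '|---|', '| 2 |',
  'Some text',
  '| C |', '|---|', '| 3 |',
].join('\n');

console.log(findTableBlocks(doc).length); // 3
console.log(findTableBlocks(doc, 2).length); // 2
```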

errorHandling

Type: 'strict' | 'lenient'
Default: 'lenient'
Description: Controls how parsing errors are handled:
- 'strict': The plugin will throw an error and stop processing if it encounters a malformed table (e.g., a data row with a different number of columns than the header)
- 'lenient': The plugin will log a warning to the console, skip the problematic table, and continue processing the rest of the file
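The difference between the two modes can be sketched as a column-count check over already-parsed tables. This is an illustration under stated assumptions, not the plugin's code: `keepWellFormed` and the `{ name, header, rows }` table shape are hypothetical, and the error text is made up for the example.

```javascript
// Hypothetical validator; `tables` items look like { name, header, rows }.
function keepWellFormed(tables, errorHandling = 'lenient') {
  const kept = [];
  for (const table of tables) {
    const badRow = table.rows.findIndex((row) => row.length !== table.header.length);
    if (badRow === -1) {
      kept.push(table);
      continue;
    }
    const message = `${table.name}: data row ${badRow + 1} does not match the header length`;
    if (errorHandling === 'strict') {
      throw new Error(message); // strict: stop processing the whole file
    }
    console.warn(`Skipping ${message}`); // lenient: warn, skip this table, keep going
  }
  return kept;
}

const tables = [
  { name: 'first', header: ['A', 'B'], rows: [['1', '2', '3']] }, // malformed
  { name: 'second', header: ['A', 'B'], rows: [['1', '2']] },
];

console.log(keepWellFormed(tables, 'lenient').map((t) => t.name)); // [ 'second' ]
```

In strict mode the same input throws on the first malformed table, so the well-formed second table is never returned.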

debug

Type: boolean
Default: false
Description: When set to true, enables verbose logging to the console. This is useful for troubleshooting issues with file parsing, as it shows the content being parsed, tables found, and the final normalized data.

Usage Examples

Basic Usage

// No import from '@flatfile/listener' is needed: the listener instance is passed in as the function argument.
import { MarkdownExtractor } from '@flatfile/plugin-markdown-extractor';

export default function(listener) {
  listener.use(MarkdownExtractor());
}

Configuration Example

// No import from '@flatfile/listener' is needed: the listener instance is passed in as the function argument.
import { MarkdownExtractor } from '@flatfile/plugin-markdown-extractor';

export default function(listener) {
  const options = {
    maxTables: 5,
    errorHandling: 'strict',
    debug: true
  };

  listener.use(MarkdownExtractor(options));
}

Direct Parser Usage

import { markdownParser } from '@flatfile/plugin-markdown-extractor';

const markdownContent = '| ID | Name |\n|----|------|\n| 1  | Test |';
const buffer = Buffer.from(markdownContent, 'utf-8');

const workbookData = markdownParser(buffer, { errorHandling: 'lenient' });

console.log(JSON.stringify(workbookData, null, 2));

API Reference

MarkdownExtractor(options?)

The main factory function used to create and configure the markdown extractor plugin for use with a Flatfile listener.

Parameters:

options (optional): Configuration settings object

Returns: An Extractor instance that can be passed to listener.use()

Example:

// No import from '@flatfile/listener' is needed: the listener instance is passed in as the function argument.
import { MarkdownExtractor } from '@flatfile/plugin-markdown-extractor';

export default function(myListener) {
  // Use with default options
  myListener.use(MarkdownExtractor());

  // Or use with custom options instead (register one of the two, not both)
  // myListener.use(MarkdownExtractor({ maxTables: 1, errorHandling: 'strict' }));
}

markdownParser(buffer, options)

A low-level function that directly parses a Buffer containing markdown content into a Flatfile WorkbookCapture object.

Parameters:

buffer: A Node.js Buffer containing the UTF-8 encoded content of a markdown file
options: Configuration settings object

Returns: A WorkbookCapture object, which is a map of sheet names to sheet data
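To show what "a map of sheet names to sheet data" might look like in practice, the sketch below builds a stand-in object by hand and walks over it. The `headers` and `data` fields and the `{ value }` cell wrapper are assumptions made for illustration; the real WorkbookCapture may carry different or additional fields. `summarize` is a hypothetical helper.

```javascript
// Assumed shape for illustration only; the real WorkbookCapture may differ.
const workbook = {
  Table_1: {
    headers: ['ID', 'Name'],
    data: [{ ID: { value: '1' }, Name: { value: 'Test' } }],
  },
};

// Summarize each sheet as "<name>: <n> record(s), fields: <a>, <b>".
function summarize(wb) {
  return Object.entries(wb).map(
    ([name, sheet]) =>
      `${name}: ${sheet.data.length} record(s), fields: ${sheet.headers.join(', ')}`
  );
}

console.log(summarize(workbook)); // [ 'Table_1: 1 record(s), fields: ID, Name' ]
```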

Example:

import { markdownParser } from '@flatfile/plugin-markdown-extractor';

const markdownContent = '| ID | Name |\n|----|------|\n| 1  | Test |';
const buffer = Buffer.from(markdownContent, 'utf-8');

const workbookData = markdownParser(buffer, { errorHandling: 'lenient' });
console.log(JSON.stringify(workbookData, null, 2));

Troubleshooting

Error Handling Example

This example demonstrates how the errorHandling option affects behavior when parsing a malformed table:

import { markdownParser } from '@flatfile/plugin-markdown-extractor';

// Table has a header with 2 columns but a data row with 3
const malformedContent = '| A | B |\n|---|---|\n| 1 | 2 | 3 |';
const buffer = Buffer.from(malformedContent, 'utf-8');

// 'strict' mode will throw an error
try {
  markdownParser(buffer, { errorHandling: 'strict' });
} catch (e) {
  console.error('Caught error in strict mode:', e.message);
  //-> Caught error in strict mode: Data row length does not match header row length
}

// 'lenient' mode will not throw but will log a warning and skip the table
const result = markdownParser(buffer, { errorHandling: 'lenient' });
console.log('Result in lenient mode:', result);
// (A warning is logged to the console)
//-> Result in lenient mode: {}

Debug Mode

To diagnose parsing issues, enable debug mode:

listener.use(MarkdownExtractor({ debug: true }));

Notes

Default Behavior

- The plugin extracts all tables found in a markdown file by default (maxTables: Infinity)
- Uses lenient error handling by default, skipping malformed tables and continuing processing
- Debug logging is disabled by default

Special Considerations

- This plugin is intended for use in a server-side Flatfile listener
- It operates on files with the .md extension
- For each table found in a markdown file, a new Sheet is created with an auto-generated name (Table_1, Table_2, etc.)
- The parser expects standard GitHub-flavored markdown table syntax
- Highly complex or non-standard tables may not be parsed correctly

Error Handling Patterns

- Lenient mode (default): Logs a warning for each malformed table, skips it, and continues processing
- Strict mode: Throws an error as soon as a malformed table is encountered, which fails processing for the entire file

Troubleshooting Tips

- Set debug: true to enable detailed logging for diagnosing parsing issues
- Check console output for warnings about skipped tables in lenient mode
- Ensure markdown tables follow standard GitHub-flavored markdown syntax
- Verify that header and data rows have matching column counts
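For the last tip, a small pre-upload check can point at the offending rows before the file ever reaches Flatfile. The sketch below is a hypothetical helper, not part of the plugin, and it assumes a simple table whose first line is the header and whose second line is the delimiter row.

```javascript
// Hypothetical pre-upload check; not part of the plugin.

// Count the cells in one table line such as "| a | b |".
function countCells(line) {
  return line.trim().replace(/^\|/, '').replace(/\|$/, '').split('|').length;
}

// Return one message per data row whose cell count differs from the header's.
function findColumnMismatches(tableLines) {
  const expected = countCells(tableLines[0]);
  const problems = [];
  // Index 0 is the header and index 1 the delimiter row, so data starts at 2.
  for (let i = 2; i < tableLines.length; i++) {
    const actual = countCells(tableLines[i]);
    if (actual !== expected) {
      problems.push(`row ${i - 1}: expected ${expected} cells, found ${actual}`);
    }
  }
  return problems;
}

const report = findColumnMismatches([
  '| A | B |',
  '|---|---|',
  '| 1 | 2 | 3 |',
  '| 4 | 5 |',
]);
console.log(report); // [ 'row 1: expected 2 cells, found 3' ]
```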
