Version: v3

Best Practices

This guide expands on best practices when hosting Flatfile in your own cloud.


Kubernetes

Flatfile recommends using Kubernetes for durability and ease of deployment. The Flatfile application requires 3 deployments: API, worker, and app. The API and worker pods will share the same image.

Kubernetes Recommendations

  • Version 1.21 (1.22 not yet supported)
  • EKS launch template eks_c5_xlarge
  • API replicas: 2
  • Worker replicas: 2-6 (depending on the size and frequency of uploads; more workers allow more files to be processed in parallel)
  • App replicas: 1
  • Recommended pod memory: 2 GB
  • Recommended pod CPU: 1 vCPU

PostgreSQL

When deploying to AWS, use the RDS managed database products. Flatfile strongly recommends using IAM authentication to connect to the database from the Flatfile API. This requires granting the rds_iam role to the application's database user:

-- using the default refinery user:
GRANT rds_iam TO refinery;

AWS Aurora Recommendations

  • Engine 11.13 or greater
  • Size db.r5.large

Redis

It is strongly recommended to deploy Redis with encryption and an AUTH token. A connection string that includes an auth token should take the form:

rediss://:[auth token]@host:6379

The : before the token is required. Without it, the token is interpreted as a user name rather than as a password.
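The effect of the leading colon can be checked with a standard URL parser. The following is a minimal Python sketch, not part of Flatfile itself; the helper name build_redis_url, the host, and the token are hypothetical placeholders.

```python
from urllib.parse import urlparse


def build_redis_url(auth_token, host, port=6379):
    """Build a TLS (rediss://) URL with an empty user name and the token as password."""
    # The ':' directly after '//' leaves the user name empty, so the token
    # lands in the password position of the URL.
    return f"rediss://:{auth_token}@{host}:{port}"


# Hypothetical host and token, for illustration only.
with_colon = urlparse(build_redis_url("s3cret", "redis.example.internal"))
without_colon = urlparse("rediss://s3cret@redis.example.internal:6379")

print(repr(with_colon.username), repr(with_colon.password))        # '' 's3cret'
print(repr(without_colon.username), repr(without_colon.password))  # 's3cret' None
```

With the colon, the parser reports an empty user name and the token as the password. Without it, the token is reported as the user name and no password is present.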

ElastiCache Recommendations

  • Cluster mode OFF
  • Size cache.m5.xlarge
  • Engine 5.0.6
  • Multi-AZ

MySQL

The MySQL cluster stores processed data and therefore requires more resources than the PostgreSQL instance. A database named ephemeral must be created on the cluster. The application is compatible only with Aurora 3.x versions (MySQL 8.x).

Aurora Recommendations

  • Aurora 3.x (compatibility with MySQL 8.x)
  • Size db.r5.xlarge

CORS

CORS policies affect two parts of hosting Flatfile:

  • Object Storage (files uploaded to the platform are stored in S3 or other object storage)
  • Flatfile Portal (the embedded Portal product is opened in an iframe by the application that embeds it)

Flatfile images ship without explicit CORS policies for Portal. If you implement CORS on your ingress, ensure that every domain the Portal is served from is allowed.

CORS in Object Storage

Object storage buckets should have a CORS policy that allows uploads from the domains the Portal is deployed on and exposes the ETag header. The ETag header makes multi-part file uploads possible, which is critical for processing large files in a reasonable time and for serving customers who may not have access to high-speed internet.

[
{
"AllowedHeaders": ["*"],
"AllowedMethods": ["GET", "PUT", "POST", "DELETE"],
"AllowedOrigins": ["*"],
"ExposeHeaders": ["ETag"]
}
]
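The policy above allows every origin. To limit uploads to the domains the Portal is actually served from, the same rule can be generated from an explicit origin list. This is a minimal Python sketch under that assumption; the helper name build_cors_rules and the example domains are hypothetical.

```python
import json


def build_cors_rules(portal_origins):
    """Return CORS rules in the same shape as the policy above,
    restricted to the given origins instead of "*"."""
    return [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET", "PUT", "POST", "DELETE"],
            "AllowedOrigins": list(portal_origins),
            # ETag must stay exposed so multi-part uploads keep working.
            "ExposeHeaders": ["ETag"],
        }
    ]


# Hypothetical domains that embed the Portal.
rules = build_cors_rules(["https://app.example.com", "https://portal.example.com"])
print(json.dumps(rules, indent=2))
```

The printed JSON can be pasted into the bucket's CORS configuration. If you script the change with boto3, the same list is the value of the CORSRules key in the CORSConfiguration argument of put_bucket_cors.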

Queue Observability

Flatfile ships with an integrated dashboard that shows details on the queue and job processing, which can be helpful when debugging. To enable the dashboard, set the following environment variables for the API:

ENABLE_BULL_BOARD=true
BULL_BOARD_USERNAME=flatfile
BULL_BOARD_PASSWORD=flatfile

The dashboard will be available at [api host]/admin/bull

Route53 DNS Entries

Once the IRSA config is deployed and the ingress is running, go to Route53 in the console to add 2 records, one for the API and one for the App.

For each, create a new record:

  • A type record
  • Add the subdomain (app / api)
  • Check the “alias” toggle
  • Select “Alias to Application and Classic Load Balancer”
  • Choose the appropriate region
  • Select the load balancer created by Terraform
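If you prefer to script these records rather than click through the console, the steps above correspond roughly to one alias record per subdomain. The sketch below only builds the request payloads in the shape used by the Route 53 ChangeResourceRecordSets API. The helper name alias_change_batch, the domain, the load balancer DNS name, and the hosted zone ID are all hypothetical placeholders; substitute the values from your own deployment.

```python
def alias_change_batch(record_name, lb_dns_name, lb_hosted_zone_id):
    """Build a change batch that creates one A-type alias record
    pointing at a load balancer."""
    return {
        "Changes": [
            {
                "Action": "CREATE",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "AliasTarget": {
                        # Hosted zone ID of the load balancer itself,
                        # not of your own DNS zone.
                        "HostedZoneId": lb_hosted_zone_id,
                        "DNSName": lb_dns_name,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ]
    }


# Placeholder values, for illustration only.
LB_DNS = "my-flatfile-lb-1234567890.us-east-1.elb.amazonaws.com"
LB_ZONE = "ZEXAMPLE123456"

batches = {
    sub: alias_change_batch(f"{sub}.flatfile.example.com", LB_DNS, LB_ZONE)
    for sub in ("api", "app")
}
print(sorted(batches))
```

Each batch can then be submitted as the ChangeBatch argument of a ChangeResourceRecordSets call (for example, boto3's change_resource_record_sets) against the hosted zone that owns the domain.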

Health and Liveness Checks

Flatfile's API implements a health check that includes a database health check.

  • Health check endpoint: [api host]/health (e.g. https://api.us.flatfile.io/health)
  • Liveness check endpoint: [api host]/build_info.json (e.g. https://api.us.flatfile.io/build_info.json)
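As a rough illustration, both endpoints can be probed from a script. This Python sketch assumes that any 2xx response means the check passed, which the guide does not state explicitly; the helper names probe_urls and is_ok are hypothetical.

```python
import urllib.request


def probe_urls(api_host):
    """Return the health and liveness URLs for a given API host."""
    base = api_host.rstrip("/")
    return {
        "health": f"{base}/health",
        "liveness": f"{base}/build_info.json",
    }


def is_ok(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the endpoint answers with a 2xx status (an assumption)."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # Covers connection failures, timeouts, and urllib's HTTPError
        # (raised for 4xx/5xx responses), all of which subclass OSError.
        return False


urls = probe_urls("https://api.us.flatfile.io")
print(urls["health"])    # https://api.us.flatfile.io/health
print(urls["liveness"])  # https://api.us.flatfile.io/build_info.json
```

Calling is_ok(urls["health"]) performs the actual request. The opener parameter exists only so the function can be exercised without network access.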