This is legacy documentation maintained for backward compatibility. For the most up-to-date information, please refer to the current documentation.
In Flatfile, mapping describes the process of getting data from an uploaded
file into a destination worksheet in your application.
The Ghost of Mapping Past
Historically mapping in Flatfile has consisted of two pieces:
- A field mapping, which describes which columns in the source data
correspond to which fields in the destination worksheet. For example, maybe
the “fname” column in the source data should be mapped to the “First Name”
field in the destination worksheet.
- For every destination field of type enum, an
enum mapping describing which values in the source column correspond to
which allowable enum values in the destination column. For example, you could
have an enum column of country codes, and you might need the source value
“United States” to map to the enum value “US”.
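As a rough illustration of how the two pieces fit together, the sketch below represents a field mapping and an enum mapping as plain Python dictionaries and applies them to one row. The names `field_mapping`, `enum_mappings`, and `apply_mappings` are hypothetical and chosen for this example only; they are not part of any Flatfile API.

```python
# Hypothetical data structures for illustration; not Flatfile's actual API.
# Field mapping: source column name -> destination field name.
field_mapping = {"fname": "First Name", "country": "Country"}

# Enum mappings: destination field name -> {source value -> enum value}.
enum_mappings = {"Country": {"United States": "US", "Canada": "CA"}}


def apply_mappings(row, field_mapping, enum_mappings):
    """Rename source columns to destination fields, then translate enum values."""
    result = {}
    for source_column, value in row.items():
        destination_field = field_mapping.get(source_column)
        if destination_field is None:
            continue  # source columns without a field mapping are dropped
        value_map = enum_mappings.get(destination_field)
        if value_map is not None:
            # In this sketch, values with no enum mapping pass through unchanged.
            value = value_map.get(value, value)
        result[destination_field] = value
    return result


source_row = {"fname": "Ada", "country": "United States", "notes": "n/a"}
mapped_row = apply_mappings(source_row, field_mapping, enum_mappings)
print(mapped_row)  # {'First Name': 'Ada', 'Country': 'US'}
```

The enum mapping is keyed by destination field because, as described above, each enum-typed destination field has its own mapping of source values to allowed values.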
Flatfile has developed machine-learning models trained on millions of historical
mappings that can suggest both field and enum mappings based on the column names
and values in the source data. You can of course override these suggestions. The
Flatfile platform remembers your previously chosen mappings and will suggest
them when it sees similar schemas or data in the future.
The Ghost of Mapping Present
Many users of Flatfile need more control over the mapping process than just
assigning source fields to destination fields. For example, you might need to
concatenate multiple source fields into a single destination field, or you might
need to choose the first non-empty value from a set of source fields.
To this end, we’ve recently introduced a wider variety of
mapping rules
that specify these more complex mapping-related transformations on your data,
and the notion of a mapping program that selectively applies these rules to
your dataset.
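To make the idea of rules and programs more concrete, here is a minimal sketch in plain Python. It models each rule as a function and a mapping program as an ordered list of rules. The helper names (`assign`, `concatenate`, `coalesce`, `run_program`) are assumptions made for this example and do not reflect the interface of the Flatfile application or the Python SDK.

```python
# A minimal sketch of mapping rules and a mapping program.
# All names here are hypothetical and are not the Flatfile SDK's API.

def assign(source, destination):
    """Copy one source field to one destination field."""
    def rule(row, out):
        if source in row:
            out[destination] = row[source]
    return rule


def concatenate(sources, destination, separator=" "):
    """Join the non-empty values of several source fields into one field."""
    def rule(row, out):
        parts = [row[name] for name in sources if row.get(name)]
        out[destination] = separator.join(parts)
    return rule


def coalesce(sources, destination):
    """Use the first non-empty value among several source fields."""
    def rule(row, out):
        for name in sources:
            if row.get(name):
                out[destination] = row[name]
                return
    return rule


def run_program(rules, row):
    """Apply each rule in order to a source row and return the destination row."""
    out = {}
    for rule in rules:
        rule(row, out)
    return out


program = [
    assign("email", "Email"),
    concatenate(["first", "last"], "Full Name"),
    coalesce(["mobile", "home_phone"], "Phone"),
]

row = {
    "email": "ada@example.com",
    "first": "Ada",
    "last": "Lovelace",
    "mobile": "",
    "home_phone": "555-0100",
}
result = run_program(program, row)
print(result)
# {'Email': 'ada@example.com', 'Full Name': 'Ada Lovelace', 'Phone': '555-0100'}
```

Keeping each rule as a small, independent function means a program is just data (a list), which is one plausible way to save and reuse it across files.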
Currently, the only rule type supported by the Flatfile application is the
assign rule, which maps a single source field to a single destination field
(the field mapping described above).
However, we have introduced a
Python SDK that
allows you to create your own mapping rules and mapping programs and apply them
to your own datasets outside of Flatfile (e.g. in an ETL pipeline). The SDK is
currently in extreme beta, but we would love to hear your feedback on it. It
supports the full complement of mapping rules, as described on its website.
If you have a Flatfile secret key, the SDK also exposes an “automapping” feature
that uses the previously-described machine learning models to suggest mapping
programs for a given source and destination schema. As with the Flatfile app,
this feature currently suggests only assign rules, plus ignore rules, which
have no effect on the data and simply indicate that a source field was not
mapped.
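The shape of such a suggestion can be sketched without the SDK. The example below is a stand-in only: it uses simple name similarity from Python's standard `difflib` module in place of Flatfile's machine-learning models, and the rule dictionaries and the `suggest_rules` function are hypothetical, not the SDK's actual output format or API.

```python
import difflib
import re

# Stand-in for automapping: plain string similarity, NOT Flatfile's ML models.
# The rule dictionaries below are a hypothetical format for illustration.


def normalize(name):
    """Lowercase a field name and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def suggest_rules(source_fields, destination_fields, cutoff=0.8):
    """Suggest an assign rule per source field, or an ignore rule if none fits."""
    by_normalized = {normalize(dest): dest for dest in destination_fields}
    rules = []
    for source in source_fields:
        matches = difflib.get_close_matches(
            normalize(source), list(by_normalized), n=1, cutoff=cutoff
        )
        if matches:
            rules.append({
                "type": "assign",
                "source": source,
                "destination": by_normalized[matches[0]],
            })
        else:
            # An ignore rule changes nothing; it records that the field is unmapped.
            rules.append({"type": "ignore", "source": source})
    return rules


suggested = suggest_rules(
    ["First Name", "e-mail", "internal_id"],
    ["first_name", "email", "phone"],
)
for rule in suggested:
    print(rule)
```

With these inputs, the first two source fields receive assign rules and `internal_id` receives an ignore rule, so every source field is accounted for in the suggested program.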
The Ghost of Mapping Future
In the near future we will be expanding mapping in several ways:
Richer Mapping Rules in the Flatfile application
We are working on adding support for the full set of mapping rules in the
Flatfile application. This will allow a user mapping an incoming file to create
complex mapping programs, save them, and reuse them.
Richer AI Assist for Mapping Rules
Currently our machine learning algorithms only suggest “assign” rules. In the
future they will also be able to suggest more complex mapping rules; for
instance, if your incoming data has “first name” and “last name” fields, and
your destination sheet has only a “full name” field, the AI might suggest a
“concatenate” rule that combines the two source fields into the destination
field.
In addition, we are working on a natural-language interface for creating mapping
rules; for instance, you could type “concatenate first name and last name into
full name” and the AI would produce the appropriate rule.
JavaScript SDK
We are working on a JavaScript SDK that has similar functionality to the Python
SDK, the most notable difference being that it can’t interact with pandas
DataFrames. It will be available soon.