Innovating With FastText and Table Headers

What's this blog post about?

FastText word embeddings can help you quickly understand new datasets and build more consistent labels for structured data such as tables, JSON, or CSV files. The technique uses a pre-trained FastText model built from schema examples drawn from large collections of data. The embeddings surface synonyms, abbreviations, and other variations of field headers, which is useful when designing new table schemas or assessing whether two tables can be joined. Comparing the suggested headers against company standards can also help enforce standardization policies across multiple internal data sources.
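
To make the header-matching idea concrete, here is a minimal sketch in Python. It is not the blog post's implementation and does not use a trained FastText model: as a stand-in, it builds bags of character n-grams (the subword units FastText works with) and ranks candidate headers by cosine similarity. The function names, the `STANDARD_HEADERS` list, and the n-gram range are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(header, n_min=3, n_max=5):
    """Bag of character n-grams for a header, wrapped in FastText-style
    boundary markers '<' and '>'. The n-gram range is an assumed default."""
    token = f"<{header.lower()}>"
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(token) - n + 1):
            grams[token[i:i + n]] += 1
    return grams


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_headers(query, known_headers, top_k=3):
    """Rank known headers by n-gram similarity to the query header."""
    query_vec = char_ngrams(query)
    scored = [(h, cosine(query_vec, char_ngrams(h))) for h in known_headers]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical company-standard headers, invented for this example.
STANDARD_HEADERS = [
    "first_name", "last_name", "email_address", "phone_number", "zip_code",
]

if __name__ == "__main__":
    for header, score in suggest_headers("email_addr", STANDARD_HEADERS):
        print(f"{header}: {score:.3f}")
```

Character overlap alone only catches abbreviations and spelling variants that share substrings with the standard name. A trained embedding model, as described above, would also be expected to place headers near each other when they appear in similar schemas, which is what lets it suggest true synonyms.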

Company
Gretel.ai

Date published
Aug. 20, 2020

Author(s)
Amy Steier

Word count
974

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.