Designing a Parquet Catalog for InfluxDB IOx

Blog post from InfluxData

Post Details
Company: InfluxData
Date Published:
Author: Paul Dix
Word Count: 1,565
Company Posts That Month: 9
Language: English
Hacker News Points: -
Post Removed: No
Summary

The article discusses the design of a Parquet catalog for InfluxDB IOx, a new in-memory columnar database that uses object storage for persistence. It explains why existing catalog standards such as Apache Hive, Delta Lake, and Apache Iceberg did not suit the IOx team's needs and why the team chose to implement its own design. The catalog focuses on tracking which files exist in object storage and on efficiently maintaining schema and statistics information for the Parquet files that InfluxDB IOx writes there. It also supports soft deletes, so that deleted data is still available for some period if needed. The design borrows many concepts from Hive, Delta Lake, and Iceberg, and it stores file metadata and statistics using the Parquet metadata format, which is defined in Apache Thrift.

Trends Found in this Post
Trend          Post Mentions   Total Month Mentions   Posts   Companies   MoM
Observability  1               479                    132     48          -10%