How curvy is your data: an investigation into Hilbert curve sorting

Post Details

Company

Pydantic

Date Published

April 7, 2026

Author

-

Word Count

2,032

Company Posts That Month

13

Language

English

Hacker News Points

-

Post removed?

No

Source URL

pydantic.dev/articles/how-curvy-is-your-data

Summary

Fusionfire, the internal database behind Logfire, tested Hilbert curve sorting as a replacement for its traditional lexicographic sort, aiming to preserve locality across multiple columns and improve row group pruning. A Hilbert curve maps multi-dimensional data to a single sort key so that rows that are close in every dimension stay close in the sort order. Despite that theoretical advantage, the experiment showed a regression in both query performance and row group pruning on Fusionfire's data, which exhibits extreme cardinality skew. The lexicographic sort, which orders columns by increasing cardinality, suited this distribution better: it concentrates the benefit on the columns with the fewest unique values, giving tighter min/max ranges and better compression. Hilbert sorting has helped in systems such as Databricks and Apache Hudi, where column cardinalities are comparable and query patterns are diverse, but Fusionfire's data, with its natural partition key and skewed cardinality, favored the existing lexicographic approach. The experiment underscored that a sorting strategy has to match the characteristics of the data and workload it serves.
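The trade-off described in the summary can be illustrated with a toy sketch. It is not Fusionfire's implementation. It uses the textbook two-dimensional Hilbert index on a small square grid, and the names `hilbert_index` and `row_groups_scanned`, the 16x16 grid, and the row group size of 16 are all assumptions made for illustration. It counts how many row groups a min/max filter would have to read under each sort order when both columns have the same cardinality.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along a Hilbert curve, using the standard iterative method."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def row_groups_scanned(sorted_rows, group_size, col, value):
    """Count row groups whose [min, max] range on column `col` contains
    `value`, i.e. groups that min/max statistics cannot prune."""
    scanned = 0
    for start in range(0, len(sorted_rows), group_size):
        values = [row[col] for row in sorted_rows[start:start + group_size]]
        if min(values) <= value <= max(values):
            scanned += 1
    return scanned


# Hypothetical data: every (x, y) pair on a 16x16 grid, so both columns
# have the same cardinality (the case where Hilbert sorting tends to help).
N = 16
GROUP_SIZE = 16
rows = [(x, y) for x in range(N) for y in range(N)]

lexicographic = sorted(rows)  # sort by x, then y
hilbert = sorted(rows, key=lambda r: hilbert_index(N, r[0], r[1]))

for name, ordering in (("lexicographic", lexicographic), ("hilbert", hilbert)):
    on_x = row_groups_scanned(ordering, GROUP_SIZE, 0, 5)
    on_y = row_groups_scanned(ordering, GROUP_SIZE, 1, 5)
    print(f"{name}: filter x=5 scans {on_x} groups, filter y=5 scans {on_y} groups")
```

In this balanced toy case the lexicographic order prunes very well on the leading column and not at all on the second, while the Hilbert order spreads the pruning evenly across both. When one column has far fewer distinct values than the others, as the summary describes for Fusionfire, putting that column first in a lexicographic sort can capture most of the benefit, which is consistent with the regression the experiment reported.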

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Observability	1	4,496	812	176	+40%
