The story of our SAMPLE BY enhancements

Blog post from QuestDB

Post Details
Company: QuestDB
Author: Nick Woolmer
Word Count: 2,082
Language: English
Summary

QuestDB, an open-source time-series database built for low-latency queries and high ingestion throughput, returned an unexpected result for a query that downsampled 2018 trips from the NYC Taxi dataset, which at first pointed to a possible bug in the SAMPLE BY code. The cause was the default calendar alignment: SAMPLE BY floors each timestamp to the start of its bucket, and the buckets are counted from a fixed origin rather than from the first row of the data. As a result, the first bucket in the output began in 2017, not in 2018 as expected. Examining the query's explanation and the optimizations applied to it showed that, without an offset that moves the origin, bucket boundaries need not line up with the start of the queried interval. To address this, QuestDB introduced new syntax, including the FROM-TO clause, which lets users state the interval the output should cover, control where buckets begin, and fill missing buckets with specified values. This is particularly useful for queries with complex conditions or without an explicit WHERE clause. QuestDB continues to refine SAMPLE BY along these lines.
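
The alignment effect can be reproduced outside the database. The following is a minimal Python sketch, not QuestDB's implementation: the helper names `bucket_start` and `sample_range`, the 5-day stride, the Unix epoch as the fixed origin, the sample counts, and the half-open treatment of the FROM-TO range are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the fixed origin used for calendar alignment is the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, stride, origin=EPOCH):
    """Floor ts to the start of its stride-wide bucket, counted from origin."""
    n = (ts - origin) // stride  # timedelta // timedelta floors to an int
    return origin + n * stride


def sample_range(values, start, end, stride, fill=0):
    """Emit one (bucket, value) row per stride in [start, end), filling gaps."""
    rows = []
    bucket = start
    while bucket < end:
        rows.append((bucket, values.get(bucket, fill)))
        bucket += stride
    return rows


stride = timedelta(days=5)  # hypothetical stride
first_row = datetime(2018, 1, 1, tzinfo=timezone.utc)

# Epoch-aligned flooring: the bucket containing 1 January 2018 starts in 2017.
epoch_aligned = bucket_start(first_row, stride)

# Moving the origin to the start of the interval keeps the first bucket in 2018.
from_aligned = bucket_start(first_row, stride, origin=first_row)

# Hypothetical sparse per-bucket counts; the 2018-01-06 bucket has no rows.
counts = {
    datetime(2018, 1, 1, tzinfo=timezone.utc): 3,
    datetime(2018, 1, 11, tzinfo=timezone.utc): 7,
}
rows = sample_range(
    counts,
    start=first_row,
    end=datetime(2018, 1, 16, tzinfo=timezone.utc),
    stride=stride,
    fill=0,
)

print(epoch_aligned.date(), from_aligned.date())
for bucket, value in rows:
    print(bucket.date(), value)
```

With a 5-day stride counted from the epoch, 1 January 2018 lands two days into a bucket, so the floored bucket start falls on 30 December 2017. Anchoring the origin at the start of the requested interval removes that drift, and walking the interval bucket by bucket is what makes it possible to emit a filled row for a bucket that has no data.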