
1.3 Billion NYC Taxi Rows into SingleStore

Blog post from SingleStore

Post Details
Company: SingleStore
Date Published:
Author: Seth Luersen
Word Count: 3,160
Language: English
Hacker News Points: -
Summary

The NYC taxi data set is a collection of yellow taxi trip records from New York City totaling over 1.3 billion rows. Its schema changed several times over eight years, so the files need careful handling before they can be loaded into a database. SingleStore's native pipelines feature loads the data by processing compressed files in parallel. The pipelines cope with the full data set and its varying file sizes, which reduces the time required for loading. Once loaded, the data can be explored with geospatial queries that reveal patterns in taxi trips.
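
To make the schema-change problem concrete, here is a minimal Python sketch of one way to map differently named columns from compressed CSV files onto a single canonical schema. It is not the SingleStore pipeline mechanism described in the post. The alias table, the header spellings in it, and the function names are all illustrative assumptions.

```python
import csv
import gzip
import io

# Hypothetical alias table: these header spellings are illustrative
# assumptions, not a list taken from the post.
COLUMN_ALIASES = {
    "pickup_datetime": {"pickup_datetime", "trip_pickup_datetime", "tpep_pickup_datetime"},
    "pickup_longitude": {"pickup_longitude", "start_lon"},
    "pickup_latitude": {"pickup_latitude", "start_lat"},
}


def canonical_name(header):
    """Return the canonical column name for a raw header, or None if unknown."""
    key = header.strip().lower()
    for canonical, aliases in COLUMN_ALIASES.items():
        if key in aliases:
            return canonical
    return None


def normalize_rows(gz_bytes):
    """Yield dicts keyed by canonical names from one gzip-compressed CSV payload."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        headers = next(reader)
        mapping = [canonical_name(h) for h in headers]
        for row in reader:
            # Columns with no canonical name are dropped.
            yield {name: value for name, value in zip(mapping, row) if name is not None}
```

Files from different schema eras then produce rows with the same keys, so a single target table definition can receive all of them.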