
Data Lakes without Hadoop

Blog post from Starburst

Post Details
Company: Starburst
Author: Shaun Bruno
Word Count: 984
Language: English
Summary

As cloud computing gained traction, cost-effective and scalable object storage eroded Hadoop's dominance in data storage, prompting a shift from the Hadoop Distributed File System (HDFS) to cloud-based options such as Amazon S3, Microsoft Azure Blob Storage, and Google Cloud Storage. Hadoop's complexity, particularly for analytical tasks, had already paved the way for technologies like Hive, which added SQL-like functionality on top of it. Newer solutions like Starburst take data lake analytics further with a platform-independent query engine that conforms to ANSI SQL. Starburst queries across data lakes, warehouses, and databases using a Massively Parallel Processing (MPP) architecture, which the post credits with better performance and cost efficiency. This flexibility lets companies store and process structured, semi-structured, and unstructured data wherever it fits best. Comcast, for example, gives its users access to comprehensive data sets regardless of whether the data sits on-premises or in the cloud.
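
To make the idea of a single query spanning several sources more concrete, here is a minimal sketch in Python that assembles such a statement. It assumes the engine addresses tables with three-part catalog.schema.table names, as Trino (the open source engine Starburst builds on) does. The catalog, schema, table, and column names are hypothetical and do not come from the post, and the code only builds the SQL text; it does not connect to any engine.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a three-part table name of the form catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql(lake_table: str, db_table: str) -> str:
    """Build one SQL statement that joins a data lake table to a database table.

    The column names (customer_id, region, amount) are hypothetical.
    """
    return (
        "SELECT c.region, SUM(o.amount) AS total_amount\n"
        f"FROM {lake_table} AS o\n"
        f"JOIN {db_table} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


# Hypothetical catalogs: "lake" stands for object storage such as Amazon S3,
# "crm_db" stands for an operational database.
orders = qualified("lake", "sales", "orders")
customers = qualified("crm_db", "public", "customers")

sql = federated_join_sql(orders, customers)
print(sql)
```

The point of the sketch is that both sources appear in one ordinary SQL statement, so the person writing the query does not need to know whether a table lives in a data lake or in a database.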