Data Lakes without Hadoop

Post Details

Company: Starburst
Date Published: May 14, 2018
Author: Shaun Bruno
Word Count: 984
Company Posts That Month: 1
Language: English
Hacker News Points: -
Post Removed: No
Source URL: www.starburst.io/blog/data-lakes-without-hadoop

Summary

As cloud computing gained traction, cost-effective and scalable object storage eroded Hadoop's dominance in data storage, driving a shift from the Hadoop Distributed File System (HDFS) to cloud services such as Amazon S3, Microsoft Azure Blob Storage, and Google Cloud Storage. Hadoop's complexity, particularly for analytical tasks, had already paved the way for technologies like Hive that offered SQL-like functionality. Newer solutions such as Starburst take data lake analytics further by providing a platform-independent query engine that conforms to ANSI SQL standards. Starburst queries data lakes, warehouses, and databases alike, using a Massively Parallel Processing (MPP) architecture that the post credits with better performance and cost efficiency. This flexibility lets companies store and process structured, semi-structured, and unstructured data wherever it fits best. The post cites Comcast as an example: its users can access comprehensive data sets whether the data resides on-premises or in the cloud.
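
To make the idea of one SQL statement spanning several sources concrete, here is a minimal sketch. It does not use Starburst: it relies on Python's built-in sqlite3 module, with two attached in-memory databases standing in for a data lake and a warehouse. The schema names (lake, warehouse), the tables, and the sample rows are all hypothetical and chosen only for illustration.

```python
import sqlite3


def federated_revenue_by_region():
    """Join tables that live in two separate databases with one SQL query."""
    # A single connection plays the role of the query engine. Each attached
    # in-memory database is an independent store (hypothetical names).
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

    # Illustrative sample data: raw events in the "lake",
    # curated customer records in the "warehouse".
    conn.execute("CREATE TABLE lake.events (customer_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE warehouse.customers (id INTEGER, region TEXT)")
    conn.executemany(
        "INSERT INTO lake.events VALUES (?, ?)",
        [(1, 10.0), (1, 5.0), (2, 7.5)],
    )
    conn.executemany(
        "INSERT INTO warehouse.customers VALUES (?, ?)",
        [(1, "east"), (2, "west")],
    )

    # One standard SQL statement reads from both sources at once.
    rows = conn.execute(
        """
        SELECT c.region, SUM(e.amount) AS revenue
        FROM lake.events AS e
        JOIN warehouse.customers AS c ON c.id = e.customer_id
        GROUP BY c.region
        ORDER BY c.region
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for region, revenue in federated_revenue_by_region():
        print(region, revenue)
```

The sketch only shows the query shape: qualified names address each source, and the join is written once. It says nothing about how an MPP engine distributes that work across nodes.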

