Introducing Onehouse Notebooks – Interactive PySpark at 4x Price-Performance

Blog post from Onehouse

Post Details
Company: Onehouse
Author: Andy Walner, Divik Mittal and Praveen Gajulapalli
Word Count: 450
Language: English
Summary

Onehouse Notebooks is a new PySpark Jupyter notebook experience powered by the Onehouse Quanton engine, which the post credits with 4x price-performance relative to other Apache Spark platforms. Interactive PySpark workloads run on autoscaling clusters within a virtual private cloud, which helps control costs and keeps infrastructure flexible. Because Onehouse Notebooks is fully compatible with Apache Spark, users can bring over existing PySpark files. Integration with the broader Onehouse platform adds automatic table optimization and, through OneSync, synchronization with any catalog.

The notebooks are designed for iterative data engineering: code can be tested and validated cell by cell, which suits exploring new datasets, prototyping transformations, debugging, and ad-hoc analysis. Once the logic is ready, users can operationalize it as Apache Spark jobs for production pipelines. To get started, users create a notebook cluster in the Onehouse console, where they can control costs, and then open Jupyter notebooks that come pre-configured with PySpark to build their data lakehouse.
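The notebook-to-job workflow described above can be sketched in plain PySpark. This is a minimal illustration, not Onehouse's own API: the function names (`clean_events`, `parse_args`, `run`), the column names, and the Parquet input and output paths are all hypothetical. It assumes only the standard open-source PySpark DataFrame API.

```python
import argparse


def clean_events(df):
    """Transformation prototyped cell by cell in a notebook.

    The column names (event_id, event_ts, event_date) are hypothetical.
    """
    # Imported lazily so this module loads even where PySpark is absent.
    from pyspark.sql import functions as F

    return (
        df.dropDuplicates(["event_id"])            # drop repeated events
        .filter(F.col("event_ts").isNotNull())     # discard rows with no timestamp
        .withColumn("event_date", F.to_date("event_ts"))  # derive a date column
    )


def parse_args(argv):
    """Parse the job's input and output locations (hypothetical flags)."""
    parser = argparse.ArgumentParser(description="Clean raw events")
    parser.add_argument("--input", required=True, help="source Parquet path")
    parser.add_argument("--output", required=True, help="destination Parquet path")
    return parser.parse_args(argv)


def run(argv):
    """Job entry point: the same logic as the notebook, run as a batch job."""
    from pyspark.sql import SparkSession

    args = parse_args(argv)
    spark = SparkSession.builder.appName("clean-events").getOrCreate()
    raw = spark.read.parquet(args.input)
    clean_events(raw).write.mode("overwrite").parquet(args.output)
    spark.stop()
```

In a notebook cell, one would typically call `clean_events(df).show(5)` on a small sample and adjust the logic until the output looks right. The same function can then be shipped unchanged in a script that calls `run(sys.argv[1:])`, so the interactive and production code paths share one transformation.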