Getting started with MongoDB, PySpark, and Jupyter Notebook

Blog post from MongoDB

Post Details
Company: MongoDB
Date Published:
Author: Robert Walters
Word Count: 1,264
Language: English
Hacker News Points: -
Summary

This post walks through a JupyterLab notebook that works with MongoDB data through PySpark, the Python API for Apache Spark, an open-source, general-purpose cluster-computing framework for large-scale data processing. The notebook loads financial security data from MongoDB using the MongoDB Spark Connector, calculates a moving average of the security's price, and writes the result back to MongoDB. The environment consists of a MongoDB cluster, an Apache Spark deployment, and JupyterLab, which together support ad-hoc queries from the notebook. The example shows how the MongoDB Connector for Spark integrates MongoDB data into a Spark data science application.
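
The load, calculate, and write-back flow described in the summary could look roughly like the sketch below. It is not the code from the post. It assumes version 10.x of the MongoDB Spark Connector (the `mongodb` format name and the `spark.mongodb.*.connection.uri` settings) and that the connector package is already on the Spark classpath. The database name `Stocks`, the collection name `StockData`, the field names `symbol`, `tx_time`, and `price`, and the five-row window are all hypothetical placeholders. A small pure-Python helper is included so the window arithmetic can be checked without a Spark cluster.

```python
from collections import deque


def trailing_moving_average(prices, window):
    """Average each price with up to window-1 prices that precede it."""
    recent = deque(maxlen=window)  # keeps only the last `window` prices
    averages = []
    for price in prices:
        recent.append(price)
        averages.append(sum(recent) / len(recent))
    return averages


def add_moving_average(df, window=5):
    """Apply the same trailing average to a Spark DataFrame, per symbol."""
    # Imported lazily so the helper above runs without PySpark installed.
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    # Hypothetical field names: "symbol", "tx_time", "price".
    spec = (
        Window.partitionBy("symbol")
        .orderBy("tx_time")
        .rowsBetween(-(window - 1), 0)  # current row plus the preceding rows
    )
    return df.withColumn("moving_average", F.avg("price").over(spec))


def run(uri):
    """Load from MongoDB, add the moving average, and write it back."""
    from pyspark.sql import SparkSession

    # Assumes connector 10.x is available to Spark (for example via --packages).
    spark = (
        SparkSession.builder.appName("moving-average-sketch")
        .config("spark.mongodb.read.connection.uri", uri)
        .config("spark.mongodb.write.connection.uri", uri)
        .getOrCreate()
    )

    # Hypothetical database and collection names.
    df = (
        spark.read.format("mongodb")
        .option("database", "Stocks")
        .option("collection", "StockData")
        .load()
    )

    result = add_moving_average(df)

    (
        result.write.format("mongodb")
        .mode("append")
        .option("database", "Stocks")
        .option("collection", "StockData")
        .save()
    )
    spark.stop()
```

Calling `run` with a connection string for your own cluster would execute the full round trip. `trailing_moving_average([10, 20, 30, 40], 2)` shows the same window logic on a plain list.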