Company:
Date Published:
Author: Lucia Cerchie, Kai Waehner, Josep Prat
Word count: 2338
Language: English
Hacker News points: None

Summary

Building a scalable, reliable machine learning infrastructure is a complex task that extends far beyond creating analytic models in Python. The blog post discusses the challenges of integrating the many components involved, using Uber's Michelangelo platform as an example: it initially relied on Apache Spark and Java but later expanded to support Python models and frameworks such as PyTorch and TensorFlow. The Apache Kafka ecosystem is presented as a way to resolve the impedance mismatch between data scientists, data engineers, and production engineers, providing a scalable and reliable system for data ingestion, processing, and model deployment. By integrating Kafka with tools like KSQL and Python environments such as Jupyter notebooks, data scientists can perform interactive data analysis and preprocessing while inheriting Kafka's scalability and reliability when moving to production. The post argues that resolving these integration challenges is essential to unlocking real business value from machine learning projects, and that Kafka and Python are complementary technologies that combine well across a range of machine learning workflows.
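The Kafka-plus-Python pattern the summary describes can be sketched as a consumer loop that feeds events into a Python preprocessing step. This is a minimal illustration, not code from the post: it assumes the `confluent-kafka` client, a hypothetical topic name `sensor-events`, and a toy feature-engineering function.

```python
# Sketch: streaming events from Kafka into Python ML preprocessing.
# Assumes the confluent-kafka package (pip install confluent-kafka);
# the topic and field names below are illustrative.
import json

def preprocess(record: dict) -> dict:
    """Toy feature engineering: clamp a raw reading into [0, 1]."""
    raw = record["reading"]
    return {"feature": max(0.0, min(1.0, raw / 100.0))}

def consume_loop(broker: str = "localhost:9092") -> None:
    """Consume JSON events and preprocess them (requires a running broker)."""
    from confluent_kafka import Consumer
    consumer = Consumer({
        "bootstrap.servers": broker,
        "group.id": "ml-preprocessing",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["sensor-events"])
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            features = preprocess(json.loads(msg.value()))
            print(features)  # hand off to model training/scoring here
    finally:
        consumer.close()

if __name__ == "__main__":
    # Without a broker available, just demonstrate the preprocessing step.
    print(preprocess({"reading": 42.0}))
```

The same `preprocess` function could run interactively in a Jupyter notebook against sample data and then unchanged inside the production consumer, which is the data-scientist/production-engineer handoff the post highlights.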