Video: Scoring Machine Learning Models at Scale

Post Details

Company

SingleStore

Date Published

May 3, 2017

Author

Mason Hooten

Word Count

267

Language

English

Hacker News Points

-

Source URL

www.singlestore.com/blog/video-scoring-machine-learning-models-at-scale

Summary

At Strata+Hadoop World, SingleStore Software Engineer John Bowler shared two ways of building production data pipelines in SingleStore: one using Spark for general-purpose computation, the other using a transform defined in a SingleStore pipeline. He ran a live demonstration of SingleStore and Apache Spark for entity resolution and fraud detection across a large dataset, leveraging SingleStore's native geospatial capabilities to reduce network overhead. John then used SingleStore Pipelines and TensorFlow to write a machine learning Python script that accurately identified handwritten numbers after training the model in seconds. The demonstrations showcased the performance benefits of combining SingleStore with popular open-source libraries, such as Duke for entity resolution. The presentation provided a 79-page guide on designing, building, and deploying Spark applications using the SingleStore Spark Connector, along with code samples and performance recommendations for production-ready Apache Spark and SingleStore implementations.