A common grumble among data science and machine learning practitioners is that putting a model into production is difficult; one widely cited estimate holds that 87% of models never reach production, often because teams lack the knowledge to deploy them effectively. Addressing this starts with weighing the technical considerations and pitfalls involved in choosing an ML stack and tooling for deployment.

Your laptop can be a perfectly good development environment, provided the same code runs with minimal changes in staging and production. Choosing a suitable programming language, such as Python, and an ML framework, such as PyTorch or TensorFlow, matters for the same reason. A feature store that manages precomputed and cleansed features keeps training and serving consistent, which is vital for model accuracy.

A model serving framework should be framework-agnostic and should support business logic alongside the model, model replication, request batching, high concurrency, and low latency, with a CLI and APIs for deployment. Models in production also tend to follow one of four ML patterns: pipeline, ensemble, business logic, and online learning, and each pattern deserves its own deployment consideration.

Finally, model monitoring is critical to giving the model a viable afterlife, since data drift, concept changes, failures, and system degradation all occur over time. Evaluating tools such as Seldon, KFServing, Evidently.ai, Arize.ai, Arthur.ai, Fiddler.ai, Valohai.com, or whylabs.ai against these requirements is an essential step toward successful deployment. The sketches below illustrate three of these points in code: feature retrieval from a feature store, request batching in a serving layer, and drift monitoring.
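As one illustration of the feature store idea, the snippet below sketches an online feature lookup with Feast, an open-source feature store (not among the tools named above). The repository path, the feature view "driver_hourly_stats", and the "driver_id" entity are hypothetical placeholders, not details from the article.

```python
from feast import FeatureStore

# Hypothetical Feast repo in the current directory; the feature view and
# entity names below are illustrative assumptions, not from the article.
store = FeatureStore(repo_path=".")

# Fetch precomputed, cleansed features for one entity at serving time, so the
# model sees the same values that were used during training.
features = store.get_online_features(
    features=["driver_hourly_stats:avg_daily_trips"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
print(features)
```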
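To make the serving requirements concrete, here is a minimal, framework-agnostic sketch of request batching: incoming requests are queued, and the model is invoked once per micro-batch, trading a few milliseconds of latency for much higher throughput. The BatchingServer class and its parameters are assumptions for illustration, not any particular framework's API; servers such as Seldon or KFServing implement this far more robustly.

```python
import asyncio
from typing import Any, Callable, List


class BatchingServer:
    """Toy serving layer: queue requests, run the model on micro-batches."""

    def __init__(self, predict_fn: Callable[[List[Any]], List[Any]],
                 max_batch_size: int = 32, max_wait_ms: float = 5.0):
        self.predict_fn = predict_fn      # any batch callable: PyTorch, TF, sklearn...
        self.max_batch_size = max_batch_size
        self.max_wait = max_wait_ms / 1000.0
        self.queue: asyncio.Queue = asyncio.Queue()

    async def predict(self, payload: Any) -> Any:
        """Called once per request; resolves when its batch has run."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((payload, fut))
        return await fut

    async def batch_loop(self) -> None:
        while True:
            batch = [await self.queue.get()]
            deadline = asyncio.get_running_loop().time() + self.max_wait
            # Collect more requests until the batch fills or the deadline passes.
            while len(batch) < self.max_batch_size:
                timeout = deadline - asyncio.get_running_loop().time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            inputs = [payload for payload, _ in batch]
            outputs = self.predict_fn(inputs)   # one model call for the whole batch
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)


async def main() -> None:
    # Stand-in model that doubles each input; swap in any framework's batch predict.
    server = BatchingServer(predict_fn=lambda xs: [x * 2 for x in xs])
    worker = asyncio.create_task(server.batch_loop())
    results = await asyncio.gather(*(server.predict(i) for i in range(10)))
    print(results)
    worker.cancel()


asyncio.run(main())
```

The max_wait_ms knob is the key design choice here: a longer wait yields fuller batches and better hardware utilization, at the cost of tail latency.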
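And as a taste of what monitoring tools such as Evidently.ai or whylabs.ai automate, the sketch below flags per-feature data drift with a two-sample Kolmogorov-Smirnov test from SciPy. The drift_report function, the significance threshold, and the simulated data are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import ks_2samp


def drift_report(reference: np.ndarray, live: np.ndarray,
                 feature_names: list, alpha: float = 0.01) -> dict:
    """Compare each live feature column against its training-time reference
    distribution with a two-sample Kolmogorov-Smirnov test."""
    report = {}
    for i, name in enumerate(feature_names):
        stat, p_value = ks_2samp(reference[:, i], live[:, i])
        report[name] = {
            "ks_stat": round(float(stat), 4),
            "p_value": float(p_value),
            "drifted": p_value < alpha,   # small p-value => distributions differ
        }
    return report


# Simulated example: the second "live" feature has drifted (shifted mean).
rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, size=(5000, 2))
live = np.column_stack([rng.normal(0.0, 1.0, 5000),
                        rng.normal(0.8, 1.0, 5000)])
print(drift_report(ref, live, ["feature_a", "feature_b"]))
```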