Company
Date Published
Author
Philip Kiely
Word count
597
Language
English
Hacker News points
None

Summary

When deploying a packaged machine learning model to a cloud service like AWS, choosing the right instance size is crucial: the instance must be able to handle your model while minimizing compute cost. The decision comes down to two factors: CPU versus GPU, and memory size. Models can be served on either a CPU or a GPU; GPUs are more powerful but also more expensive. If your model can run on a GPU and invocation speed matters, select an instance with an attached GPU; otherwise, stick with a less expensive CPU instance. The second decision is memory size, which should be based on the size of your model weights file plus any other files that must be loaded into memory. Weighing these two factors lets you select the most suitable instance type for your deployment while keeping costs down.
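As a rough sketch of the sizing heuristic described above, the helpers below estimate required memory from the on-disk size of a weights file and pick the cheapest matching instance. The overhead multiplier and the instance catalog are illustrative assumptions, not real AWS specs:

```python
import os

def estimate_memory_gb(weights_path: str, overhead_factor: float = 1.5) -> float:
    """Estimate instance memory needed to serve a model.

    Uses the on-disk size of the weights file plus a safety
    multiplier (an assumed value) to cover other files loaded
    into memory alongside the weights.
    """
    weights_gb = os.path.getsize(weights_path) / (1024 ** 3)
    return weights_gb * overhead_factor

def pick_instance(needs_gpu: bool, memory_gb: float) -> str:
    """Pick the first (cheapest) instance that fits.

    The catalog entries (name, has_gpu, memory in GB) are
    hypothetical and sorted by cost, smallest first.
    """
    catalog = [
        ("cpu.small", False, 8),
        ("cpu.large", False, 32),
        ("gpu.small", True, 16),
        ("gpu.large", True, 64),
    ]
    for name, has_gpu, mem in catalog:
        if has_gpu == needs_gpu and mem >= memory_gb:
            return name
    raise ValueError("no instance in catalog is large enough")
```

For example, a model whose weights need roughly 10 GB in memory and that does not benefit from a GPU would map to the cheaper `cpu.large` entry rather than any GPU instance.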