The artificial intelligence (AI) industry has grown rapidly since 2016, driven by advances in GPU technology and the need for faster model training. The focus has since shifted toward deploying AI models to production and managing the entire AI lifecycle. A critical step in this process is AI serving: deploying a trained model, with inference typically executed by an AI inference engine, so that it can respond to requests. Achieving fast end-to-end inferencing and serving requires addressing several challenges, including optimizing AI processing, running the AI inference platform where the data lives, and using a purpose-built serverless platform. By overcoming these challenges, businesses can benefit from running AI on dedicated inference chipsets and deliver a seamless user experience even when the inference engine itself is a potential bottleneck.
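To make the serving step concrete, below is a minimal sketch of a model deployed behind an HTTP inference endpoint, using only the Python standard library. The /predict route, payload shape, and run_inference stand-in are illustrative assumptions, not a specific product's API; a trivial function plays the role of a real inference engine here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_inference(features):
    # Hypothetical stand-in for a real inference engine session
    # (e.g. ONNX Runtime or TensorRT): sum the inputs and threshold.
    score = sum(features)
    return {"score": score, "label": int(score > 0)}

class InferenceHandler(BaseHTTPRequestHandler):
    """Serves a deployed model behind a simple HTTP endpoint."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        # Read and decode the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = run_inference(payload["features"])
        body = json.dumps(result).encode()
        # Return the prediction as a JSON response.
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # A bare HTTPServer keeps the sketch self-contained; in production
    # this endpoint would sit behind a serverless or autoscaling layer.
    HTTPServer(("0.0.0.0", 8080), InferenceHandler).serve_forever()
```

A client could then request a prediction with, for example, curl -X POST http://localhost:8080/predict -d '{"features": [0.2, 0.5]}'. The point of the sketch is the shape of the serving step itself: a deployed model waiting behind an endpoint, with the inference engine doing the work per request.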