
What is mixture of experts (MoE)?

Blog post from Zapier

Post Details
Company: Zapier
Date Published: -
Author: Harry Guinness
Word Count: 1,440
Language: English
Hacker News Points: -
Summary

Mixture of Experts (MoE) is a machine learning architecture in which a model is composed of multiple specialized sub-models, or "experts," plus a routing network that decides which experts to activate for a given input. This contrasts with dense models, which activate every parameter for every input and therefore pay the full computational cost on each forward pass. MoE architectures, increasingly used in large language models such as Llama and DeepSeek, offer more efficient inference because only a fraction of the parameters are active per input, though training them demands substantial computational resources because of their size and the added complexity of routing. Despite those training challenges, MoE models can be cheaper to run than comparably capable dense models, especially at large scale, because they trade a large total parameter count for a small active parameter count at inference time. The catch is memory: all the experts must be held in memory even though only a few run per input, which makes MoE models a poor fit for low-powered local devices. As AI technology advances, both open and proprietary models are expected to adopt MoE architectures more widely, given their balance of efficiency and capability.
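The routing idea described above can be sketched in a few lines. This is a minimal toy example, not how Llama or DeepSeek actually implement MoE: the `ToyMoELayer` class, its dimensions, and the plain linear "experts" are all illustrative assumptions, and real models use learned routers, load-balancing losses, and per-token routing inside transformer blocks.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the router scores
    e = np.exp(x - np.max(x))
    return e / e.sum()

class ToyMoELayer:
    """Illustrative sparse MoE layer (hypothetical, for explanation only):
    a router scores the experts for an input, and only the top-k experts
    run, so most of the layer's parameters stay inactive per input."""
    def __init__(self, dim, n_experts, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.standard_normal((dim, n_experts))  # gating weights
        self.experts = [rng.standard_normal((dim, dim)) for _ in range(n_experts)]
        self.top_k = top_k

    def forward(self, x):
        scores = softmax(x @ self.router)          # router's probability per expert
        chosen = np.argsort(scores)[-self.top_k:]  # indices of the top-k experts
        # Weighted sum over the selected experts only (sparse activation):
        # the other experts' weights are never touched for this input.
        out = sum(scores[i] * (x @ self.experts[i]) for i in chosen)
        return out, sorted(chosen.tolist())

layer = ToyMoELayer(dim=8, n_experts=4, top_k=2)
x = np.ones(8)
out, active = layer.forward(x)
print(active)  # only 2 of the 4 experts ran for this input
```

The point of the sketch is the cost asymmetry the summary describes: all four expert matrices must sit in memory, but each forward pass multiplies through only two of them, which is why MoE inference is cheap in compute but expensive in memory.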