Route specialized workloads
Blog post from Temporal
Modern applications run workloads with very different resource requirements. ML/AI, video processing, and data analytics often need specialized hardware such as GPUs, encoding hardware, or memory-optimized instances. Running all activities on the same type of worker is inefficient and costly: it wastes resources, creates environment conflicts, and complicates scaling.

The proposed solution is to use separate Task Queues to route activities according to their resource needs, with dedicated worker pools for GPU-intensive, CPU-intensive, high-memory, and specialized-hardware workloads. GPU instances then handle only the ML tasks that need them, while standard activities run on cost-effective CPU instances, which the post estimates could reduce infrastructure costs by 60-80%.

Separate pools also improve resource efficiency. GPU workers can be limited to a small number of concurrent activities while CPU workers run at much higher concurrency, and each pool scales independently with the demand for its workload. Keeping ML dependencies out of the standard worker environment isolates the two, which simplifies deployments and prevents library conflicts.

The post is aimed at Temporal Workflow & Activity developers, platform operators, ML/AI engineers, DevOps, and FinOps teams. Implementing the approach requires specific infrastructure, including GPU instances and container orchestration.
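The routing idea described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the post: the Task Queue names, the activity names, and the concurrency limits are all hypothetical, and the sketch uses only the standard library to model the mapping from an activity's resource class to a dedicated worker pool.

```python
from dataclasses import dataclass
from enum import Enum


class ResourceClass(Enum):
    """Resource categories used to decide which worker pool runs an activity."""
    GPU = "gpu"
    CPU = "cpu"
    HIGH_MEMORY = "high-memory"


@dataclass(frozen=True)
class WorkerPool:
    """One dedicated pool: the Task Queue it polls and its concurrency limit."""
    task_queue: str
    max_concurrent_activities: int


# Hypothetical pools. The numbers are illustrative assumptions only:
# GPU workers run few activities at once, CPU workers run many.
WORKER_POOLS = {
    ResourceClass.GPU: WorkerPool("gpu-activities", 2),
    ResourceClass.CPU: WorkerPool("cpu-activities", 100),
    ResourceClass.HIGH_MEMORY: WorkerPool("high-memory-activities", 4),
}

# Hypothetical activity names mapped to the resources they need.
ACTIVITY_RESOURCES = {
    "run_inference": ResourceClass.GPU,
    "aggregate_dataset": ResourceClass.HIGH_MEMORY,
    "send_notification": ResourceClass.CPU,
}


def task_queue_for(activity_name: str) -> str:
    """Return the Task Queue for an activity, defaulting to the CPU pool."""
    resource = ACTIVITY_RESOURCES.get(activity_name, ResourceClass.CPU)
    return WORKER_POOLS[resource].task_queue


if __name__ == "__main__":
    for name in ("run_inference", "aggregate_dataset", "send_notification"):
        print(f"{name} -> {task_queue_for(name)}")
```

With the Temporal Python SDK, one plausible way to apply this is to pass the returned name as the `task_queue` argument when a Workflow schedules the activity, and to start each pool's Worker with the matching `task_queue` and `max_concurrent_activities` values. Defaulting unknown activities to the CPU pool is a design choice made here so that nothing lands on expensive GPU instances by accident.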