Deployment Shapes: One-Click Deployment Configured For You

Post Details

Company

Fireworks AI

Date Published

Oct. 24, 2025

Author

-

Word Count

875

Language

English

Hacker News Points

-

Source URL

fireworks.ai/blog/deployment-shapes

Summary

Fireworks has introduced Deployment Shapes, pre-configured templates that streamline how developers configure serving setups for large language models (LLMs). Each shape optimizes a deployment for latency, throughput, or cost while balancing the other two factors to suit a given use case. Users can start with serverless deployments, which are easy to use but may not be optimal for high-volume workloads, or opt for on-demand deployments, which are single-tenant and customizable. Fireworks applies techniques such as speculative decoding and caching to improve inference speed and efficiency, and continues to improve its GPU kernels and configurations to keep performance current. Deployment Shapes are available through both the Fireworks website and the CLI, and the company offers additional customization support for enterprise customers seeking further optimization.
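
The choice described in the summary can be sketched in a few lines of Python. This is only an illustration of the idea, not the Fireworks API: the shape names, the `Goal` enum, the `CATALOG` list, and the `pick_shape` and `deployment_mode` helpers are all hypothetical and do not correspond to real Fireworks identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class Goal(Enum):
    """The three optimization targets named in the summary."""
    LATENCY = "latency"
    THROUGHPUT = "throughput"
    COST = "cost"


@dataclass(frozen=True)
class Shape:
    name: str            # hypothetical label, not a real Fireworks shape name
    optimized_for: Goal  # the factor this preset prioritizes


# Hypothetical catalog with one preset per optimization goal.
CATALOG = [
    Shape("example-fast", Goal.LATENCY),
    Shape("example-throughput", Goal.THROUGHPUT),
    Shape("example-minimal", Goal.COST),
]


def pick_shape(goal: Goal, catalog=CATALOG) -> Shape:
    """Return the first preset whose primary goal matches the request."""
    for shape in catalog:
        if shape.optimized_for is goal:
            return shape
    raise LookupError(f"no shape optimized for {goal.value}")


def deployment_mode(high_volume: bool) -> str:
    """Assumed rule of thumb: serverless to start, on-demand for high volume."""
    return "on-demand" if high_volume else "serverless"


if __name__ == "__main__":
    print(deployment_mode(high_volume=True), pick_shape(Goal.LATENCY).name)
```

The point of the sketch is that a shape bundles a serving configuration behind a single named goal, so the caller states what to optimize for and does not tune individual settings by hand.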