Faster inference enables up to 5x price reduction on Together API

Post Details

Company

Together AI

Date Published

Aug. 11, 2023

Author

Together

Word Count

379

Language

English

Hacker News Points

-

Source URL

www.together.ai/blog/august-2023-pricing-update

Summary

Together AI has simplified pricing for its cloud platform and cut prices on the Together API by up to 5x. The reductions follow inference optimizations that let each GPU process more transactions, which improves cost efficiency, and the company has released additional optimizations that speed up inference further. Users can also launch their own inference VMs for the models they use, which keeps their data private; in that case they pay only for requests plus an hourly hosting fee for the VM. The updated pricing applies to a range of open-source AI models, including RedPajama, Llama 2, Falcon, and more. Together AI's stated aim is to give users more for less, so they can build and run AI models quickly and efficiently.
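
The inference-VM billing described above (a charge for requests plus an hourly hosting fee) can be sketched as simple arithmetic. This is only an illustration: the function name, the assumption that requests are billed at a flat per-request rate, and every rate passed in are hypothetical placeholders, not Together AI's actual prices or billing units.

```python
def estimate_vm_cost(num_requests, hours_hosted, fee_per_request, hourly_hosting_fee):
    """Estimate the total cost of a dedicated inference VM.

    Assumes (hypothetically) a flat fee per request plus a flat hourly
    hosting fee. All rates are supplied by the caller; none are real prices.
    """
    if min(num_requests, hours_hosted, fee_per_request, hourly_hosting_fee) < 0:
        raise ValueError("all inputs must be non-negative")
    usage_cost = num_requests * fee_per_request        # pay-per-request portion
    hosting_cost = hours_hosted * hourly_hosting_fee   # hourly VM hosting portion
    return usage_cost + hosting_cost


if __name__ == "__main__":
    # Placeholder rates chosen purely for illustration.
    total = estimate_vm_cost(
        num_requests=1000,
        hours_hosted=2,
        fee_per_request=0.5,
        hourly_hosting_fee=10,
    )
    print(f"Estimated cost: {total}")
```

Separating the usage and hosting terms makes it easy to see which one dominates at a given request volume; consult the published pricing for the real rates and billing units.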