Company
Date Published
Author
Together
Word count
379
Language
English
Hacker News points
None

Summary

Together AI has simplified pricing for inference on its cloud platform and released additional optimizations that speed up inference further. The updates center on processing more transactions per GPU, which improves cost efficiency and allows lower prices alongside faster performance. Users can also launch their own inference VMs for the models they use, which keeps their data private; in that case they pay only for requests plus an hourly hosting fee. The platform offers a range of open-source AI models, including RedPajama, Llama 2, and Falcon, all of which can be used under the updated pricing. Together AI frames the update as giving users more for less when building and running fast AI models.
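The billing model described for private inference VMs (an hourly hosting fee plus a charge for requests) can be sketched as simple arithmetic. The following is a minimal illustration only: the class name, function names, and every rate are hypothetical placeholders, not Together AI's published prices or API, and it assumes the request charge is a flat per-request amount.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmPricing:
    """Hypothetical rates for a private inference VM (placeholders only)."""
    hourly_hosting_fee: float  # currency units per hour the VM is running
    price_per_request: float   # currency units per request served (assumed flat)


def estimate_cost(pricing: VmPricing, hours: float, requests: int) -> float:
    """Total cost = hosting fee for the hours used + per-request charges."""
    if hours < 0 or requests < 0:
        raise ValueError("hours and requests must be non-negative")
    return pricing.hourly_hosting_fee * hours + pricing.price_per_request * requests


def cost_per_request(pricing: VmPricing, hours: float, requests: int) -> float:
    """Effective cost of one request once the hosting fee is spread over all requests."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return estimate_cost(pricing, hours, requests) / requests


if __name__ == "__main__":
    # Made-up numbers chosen only to show the shape of the calculation.
    pricing = VmPricing(hourly_hosting_fee=2.0, price_per_request=0.001)
    for served in (5_000, 10_000):
        total = estimate_cost(pricing, hours=10, requests=served)
        each = cost_per_request(pricing, hours=10, requests=served)
        print(f"{served} requests in 10 h: total {total:.2f}, per request {each:.4f}")
```

Under these assumptions the hosting fee is a fixed cost per hour, so serving more requests in the same number of hours lowers the effective cost of each request. That is the sense in which higher throughput per GPU translates into better cost efficiency.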