Company
Date Published
Author
Baseten
Word count
497
Language
English
Hacker News points
None

Summary

This week, we overhauled the model management experience on Baseten, improving several core workflows to clarify the model lifecycle. We also shipped a new text embedding model that matches OpenAI's ada-002 in both context window size and benchmark performance. The overhaul improves deployment, scaling, and cost-effectiveness, and includes workspace API keys with more granular permissions and separate measurements for end-to-end response time and inference time. We have also refreshed our documentation with guides on model inference, autoscaling, monitoring, and instance types. Finally, we are hosting a fireside panel on the state of open source ML in San Francisco and will be attending AWS re:Invent in NVIDIA's generative AI pavilion.