Company
Predibase
Date Published
Author
Will Van Eaton and Abhay Malik
Word count
338
Language
English
Hacker News points
None

Summary

Predibase has introduced several updates aimed at improving user experience and performance. OpenAI API-compatible endpoints, accessible via the Python SDK or REST API, let teams migrate from OpenAI by changing only a few lines of code. The free trial has been restructured into a 30-day period with $25 in credits, enough to fine-tune models like Llama-2-70b on advanced hardware at no cost. A revamped LLM Inference Engine has been optimized to cut latency by more than 100x, improving response times for serverless and dedicated deployments, and a new /generate_stream endpoint streams responses as they are generated. API token management has also been enhanced: users can now expire and delete tokens and monitor token usage. Finally, Predibase has expanded its deployment options with a self-serve workflow for deploying into Microsoft Azure Virtual Private Clouds, complementing its existing AWS support and giving users greater control over their infrastructure and data.
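
For illustration, the sketch below shows the kind of change the OpenAI-compatible endpoints imply and a streaming call against the new /generate_stream endpoint. It is a minimal sketch, not a documented integration: it assumes the standard openai and requests Python packages, and the base URL, environment variable, model identifier, and request payload shape are placeholders rather than values confirmed by the announcement.

```python
"""Minimal sketch of the two new API surfaces, under stated assumptions."""
import os

import requests
from openai import OpenAI

API_TOKEN = os.environ["PREDIBASE_API_TOKEN"]   # placeholder env var name
BASE_URL = "https://serving.example.com/v1"     # placeholder base URL

# 1) OpenAI-compatible endpoint: an existing OpenAI integration only needs
#    its base_url and api_key swapped; the call itself stays the same.
client = OpenAI(base_url=BASE_URL, api_key=API_TOKEN)
chat = client.chat.completions.create(
    model="llama-2-70b",                        # placeholder model identifier
    messages=[{"role": "user", "content": "Write a haiku about fine-tuning."}],
)
print(chat.choices[0].message.content)

# 2) /generate_stream: receive the response as a stream instead of waiting
#    for the full completion. The JSON payload shown here is an assumption,
#    not the documented schema.
resp = requests.post(
    f"{BASE_URL}/generate_stream",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"inputs": "Write a haiku about fine-tuning.",
          "parameters": {"max_new_tokens": 64}},
    stream=True,
)
for line in resp.iter_lines():
    if line:                                    # skip keep-alive blank lines
        print(line.decode("utf-8"))
```

In practice the streamed lines would typically be parsed into JSON chunks (for example, server-sent events) before use; the loop above just prints the raw lines to keep the sketch short.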