Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Blog post from Google Cloud

Post Details
Company: Google Cloud
Author: Logan Kilpatrick and Shrestha Basu Mallick
Word Count: 823
Language: English
Hacker News Points: -
Summary

Google has released two updated, production-ready Gemini models, Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, with improved performance and lower cost. The update cuts Gemini 1.5 Pro prices by more than 50% for both input and output tokens on prompts under 128K tokens, raises rate limits, speeds up output, and reduces latency. Together these changes make the models more cost-effective for tasks such as synthesizing information from long PDFs, answering complex code-related questions, and video content creation. The models perform better on math, long-context, and vision tasks, with gains on benchmarks such as MMLU-Pro, MATH, and HiddenMath, as well as in visual understanding and Python code generation. Default responses are also more concise for tasks like summarization and question answering, while content safety standards are maintained. Google also announced a new experimental model, Gemini-1.5-Flash-8B-Exp-0924, which is intended to improve performance across a range of use cases and is available through Google AI Studio and the Gemini API. According to the post, these changes reflect developer feedback and a smoother path from experimental to production releases.
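
To make the pricing rule in the summary concrete, here is a minimal sketch of how a per-request cost could be estimated when a discounted rate applies only to prompts under 128K tokens. The rate values, the `Rates` class, and the `request_cost` function are hypothetical illustrations, not Google's published prices or API, and reading "128K" as 128,000 tokens is an assumption; consult the official pricing page for real figures.

```python
from dataclasses import dataclass

# Assumption: "128K" is read as 128,000 tokens.
SHORT_PROMPT_LIMIT = 128_000


@dataclass(frozen=True)
class Rates:
    """Price per one million tokens, in arbitrary currency units."""
    input_per_million: float
    output_per_million: float


def request_cost(prompt_tokens: int, output_tokens: int,
                 long_prompt: Rates, short_prompt: Rates) -> float:
    """Estimate the cost of one request.

    The short-prompt rates apply to both input and output tokens
    when the prompt is under the limit; otherwise the long-prompt
    rates apply.
    """
    rates = short_prompt if prompt_tokens < SHORT_PROMPT_LIMIT else long_prompt
    total = (prompt_tokens * rates.input_per_million
             + output_tokens * rates.output_per_million)
    return total / 1_000_000


# Placeholder rates only (NOT real prices): the short-prompt rates
# are 60% lower, to mimic an "over 50%" reduction.
LONG_PROMPT_RATES = Rates(input_per_million=10.0, output_per_million=30.0)
SHORT_PROMPT_RATES = Rates(input_per_million=4.0, output_per_million=12.0)

if __name__ == "__main__":
    short = request_cost(100_000, 10_000, LONG_PROMPT_RATES, SHORT_PROMPT_RATES)
    long_ = request_cost(200_000, 10_000, LONG_PROMPT_RATES, SHORT_PROMPT_RATES)
    print(f"short prompt: {short:.2f}, long prompt: {long_:.2f}")
```

With these placeholder rates, a 100,000-token prompt is billed at the discounted tier and a 200,000-token prompt at the undiscounted one, so the threshold matters as much as the headline percentage when budgeting long-context workloads.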