GPT-4.5 Multimodal and Vision Analysis

Post Details

Company

Roboflow

Date Published

Feb. 28, 2025

Author

James Gallagher

Word Count

1,496

Language

English

Hacker News Points

-

Source URL

blog.roboflow.com/gpt-4-5-multimodal

Summary

On February 27, 2025, OpenAI announced GPT-4.5, initially available to Pro users through its web application and API, with broader access planned. The multimodal language model gives more direct and concise responses than other OpenAI models such as o1 and o3-mini. It performs well on object counting, visual question answering (VQA), and document OCR, but struggles with object detection and localization, a common weakness of foundation models. GPT-4.5 is also far more expensive than its predecessors: $75 per one million input tokens, compared to $2.50 for GPT-4o. The post expects that either future models will be more affordable or that prices will fall over time. OpenAI describes GPT-4.5 as a step towards integrating reasoning capabilities, an area expected to develop in future model iterations.
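
To make the price gap concrete, here is a minimal Python sketch of the arithmetic, using only the two per-million-token figures quoted in the summary. The function names, the model keys, and the 200,000-token workload are hypothetical and chosen for illustration. The sketch also assumes a single flat rate per token, so it ignores any difference between input and output pricing.

```python
# USD per one million tokens, taken from the summary above.
# The dictionary keys are illustrative labels, not official API model IDs.
PRICE_PER_MILLION_TOKENS = {
    "gpt-4.5": 75.00,
    "gpt-4o": 2.50,
}


def estimate_cost(model: str, tokens: int) -> float:
    """Return the estimated cost in USD for `tokens` tokens at a flat rate."""
    return PRICE_PER_MILLION_TOKENS[model] * tokens / 1_000_000


def price_ratio(model_a: str, model_b: str) -> float:
    """Return how many times more expensive model_a is than model_b."""
    return PRICE_PER_MILLION_TOKENS[model_a] / PRICE_PER_MILLION_TOKENS[model_b]


if __name__ == "__main__":
    workload = 200_000  # hypothetical token count for a batch of image prompts
    for name in PRICE_PER_MILLION_TOKENS:
        print(f"{name}: ${estimate_cost(name, workload):.2f}")
    print(f"ratio: {price_ratio('gpt-4.5', 'gpt-4o'):.0f}x")
```

At these rates the same workload costs 30 times more on GPT-4.5 than on GPT-4o, which is the gap behind the summary's expectation that prices will need to come down.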