
Building a Vehicle Analytics Application with PaliGemma

Blog post from Roboflow

Post Details
Company: Roboflow
Author: Nick Herrig
Word Count: 1,139
Language: English
Summary

In June 2024, Nick Herrig detailed the creation of a vehicle analytics application built on PaliGemma, Google's new Vision Language Model (VLM), which has three billion parameters, license terms that permit commercial use, and support for fine-tuning. The application detects vehicles, analyzes their make, color, and style, reads license plates, and stores the results in a CSV file. Herrig uses a fine-tuned YOLOv8 model for detection together with Roboflow, ByteTrack, and Supervision, and demonstrates the application running in real time, self-hosted on his own hardware. PaliGemma simplifies the pipeline by handling both classification and optical character recognition (OCR), which reduces the need for separate models. The application runs efficiently on an NVIDIA RTX 3090, showing the potential of VLMs for both enterprise and hobbyist computer vision projects. Herrig emphasizes the role of Roboflow's tools and expects continued advances in VLMs and hardware to expand the range of possible applications.
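The final step described above, storing one row per vehicle in a CSV file, could be sketched roughly as follows. This is a minimal illustration and not the code from the post: the `VehicleRecord` fields, the helper names `collect_unique` and `write_csv`, and the sample values are all hypothetical. It assumes that a tracker such as ByteTrack has already assigned each vehicle a stable integer ID and that the VLM's answers arrive as plain strings.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Dict, Iterable, TextIO


@dataclass
class VehicleRecord:
    # Hypothetical schema; the post's actual CSV columns are not given here.
    tracker_id: int
    make: str
    color: str
    style: str
    license_plate: str


def collect_unique(records: Iterable[VehicleRecord]) -> Dict[int, VehicleRecord]:
    """Keep only the first record seen for each tracker ID."""
    unique: Dict[int, VehicleRecord] = {}
    for record in records:
        unique.setdefault(record.tracker_id, record)
    return unique


def write_csv(records: Iterable[VehicleRecord], out: TextIO) -> None:
    """Write a header row followed by one row per record."""
    columns = [f.name for f in fields(VehicleRecord)]
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Made-up observations: the vehicle with tracker ID 1 appears in two frames.
observations = [
    VehicleRecord(1, "Toyota", "blue", "sedan", "ABC1234"),
    VehicleRecord(1, "Toyota", "blue", "sedan", "ABC1234"),
    VehicleRecord(2, "Ford", "white", "pickup", "XYZ9876"),
]

buffer = io.StringIO()
write_csv(collect_unique(observations).values(), buffer)
print(buffer.getvalue())
```

Keying on the tracker ID means a vehicle that stays in view for many frames is written only once, which would also avoid querying the VLM repeatedly for the same vehicle. In a real pipeline the `io.StringIO` buffer would be replaced by a file opened with `newline=""`.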