The article explores the use of OpenAI's CLIP model for text-to-image and image-to-image search, and compares the inference speed of three model formats: PyTorch, FP16 OpenVINO, and INT8 OpenVINO. It walks through preparing the Conceptual Captions dataset and extracting embeddings with both Hugging Face's PyTorch model and OpenVINO's optimized formats. The results show that the FP16 OpenVINO format cuts processing time by 43% relative to PyTorch, while the INT8 OpenVINO format cuts it by 75.4%, a 4.03x speedup over the PyTorch model. The article concludes by highlighting the benefits of using OpenVINO for faster embedding extraction in applications like LanceDB, a platform for working with vector search at scale.
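To make the search side of this concrete, here is a minimal sketch of the retrieval step once embeddings have been extracted: both text-to-image and image-to-image search reduce to a cosine-similarity nearest-neighbor lookup over the stored vectors. This is an illustrative NumPy toy (the function name `search`, the toy 8-dimensional vectors, and the random corpus are all hypothetical; real CLIP embeddings are typically 512-dimensional, and the article uses LanceDB rather than a flat NumPy scan for this at scale):

```python
import numpy as np

def search(query_emb, index_embs, k=3):
    """Return indices of the k index embeddings most similar to the
    query under cosine similarity (normalize, then dot product)."""
    q = query_emb / np.linalg.norm(query_emb)
    m = index_embs / np.linalg.norm(index_embs, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity per row
    return np.argsort(-sims)[:k]     # indices sorted by descending similarity

# Toy corpus standing in for precomputed image embeddings
# (hypothetical 8-d vectors, not real CLIP outputs).
rng = np.random.default_rng(0)
embs = rng.normal(size=(4, 8))
# A query that is a slightly perturbed copy of item 2,
# mimicking an image-to-image near-duplicate lookup.
query = embs[2] + 0.01 * rng.normal(size=8)
print(search(query, embs, k=1))  # nearest neighbor is index 2
```

The same lookup works unchanged for text-to-image search, since CLIP maps text and images into a shared embedding space; only the way `query_emb` is produced differs.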