Text-to-image search with Vespa
Blog post from Vespa
Text-to-image search has evolved significantly with the advent of machine learning, moving from reliance on manually assigned textual labels to models such as OpenAI's CLIP that understand both text and image content. Trained on 400 million image-text pairs, CLIP supports zero-shot learning: it can classify images against labels it never saw during training. CLIP consists of two sub-models, a text encoder and an image encoder, each producing a vector in a shared embedding space; a text query matches an image when their vectors are close under cosine similarity.

Vespa, a platform that combines approximate nearest neighbor search with machine-learned model inference, is used to build a text-to-image search application that indexes images and retrieves them from user-provided textual descriptions. The sample application shows how CLIP enables efficient and accurate image retrieval, works with any image collection, and offers a solid baseline for further fine-tuning.
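On the Vespa side, such an application is typically backed by a document schema holding the image embedding as an indexed tensor field. The following fragment is a hedged sketch, not the sample application's actual schema; field names, dimensions, and HNSW parameters are illustrative assumptions:

```
schema image {
    document image {
        field image_file_name type string {
            indexing: summary | attribute
        }
        # 512 floats matching CLIP's embedding size (assumed here)
        field clip_embedding type tensor<float>(x[512]) {
            indexing: attribute | index
            attribute {
                # angular distance corresponds to cosine similarity
                distance-metric: angular
            }
            index {
                hnsw {
                    max-links-per-node: 16
                    neighbors-to-explore-at-insert: 200
                }
            }
        }
    }
    rank-profile similarity {
        inputs {
            query(query_embedding) tensor<float>(x[512])
        }
        first-phase {
            expression: closeness(field, clip_embedding)
        }
    }
}
```

A query would then encode the user's text with CLIP's text encoder, pass the result as `query(query_embedding)`, and retrieve candidates with a YQL `nearestNeighbor` operator such as `select * from image where {targetHits: 100}nearestNeighbor(clip_embedding, query_embedding)`.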
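To make the matching step concrete, here is a minimal sketch of ranking images by cosine similarity between a query embedding and a set of image embeddings. The tiny 3-dimensional vectors and file names are illustrative stand-ins for the 512-dimensional embeddings a CLIP model would actually produce:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_images(text_embedding, image_embeddings):
    """Return (image_id, score) pairs sorted by similarity, best first."""
    scored = [(image_id, cosine_similarity(text_embedding, emb))
              for image_id, emb in image_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional embeddings standing in for CLIP's 512-dim outputs.
text_vec = [0.9, 0.1, 0.0]           # e.g. encoding of "a photo of a dog"
images = {
    "dog.jpg": [0.8, 0.2, 0.1],
    "cat.jpg": [0.1, 0.9, 0.2],
}
ranking = rank_images(text_vec, images)
```

A brute-force scan like this is fine for small collections; at scale, an approximate nearest neighbor index (as Vespa provides) avoids comparing the query against every image.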