Make your videos queryable using foundation models

Post Details

Company

Labelbox

Date Published

March 21, 2023

Author

Manu Sharma

Word Count

1,388

Language

-

Hacker News Points

-

Source URL

labelbox.com/blog/make-videos-queryable-using-foundation-models

Summary

The tutorial demonstrates how to enrich video content with foundation models from OpenAI, Meta, and Hugging Face for video search, content understanding, and metadata generation. Using Labelbox Catalog as the data platform, it applies OpenAI's Whisper for transcription, GPT-3.5 for summarization, and OpenAI's second-generation embeddings for similarity search, and it uses Meta's TimeSformer for video classification. The workflow has four steps: preparing data from the QuerYD dataset, selecting appropriate AI models, generating metadata and embeddings, and exploring the results through several search techniques. The tutorial shows how zero-shot classification and similarity search can refine video search and accelerate workflows, with practical examples such as identifying cooking videos in a diverse dataset.
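The similarity search step mentioned in the summary can be illustrated with a minimal sketch. This is not the tutorial's code: the three-dimensional vectors are made-up stand-ins for real embedding vectors, and the names `cosine_similarity`, `rank_by_similarity`, and `toy_catalog` are hypothetical. It only shows the general idea of ranking videos by cosine similarity between a query embedding and each video's embedding.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity to anything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, catalog, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumed values): in practice each vector would come from
# an embedding model applied to a video's transcript or summary.
toy_catalog = {
    "pasta_recipe": [0.9, 0.1, 0.0],
    "baking_bread": [0.8, 0.2, 0.1],
    "car_review": [0.0, 0.1, 0.9],
}

# A query vector that points in the "cooking" direction of this toy space.
query_vector = [1.0, 0.0, 0.0]

results = rank_by_similarity(query_vector, toy_catalog)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With these toy values the two cooking-style entries rank above the unrelated one, which mirrors the summary's example of finding cooking videos in a mixed dataset.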