A semantic video recommendation engine built with TwelveLabs, LanceDB, and Geneva recommends videos based on what they contain (visuals, audio, and context) rather than on the metadata that traditional engines rely on. TwelveLabs supplies multimodal embeddings that capture the narrative, mood, and actions within a video. These embeddings are stored in LanceDB, a vector database that supports fast similarity search through a Python API. Geneva, built on LanceDB and powered by Ray, lets the same pipeline scale from a single laptop to a distributed cluster. The workflow has three steps: load the video dataset, generate embeddings with TwelveLabs' Marengo model, and store them in LanceDB so that recommendations can be retrieved by vector search. To help users judge the recommendations, TwelveLabs also provides Pegasus, a summarization model that produces concise summaries drawing on a video's multimodal content. For large-scale deployments, Geneva and Ray parallelize embedding generation and distribute storage, so the system keeps pace as the video collection grows.
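The retrieval step can be illustrated with a minimal sketch in plain Python. It is not the TwelveLabs or LanceDB API: the three-dimensional vectors are invented stand-ins for Marengo embeddings, the function names `cosine_similarity` and `recommend` are hypothetical, and cosine similarity is an assumed metric, since the text does not say which distance the search uses. In the pipeline described above, LanceDB would perform this nearest-neighbor lookup over the stored embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_id, embeddings, k=2):
    """Return the ids of the k videos most similar to query_id, best match first."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vector), video_id)
        for video_id, vector in embeddings.items()
        if video_id != query_id  # never recommend the video being watched
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [video_id for _, video_id in scored[:k]]


# Toy vectors invented for illustration; real embeddings would come from
# the embedding model and have far more dimensions.
VIDEO_EMBEDDINGS = {
    "cooking_pasta": [0.9, 0.1, 0.0],
    "baking_bread": [0.8, 0.2, 0.1],
    "mountain_hike": [0.1, 0.9, 0.2],
    "trail_running": [0.0, 0.8, 0.3],
}

if __name__ == "__main__":
    print(recommend("cooking_pasta", VIDEO_EMBEDDINGS, k=2))
```

The brute-force scan compares the query against every stored vector, which is fine for a handful of videos. A vector database exists to avoid that full scan at scale, which is why the embeddings are stored in LanceDB rather than in an in-memory dictionary.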