The basics of Vespa applications
Blog post from Vespa
Vespa is a platform for distributed computation over large datasets in real time, often referred to as big data serving, and it simplifies building production-ready applications. It does this by encapsulating the necessary configuration, components, and models in an application package, which specifies search, service, and host definitions. The package defines the structure of the data as documents with types and fields, dictates how queries are processed, and establishes the ranking models that determine document relevance. Data is written through a process called feeding, using Vespa's JSON document format, and document processors can be chained into pipelines that enrich documents as they are fed. Queries can be expressed in YQL, an SQL-like query language. Ranking models compute expressions over document features, and those features can include tensors, which makes it possible to evaluate complex models such as deep neural networks. Vespa also supports grouping and aggregation, so results can be organized and analyzed rather than only returned as a list of hits. Finally, the platform's deployment tools apply configuration changes without disrupting service, so a running application can evolve while it continues to serve traffic.
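To make feeding and querying more concrete, here is a minimal Python sketch that builds one feed operation in Vespa's JSON document format and two YQL query URLs, one plain and one with grouping. It only constructs the payloads and sends nothing. The namespace `blog`, the document type `post`, the fields `title` and `tags`, the endpoint `http://localhost:8080`, and both helper functions are assumptions made for illustration; none of them come from the post.

```python
import json
from urllib.parse import urlencode


def make_put_operation(namespace, doc_type, user_id, fields):
    """Build a single 'put' operation in Vespa's JSON document format.

    Hypothetical helper: the document id follows the
    id:<namespace>:<document-type>::<user-specified-id> scheme.
    """
    doc_id = f"id:{namespace}:{doc_type}::{user_id}"
    return {"put": doc_id, "fields": fields}


def make_query_url(endpoint, yql, hits=10):
    """Build a query URL that passes a YQL string to the search API.

    Hypothetical helper: 'endpoint' is assumed to be a Vespa container.
    """
    return f"{endpoint}/search/?{urlencode({'yql': yql, 'hits': hits})}"


# Assumed schema: a 'post' document type with 'title' and 'tags' fields.
operation = make_put_operation(
    "blog",
    "post",
    "post-1",
    {"title": "The basics of Vespa applications", "tags": ["vespa", "search"]},
)
print(json.dumps(operation, indent=2))

# A plain YQL query: match documents whose title contains a term.
search_yql = 'select * from sources * where title contains "vespa"'
search_url = make_query_url("http://localhost:8080", search_yql)
print(search_url)

# A grouping query: count matching documents per tag (field name assumed).
grouping_yql = (
    'select * from sources * where title contains "vespa" '
    "| all(group(tags) each(output(count())))"
)
grouping_url = make_query_url("http://localhost:8080", grouping_yql, hits=0)
print(grouping_url)
```

The printed operation can be written to a feed file, and the URLs can be requested from a running application. Whether these queries return anything depends on a schema that actually declares the assumed fields.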