Why Arrays as a Universal Data Model
Blog post from TileDB
The blog post proposes multi-dimensional arrays as a universal data model that can efficiently capture and process diverse data types and applications, addressing the need for a single database system that can accommodate everything from tables and images to genomics and LiDAR data. It argues that arrays are suitable because of their performance advantages, their flexibility, and their ability to model both dense and sparse data, something the post says traditional tabular formats cannot do. The post stresses that the array model and its on-disk format need a robust storage engine behind them, and presents TileDB's design as an example of an engine that can handle the complexities of building a universal database. The author, Stavros Papadopoulos, advocates focusing on storage engines rather than format specifications, suggesting that this approach allows rapid evolution and better integration with various computational frameworks.
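To make the dense/sparse distinction concrete, here is a minimal sketch in plain Python. It is not TileDB's API or on-disk format: the helper names (slice_dense, slice_sparse) and the sample data are hypothetical and chosen only for illustration. It assumes a dense array stores a value for every cell (as in an image), while a sparse array stores only non-empty cells keyed by their coordinates (as in a point cloud or a table whose columns act as dimensions). Both are queried the same way, by a range on each dimension.

```python
# Illustrative sketch only: plain-Python stand-ins for dense and sparse arrays.
# These names are hypothetical and do not come from TileDB.

def slice_dense(cells, row_range, col_range):
    """Return the sub-array for inclusive index ranges on a dense 2D array."""
    (r0, r1), (c0, c1) = row_range, col_range
    return [row[c0:c1 + 1] for row in cells[r0:r1 + 1]]


def slice_sparse(cells, dim_ranges):
    """Return the non-empty cells whose coordinates fall in inclusive ranges."""
    return {
        coords: attrs
        for coords, attrs in cells.items()
        if all(lo <= c <= hi for c, (lo, hi) in zip(coords, dim_ranges))
    }


# Dense case: a 3x4 grayscale image, where every cell holds a value.
image = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]

# Sparse case: point data (for example LiDAR-like x, y coordinates), where
# only the non-empty cells are stored, each with its attribute values.
points = {
    (2, 7): {"elevation": 12.5},
    (5, 1): {"elevation": 8.0},
    (9, 9): {"elevation": 30.25},
}

# The same kind of query, a range per dimension, applies to both models.
image_patch = slice_dense(image, (1, 2), (1, 2))
nearby = slice_sparse(points, [(0, 5), (0, 8)])

print(image_patch)
print(sorted(nearby))
```

In this sketch the sparse dictionary never materializes the empty cells, which is the property that lets one model cover both images and scattered or tabular data. A real storage engine would add tiling, compression, and indexing on top of this idea.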