Company
Date Published
Author
Michael McCandless
Word count
1522
Language
English
Hacker News points
None

Summary

Apache Lucene 6.0 introduces dimensional points, a new feature built on the k-d tree geo-spatial data structure that provides efficient single- and multi-dimensional numeric range filtering and geo-spatial point-in-shape filtering. It replaces the now-deprecated numeric fields with better performance and more versatility, supporting up to 8 dimensions and up to 16 bytes per dimension. Lucene uses the block k-d tree variant, which is designed for efficient I/O: most of the data structure lives in on-disk blocks, and only a small in-heap binary tree is needed to locate those blocks during a search. At index time, the N-dimensional points are recursively partitioned into smaller and smaller cells; at search time, the query shape is tested against those cells, so only the cells it partially overlaps need their points checked individually. Dimensional points promise a smaller index and faster searches than the legacy numeric fields. Each indexed value is currently limited to a single point, although future enhancements may add shape indexing with R-Trees. While not yet officially released, the feature opens up exciting possibilities for combining geo-spatial and other dimensional data in advanced queries and filters.
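
The index-time partitioning and search-time cell tests described above can be illustrated with a minimal, in-memory Python sketch. This is not Lucene's implementation: the names `Node`, `build`, `collect`, and `range_query` are hypothetical, the leaf size of 4 is arbitrary, and splitting at the median of the widest dimension is an assumption of this sketch. The leaf `block` lists stand in for the on-disk blocks, and the inner nodes for the small in-heap binary tree.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


@dataclass
class Node:
    lo: Point                              # per-dimension minimum of this cell
    hi: Point                              # per-dimension maximum of this cell
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    block: Optional[List[Point]] = None    # set on leaves only (stand-in for a disk block)


def build(points: List[Point], max_leaf: int = 4) -> Node:
    """Recursively partition a non-empty list of points into cells (max_leaf >= 1)."""
    dims = range(len(points[0]))
    lo = tuple(min(p[d] for p in points) for d in dims)
    hi = tuple(max(p[d] for p in points) for d in dims)
    if len(points) <= max_leaf:
        return Node(lo, hi, block=list(points))
    # Assumption of this sketch: split the widest dimension at its median.
    dim = max(dims, key=lambda d: hi[d] - lo[d])
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    return Node(lo, hi, build(ordered[:mid], max_leaf), build(ordered[mid:], max_leaf))


def collect(node: Node) -> List[Point]:
    """Return every point stored under a node, without per-point checks."""
    if node.block is not None:
        return list(node.block)
    return collect(node.left) + collect(node.right)


def range_query(node: Node, qlo: Point, qhi: Point) -> List[Point]:
    """Return the points inside the inclusive box [qlo, qhi]."""
    dims = range(len(qlo))
    # Cell entirely outside the query box: skip the whole subtree.
    if any(node.hi[d] < qlo[d] or node.lo[d] > qhi[d] for d in dims):
        return []
    # Cell entirely inside the query box: accept every point under it.
    if all(qlo[d] <= node.lo[d] and node.hi[d] <= qhi[d] for d in dims):
        return collect(node)
    # Cell crosses the query boundary: filter a leaf point by point, or recurse.
    if node.block is not None:
        return [p for p in node.block if all(qlo[d] <= p[d] <= qhi[d] for d in dims)]
    return range_query(node.left, qlo, qhi) + range_query(node.right, qlo, qhi)
```

For example, `range_query(build(points), (2, 3), (4, 5))` returns the points whose first coordinate lies in 2..4 and whose second lies in 3..5. The three-way outcome (outside, inside, crossing) is what keeps per-point comparisons confined to the cells along the query boundary.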