Quick Tips: regex filter buckets

Post Details

Company

Elastic

Date Published

July 8, 2014

Author

Zachary Tong

Word Count

725

Language

English

Hacker News Points

-

Source URL

www.elastic.co/blog/quick-tips-regex-filter-buckets

Summary

Zachary Tong's article covers two ways to categorize irregular product codes in Elasticsearch, a common problem when data from multiple legacy systems is aggregated. The first is to pre-parse the data at input time with Grok filters in Logstash, tagging each document according to the pattern it matches, which simplifies subsequent aggregations. The second applies to data that is already indexed: regular expressions inside filter buckets identify product codes by pattern without re-indexing. The article shows how filter buckets sort and analyze documents by specific criteria, notes the performance benefits of filters, and introduces the "filters bucket" added in Elasticsearch 1.3.0, which makes it simpler to apply multiple filters at once. The overall message is that these techniques make it practical to manage irregular data and derive statistics from it.
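
As a rough illustration of the regex-plus-filters-bucket approach, the sketch below builds a search request body with one named regexp bucket per product-code format. It is not taken from the article: the field name `product_code`, the bucket names, and both patterns are hypothetical, and the helper `build_filters_agg` exists only for this example. The body follows the general shape of the `filters` aggregation, and details may vary between Elasticsearch versions.

```python
import json


def build_filters_agg(field, patterns, agg_name="product_types"):
    """Build a search body with one named regexp bucket per pattern.

    `patterns` maps a bucket name to a Lucene-style regular expression.
    """
    buckets = {
        name: {"regexp": {field: pattern}}
        for name, pattern in patterns.items()
    }
    return {
        "size": 0,  # assumption: only bucket counts are wanted, not hits
        "aggs": {agg_name: {"filters": {"filters": buckets}}},
    }


# Hypothetical code formats from two legacy systems, for illustration only.
PATTERNS = {
    "legacy_a": "[A-Z]{2}-[0-9]{4}",  # e.g. AB-1234
    "legacy_b": "[0-9]{3}[a-z]{2}",   # e.g. 123xy
}

body = build_filters_agg("product_code", PATTERNS)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent to an index's `_search` endpoint. Two things to keep in mind: Lucene regular expressions match against the whole term, so the patterns need no `^` or `$` anchors, and the field should be indexed unanalyzed so that each product code is stored as a single term.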