Meet Top-K: an Awesome Probabilistic Addition to Redis Features

Company

Redis

Date Published

July 2, 2019

Author

Ariel Shtul

Word count

772

Language

English

Hacker News points

None

URL

redis.io/blog/meet-top-k-awesome-probabilistic-addition-redis

Summary

Depending on the size of your dataset and the memory available, you may want to consider probabilistic data structures, which can deliver high accuracy and fast execution times on large datasets. The latest Top-K implementation in Redis's probabilistic data structures uses an algorithm called HeavyKeeper, whose count-with-exponential-decay strategy keeps accuracy high and memory utilization low. The algorithm is biased against small flows, which makes it a good fit for applications that care mainly about the largest flows, such as tracking network traffic or game leaderboards. Top-K also reports in real time when an element enters or leaves the top-k list, which allows timely updates and can help detect and mitigate denial-of-service attacks. Initializing Top-K takes four parameters (k, width, depth, and decay), which can be tuned to balance accuracy, memory, and speed. Compared with Redis' Sorted Set data structure, Top-K offers faster execution and lower memory usage, especially for lower K values, making it a suitable option for projects with streams or growing datasets that must keep memory usage low.
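
To make the count-with-exponential-decay idea concrete, here is a minimal Python sketch of a HeavyKeeper-style counter. It is an illustration under stated assumptions, not the Redis implementation: the class name `HeavyKeeperSketch`, its methods, and the SHA-256 bucket hashing are all hypothetical, and each bucket stores the item itself where a real implementation would store a compact fingerprint. The constructor arguments follow the k, width, depth, and decay arguments of the `TOPK.RESERVE` command.

```python
import hashlib
import random


class HeavyKeeperSketch:
    """Toy HeavyKeeper-style counter (hypothetical, for illustration only)."""

    def __init__(self, k, width, depth, decay, seed=0):
        self.k = k            # number of items kept in the top-k list
        self.width = width    # buckets per array
        self.depth = depth    # number of arrays
        self.decay = decay    # base of the exponential decay probability
        self.rng = random.Random(seed)
        # Each bucket holds [owner, count]; a real sketch stores a fingerprint.
        self.buckets = [[[None, 0] for _ in range(width)] for _ in range(depth)]
        self.top = {}         # item -> estimated count for the current top-k

    def _slot(self, item, row):
        # Derive one bucket index per row from a row-salted hash of the item.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item):
        """Count one occurrence; return the item expelled from the top-k, or None."""
        estimate = 0
        for row in range(self.depth):
            bucket = self.buckets[row][self._slot(item, row)]
            if bucket[1] == 0:
                bucket[0], bucket[1] = item, 1        # empty bucket: claim it
            elif bucket[0] == item:
                bucket[1] += 1                        # same owner: increment
            elif self.rng.random() < self.decay ** bucket[1]:
                # Collision: decay the owner's count with probability decay**count,
                # so large counts are rarely reduced and small ones fade quickly.
                bucket[1] -= 1
                if bucket[1] == 0:
                    bucket[0], bucket[1] = item, 1
            if bucket[0] == item:
                estimate = max(estimate, bucket[1])
        return self._update_top(item, estimate)

    def _update_top(self, item, estimate):
        if estimate == 0:
            return None
        if item in self.top or len(self.top) < self.k:
            self.top[item] = estimate
            return None
        weakest = min(self.top, key=self.top.get)
        if estimate > self.top[weakest]:
            del self.top[weakest]
            self.top[item] = estimate
            return weakest                            # item that left the list
        return None

    def list(self):
        """Return the current top-k items, largest estimate first."""
        return sorted(self.top, key=self.top.get, reverse=True)
```

The decay probability shrinks exponentially with the stored count, which is where the bias against small flows comes from: a heavy item is almost never displaced from its bucket, while a light one is easily overwritten. The return value of `add` plays the role of the enter/leave notification described above. In Redis the corresponding commands are `TOPK.RESERVE`, `TOPK.ADD` (which returns any item dropped from the list), and `TOPK.LIST`.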