Why Use K-Means for Time Series Data? (Part Two)

Blog post from InfluxData

Post Details
Company: InfluxData
Author: Anais Dotis-Georgiou
Word Count: 1,338
Language: English
Summary

K-Means can be used for anomaly detection in time series data by first windowing the data into segments and then clustering those segments. The centroid of each cluster represents one of the shapes (or polynomials) that the data takes. By analyzing the shape of each cluster and its position in the 32-dimensional space, it's possible to detect anomalies in the data. However, K-Means has limitations. It is only guaranteed to converge to a local minimum, so poorly placed initial centroids can lead to poor clustering and poor predictions. Additionally, Euclidean distance can be a misleading similarity measure, especially for sensor data or data with non-uniform time steps.
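
The windowing-then-clustering idea can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the post's actual code: the synthetic noisy sine wave, the helper names (`make_windows`, `kmeans`, `anomaly_score`), the choice of `k=2`, and the assumption that each window holds 32 points (so each segment is one point in a 32-dimensional space) are all made up for the example. A window is scored by its Euclidean distance to the nearest centroid, which is one simple way to flag segments whose shape matches none of the learned clusters.

```python
import math
import random

WINDOW = 32  # assumed segment length: each window is a point in 32-dimensional space

def make_windows(series, size=WINDOW, step=WINDOW):
    """Split a series into fixed-length segments (non-overlapping by default)."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, step)]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm. The result depends on the random initial
    centroids, so it may settle in a local minimum."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: euclidean(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            [sum(dim) / len(members) for dim in zip(*members)] if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids

def anomaly_score(window, centroids):
    """Distance from a window to the closest learned shape."""
    return min(euclidean(window, c) for c in centroids)

# Hypothetical training data: a sine wave with a little noise.
noise = random.Random(1)
normal = [math.sin(2 * math.pi * t / WINDOW) + noise.uniform(-0.05, 0.05)
          for t in range(WINDOW * 20)]
centroids = kmeans(make_windows(normal), k=2)

normal_window = normal[:WINDOW]
odd_window = [0.0] * WINDOW  # e.g. a stuck sensor reporting a flat line

print("normal window score:", anomaly_score(normal_window, centroids))
print("flat window score:  ", anomaly_score(odd_window, centroids))
```

The flat window scores far higher than the ordinary one because it sits far from every centroid. Changing `seed` changes the initial centroids, which is where the local-minimum caveat shows up on less trivial data. The scoring also relies on Euclidean distance between evenly sampled windows, so it inherits the similarity-measure caveat as well.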