Why Use K-Means for Time Series Data? (Part Two)

Post Details

Company

InfluxData

Date Published

Oct. 2, 2018

Author

Anais Dotis-Georgiou

Word Count

1,338

Company Posts That Month

26

Language

English

Hacker News Points

-

Post removed?

No

Source URL

www.influxdata.com/blog/why-use-k-means-for-time-series-data-part-two

Summary

K-Means can be used for anomaly detection in time series data by first windowing the data into segments and then clustering those segments. Each cluster centroid represents a characteristic shape, or polynomial, that the data takes. Examining the shape of each cluster and its position in the 32-dimensional space makes it possible to detect anomalies in the data. K-Means has limitations, however. It is only guaranteed to converge to a local minimum, so poorly placed initial centroids can lead to poor clustering and poor predictions. Euclidean distance can also be a misleading similarity measure, especially for data with non-uniform time steps or for sensor data.
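The workflow summarized above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code from the original post. It assumes a window length of 32 (suggested by the 32-dimensional space mentioned above), uses a synthetic sine wave in place of real data, and scores each segment by its Euclidean distance to the nearest centroid. The names `make_windows`, `kmeans`, `best_of_restarts`, and `anomaly_scores` are hypothetical helpers, and the restart loop is one common way to reduce the risk of a poor local minimum.

```python
import numpy as np

WINDOW = 32  # assumed segment length, chosen to match the 32-dimensional space


def make_windows(series, window=WINDOW, step=WINDOW):
    """Slice a 1-D series into fixed-length segments (rows of a 2-D array)."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - window + 1, step)
    return np.array([series[s:s + window] for s in starts])


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = np.random.default_rng(seed)
    # Initial centroids are k distinct data points picked at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1), dists.min(axis=1).sum()


def best_of_restarts(points, k, n_restarts=10):
    """Rerun with different seeds and keep the lowest-inertia result."""
    runs = [kmeans(points, k, seed=s) for s in range(n_restarts)]
    return min(runs, key=lambda run: run[2])


def anomaly_scores(windows, centroids):
    """Euclidean distance from each window to its nearest centroid."""
    diffs = windows[:, None, :] - centroids[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)


# Synthetic data (an assumption for illustration): a clean sine wave for
# training, and a copy in which one segment is stuck at a constant value.
t = np.arange(WINDOW * 40)
normal = np.sin(2 * np.pi * t / WINDOW)
faulty = normal.copy()
faulty[WINDOW * 10:WINDOW * 11] = 5.0  # hypothetical stuck sensor reading

train = make_windows(normal, step=4)  # overlapping windows give several shapes
centroids, labels, inertia = best_of_restarts(train, k=8)

scores = anomaly_scores(make_windows(faulty), centroids)
print("most anomalous window:", int(scores.argmax()))
```

With this synthetic input the stuck segment (window index 10) lies far from every centroid and receives the highest score. On real data the window length, the number of clusters, and the score threshold would all need tuning, and the caveats about Euclidean distance noted above still apply.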

