OpenTelemetry Best Practices #3: Data Prep and Cleansing

Post Details

Company: Honeycomb
Date Published: June 24, 2024
Author: Martin Thwaites
Word Count: 951
Company Posts That Month: 4
Language: English
Hacker News Points: -
Post removed? No
Source URL: www.honeycomb.io/blog/opentelemetry-best-practices-data-prep-cleansing

Summary

Effective observability requires more than collecting telemetry: the data must be curated so that it yields insight into production systems. OpenTelemetry auto-instrumentation can generate large volumes of data quickly, so the challenge is refining that data before irrelevant or sensitive information becomes overwhelming. Processors such as the Transform processor can manipulate attributes by dropping, combining, or hashing them, which protects privacy while preserving the data's usefulness. Redacting sensitive data is crucial, and processors offer both passive and aggressive modes for identifying and filtering out sensitive patterns such as Social Security Numbers or credit card numbers. To track user interactions without compromising privacy, the data needs to keep its cardinality while excluding Personally Identifiable Information (PII); hashing is a common way to achieve this, though it has limitations. Filtering out spans that are not useful, such as those from health checks, further streamlines the data. Building secure and efficient observability pipelines depends on configuring collectors and processors correctly as part of a deliberate data management strategy.
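The three cleansing steps in the summary (hashing identifying attributes, redacting sensitive patterns, and dropping health-check spans) can be sketched in a few lines of Python. This is an illustration of the ideas only, not the post's method and not the OpenTelemetry Collector's API: in a real pipeline these steps would typically be expressed as Collector processor configuration. The span shape (plain dicts), the attribute keys `user.email` and `url.path`, the health-check paths, the salt, and both regular expressions are assumptions made for this example.

```python
import hashlib
import re

# Hypothetical patterns for illustration; production redaction needs vetted expressions.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")

# Hypothetical health-check paths; adjust to whatever your services expose.
HEALTH_PATHS = {"/health", "/healthz"}


def hash_value(value: str, salt: str) -> str:
    """Replace a value with a salted SHA-256 digest.

    Equal inputs give equal digests, so per-user cardinality survives
    even though the raw value does not.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def redact(value: str) -> str:
    """Mask any substring that matches one of the sensitive patterns."""
    for pattern in (SSN_PATTERN, CARD_PATTERN):
        value = pattern.sub("[REDACTED]", value)
    return value


def clean_attributes(attributes: dict, hashed_keys, salt: str) -> dict:
    """Hash the listed keys, redact other string values, keep the rest."""
    cleaned = {}
    for key, value in attributes.items():
        if key in hashed_keys:
            cleaned[key] = hash_value(str(value), salt)
        elif isinstance(value, str):
            cleaned[key] = redact(value)
        else:
            cleaned[key] = value
    return cleaned


def is_health_check(span: dict) -> bool:
    """Treat a span as noise when its path is a known health-check path."""
    return span.get("attributes", {}).get("url.path") in HEALTH_PATHS


def clean_spans(spans, hashed_keys=("user.email",), salt="example-salt"):
    """Drop health-check spans and return cleaned copies of the others."""
    return [
        {**span, "attributes": clean_attributes(span.get("attributes", {}), hashed_keys, salt)}
        for span in spans
        if not is_health_check(span)
    ]


if __name__ == "__main__":
    sample = [
        {"name": "GET /health", "attributes": {"url.path": "/health"}},
        {
            "name": "POST /checkout",
            "attributes": {
                "url.path": "/checkout",
                "user.email": "user@example.com",
                "note": "ssn 123-45-6789",
            },
        },
    ]
    for span in clean_spans(sample):
        print(span)
```

One of the limitations of hashing is visible here: a digest of a guessable value such as an email address can be recovered by hashing candidate values and comparing, so the salt has to be kept secret, and even then hashing reduces exposure rather than eliminating it.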
