How to extract the protocol of a URL in ClickHouse®

Blog post from Tinybird

Post Details
Company: Tinybird
Author: Cameron Archer
Word Count: 2,581
Language: English
Hacker News Points: -
Summary

The article explores the `protocol()` function in ClickHouse, which extracts the scheme from a URL string, returning values such as `https`, `http`, or `ftp`, or an empty string for malformed input. It covers the function's syntax, its use in real-time APIs that analyze URL protocols, and performance optimizations for large datasets, such as `LowCardinality` data types and materialized views. It also describes integrating ClickHouse with Tinybird to build web security analytics APIs, which enables protocol-based security analysis with minimal infrastructure management, and explains how managed services like Tinybird reduce the operational complexity of deploying ClickHouse by offering SQL-based transformations and production-grade APIs. The article closes with practical guidance on indexing and updating protocol data and points to resources on other ClickHouse URL functions.
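
As a rough illustration of the ideas summarized above, the sketch below stores the extracted scheme in a `LowCardinality(String)` column and aggregates on it. It is not taken from the article: the table name `url_events`, the column names, and the sample URLs are hypothetical, and it uses a `MATERIALIZED` column (computed at insert time) rather than a materialized view, as a simpler stand-in for the same precompute-once idea.

```sql
-- Hypothetical table; all names here are illustrative assumptions.
CREATE TABLE url_events
(
    url String,
    -- Scheme extracted once at insert time. LowCardinality suits a column
    -- that holds only a handful of distinct values (https, http, ftp, ...).
    url_protocol LowCardinality(String) MATERIALIZED protocol(url)
)
ENGINE = MergeTree
ORDER BY url_protocol;

-- Sample rows, including one malformed input.
INSERT INTO url_events (url) VALUES
    ('https://clickhouse.com/docs'),
    ('http://example.com/page'),
    ('ftp://files.example.com/data.csv'),
    ('not a url');

-- Count events per scheme using the precomputed column.
SELECT url_protocol, count() AS hits
FROM url_events
GROUP BY url_protocol
ORDER BY hits DESC;
```

Per the summary, malformed inputs should appear under an empty-string scheme. Precomputing the column means a query like the last one reads a small dictionary-encoded column instead of re-parsing every URL.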