A History of HTML Parsing at Cloudflare: Part 2

Blog post from Cloudflare

Post Details
Company: Cloudflare
Author: Andrew Galloni, Ivan Nikulin
Word Count: 3,142
Language: English
Summary

In 2017, developers using Workers, Cloudflare's edge compute platform, wanted HTML rewriting capabilities similar to those Cloudflare used internally. To meet this demand, a streaming HTML rewriter/parser with a CSS-selector-based API was built in Rust and open-sourced as LOL HTML. The major change from the previous rewriter, LazyHTML, is a dual-parser architecture, which was needed to overcome the performance overhead of wrapping and unwrapping each token as it is propagated to the Workers runtime. Compared with LazyHTML, the new approach significantly speeds up parsing and reduces both output latency and memory consumption.
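
The dual-parser idea can be illustrated with a toy sketch. This is not LOL HTML's code or API: the names `scan_tag_name`, `tokenize_start_tag`, `serialize`, and `rewrite` are hypothetical, and the split into a cheap scanner plus a fuller tokenizer is an assumption about what "dual-parser" means in practice. The sketch copies most input through verbatim and builds a full token, the expensive part that would have to be wrapped for a handler, only for tags that match the requested name. It assumes attributes are written as `key="value"` and that `>` never appears inside an attribute value.

```rust
/// Fully tokenized start tag: the "expensive" representation handed to a handler.
pub struct StartTag {
    pub name: String,
    pub attrs: Vec<(String, String)>,
}

/// Cheap pass: from the text between '<' and '>', extract only the tag name.
/// Returns None for end tags such as "/a", whose first character is '/'.
fn scan_tag_name(inner: &str) -> Option<&str> {
    let end = inner
        .find(|c: char| c.is_ascii_whitespace() || c == '/')
        .unwrap_or(inner.len());
    if end == 0 { None } else { Some(&inner[..end]) }
}

/// Full pass: parse `key="value"` attributes (toy grammar, double quotes only).
fn tokenize_start_tag(name: &str, mut rest: &str) -> StartTag {
    let mut attrs = Vec::new();
    loop {
        rest = rest.trim_start();
        let eq = match rest.find("=\"") {
            Some(i) => i,
            None => break,
        };
        let value_start = eq + 2;
        let len = match rest[value_start..].find('"') {
            Some(i) => i,
            None => break,
        };
        attrs.push((
            rest[..eq].to_string(),
            rest[value_start..value_start + len].to_string(),
        ));
        rest = &rest[value_start + len + 1..];
    }
    StartTag { name: name.to_string(), attrs }
}

/// Turn a (possibly modified) token back into markup.
fn serialize(tag: &StartTag) -> String {
    let mut s = format!("<{}", tag.name);
    for (k, v) in &tag.attrs {
        s.push_str(&format!(" {}=\"{}\"", k, v));
    }
    s.push('>');
    s
}

/// Rewrite start tags named `target`; everything else is copied untouched.
pub fn rewrite(input: &str, target: &str, mut handler: impl FnMut(&mut StartTag)) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(lt) = rest.find('<') {
        out.push_str(&rest[..lt]); // text before the tag
        rest = &rest[lt..]; // now starts with '<'
        let gt = match rest.find('>') {
            Some(i) => i,
            None => break, // unterminated tag: emit the remainder as-is
        };
        let inner = &rest[1..gt];
        match scan_tag_name(inner) {
            // Only a matching tag pays for full tokenization and the handler call.
            Some(name) if name.eq_ignore_ascii_case(target) => {
                let mut tag = tokenize_start_tag(name, &inner[name.len()..]);
                handler(&mut tag);
                out.push_str(&serialize(&tag));
            }
            // Fast path: no token is built, the raw bytes are forwarded.
            _ => out.push_str(&rest[..=gt]),
        }
        rest = &rest[gt + 1..];
    }
    out.push_str(rest);
    out
}
```

Calling `rewrite(html, "a", |tag| { /* edit tag.attrs */ })` invokes the handler once per `<a ...>` start tag, while other tags, end tags, and text never become tokens. A real rewriter also has to handle comments, scripts, unquoted attributes, and input arriving in chunks, all of which this sketch ignores.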