Announcing Pyro Caml: A Continuous Profiler for OCaml
Blog post from Semgrep
Semgrep's core SAST engine is written in OCaml, a language whose smaller ecosystem lacks some of the observability libraries Semgrep needs to run scans across a large number of code repositories. To bridge this gap, the Semgrep team developed Pyro Caml, a continuous profiler for OCaml, and recently released it as version 1.0.0. Continuous profiling differs from traditional profiling in that it runs in production, constantly collecting performance data and reporting it to a centralized system. This matters for Semgrep because of its security constraints and the need to avoid handling user code directly. Existing profilers such as ocamlprof, and tools that depend on Linux's perf_event_open, were incompatible with Semgrep's use of gVisor, so the team built Pyro Caml on top of the Pyroscope SDK. Pyro Caml uses OCaml's Memprof to sample call stacks, which keeps overhead low while still showing where the bottlenecks in the codebase are. Building the profiler meant working through problems with sampling accuracy and runtime impact. It is now a critical tool in Semgrep's performance optimization work and has been deployed across millions of scans. Some limitations remain, including missing samples in certain contexts and difficulties with recursive calls, but Pyro Caml has proven effective, and future enhancements are expected to address these issues and expand its capabilities.
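To make the Memprof-based sampling more concrete, here is a minimal sketch of collecting call stacks with the standard library's Gc.Memprof module. It is not Pyro Caml's code and it does not talk to Pyroscope. The `samples` table, the `record` callback, the sampling rate, and the workload are all hypothetical choices made for illustration. It assumes OCaml 4.14 or 5.3+, where Gc.Memprof is available, and a build with debug info (`-g`) so that the stacks print with locations.

```ocaml
(* Sketch only: aggregate Memprof samples by call stack.
   Assumes OCaml 4.14 or 5.3+ and a build with -g. *)

(* Hypothetical aggregate: printed call stack -> number of samples seen. *)
let samples : (string, int) Hashtbl.t = Hashtbl.create 64

(* Called by the runtime for each sampled allocation. *)
let record (info : Gc.Memprof.allocation) : unit option =
  let key = Printexc.raw_backtrace_to_string info.Gc.Memprof.callstack in
  let seen = Option.value (Hashtbl.find_opt samples key) ~default:0 in
  Hashtbl.replace samples key (seen + info.Gc.Memprof.n_samples);
  None (* returning None means the block is not tracked any further *)

let tracker : (unit, unit) Gc.Memprof.tracker =
  { Gc.Memprof.null_tracker with
    Gc.Memprof.alloc_minor = record;
    alloc_major = record }

let () =
  (* sampling_rate is the expected number of samples per allocated word;
     1e-3 is an arbitrary value for this sketch. *)
  ignore (Gc.Memprof.start ~sampling_rate:1e-3 ~callstack_size:32 tracker);
  (* Hypothetical workload standing in for the code being profiled. *)
  let data = List.init 200_000 string_of_int in
  ignore (Sys.opaque_identity (List.length data));
  Gc.Memprof.stop ();
  Hashtbl.iter
    (fun stack n -> Printf.printf "%d sample(s) at:\n%s\n" n stack)
    samples
```

Because Memprof triggers on allocations, the stacks collected this way are weighted by how much each code path allocates rather than by wall-clock time. A real continuous profiler would also need to turn these stacks into a format its backend accepts and ship them periodically, which this sketch leaves out.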