Pallas for people who know JAX but not kernels yet

Post Details

Company

HuggingFace

Date Published

April 29, 2026

Author

Aritra Roy Gosthipaty

Word Count

1,581

Source URL

huggingface.co/blog/ariG23498/pallas-for-beginners

Summary

Pallas is an experimental JAX extension for writing custom kernels for GPUs and TPUs. It keeps the Python syntax and JAX primitives that users already know, but it requires them to reason about where data lives in memory at the kernel level. Unlike standard JAX functions, a Pallas kernel does not take arrays and return new ones: it receives Refs, references to memory that it reads from and writes to explicitly. This gives fine-grained control over memory use and tiling, which is crucial for performance on accelerators such as NVIDIA GPUs and TPUs. Pallas lowers kernels to Mosaic on TPUs and to Mosaic GPU on newer NVIDIA GPUs; a secondary Triton backend for GPUs also exists but is less recommended. Pallas also introduces program instances and grids, which divide parallel work by defining how many instances to launch and which block of data each one handles. Debugging and optimizing Pallas kernels relies on its interpret and debug modes to verify correctness, especially when moving from interpreted execution to compiled execution on TPUs and GPUs.
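To make the summary concrete, here is a minimal sketch of a blockwise vector add. It is not code from the post. The names `add_kernel`, `blockwise_add`, `block_spec`, and `BLOCK`, along with the block size of 128, are illustrative assumptions. The sketch shows a kernel that reads and writes Refs, a grid that sets how many program instances run, block specs that map each instance to its block of data, and `interpret=True` so that the kernel can be checked without compiling for a TPU or GPU.

```python
import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

BLOCK = 128  # elements per program instance (illustrative choice)


def add_kernel(x_ref, y_ref, o_ref):
    # Each argument is a Ref to one block of memory, not a jax.Array.
    # x_ref[...] reads the whole block; assigning to o_ref[...] writes it.
    o_ref[...] = x_ref[...] + y_ref[...]


def block_spec():
    # Program instance i handles block i along the only axis.
    return pl.BlockSpec(block_shape=(BLOCK,), index_map=lambda i: (i,))


def blockwise_add(x, y, interpret=True):
    n = x.shape[0]
    assert n % BLOCK == 0, "sketch assumes the length is a multiple of BLOCK"
    return pl.pallas_call(
        add_kernel,
        out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
        grid=(n // BLOCK,),              # number of program instances to launch
        in_specs=[block_spec(), block_spec()],
        out_specs=block_spec(),
        interpret=interpret,             # interpreted run, no TPU/GPU compilation
    )(x, y)


x = jnp.arange(512, dtype=jnp.float32)
y = jnp.ones(512, dtype=jnp.float32)
out = blockwise_add(x, y)

# Compare against plain JAX before trying the compiled path.
print(bool(jnp.allclose(out, x + y)))
```

Once the interpreted result matches the plain JAX reference, passing `interpret=False` on a supported accelerator would exercise the compiled path. Block shapes that work in the interpreter may still need adjusting to satisfy the tiling constraints of the target hardware.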