XLA - TensorFlow, compiled

Post Details

Company

Google Cloud

Date Published

March 6, 2017

Author

-

Word Count

1,029

Language

English

Source URL

developers.googleblog.com/xla-tensorflow-compiled

Summary

XLA (Accelerated Linear Algebra) is a compiler developed by Google that improves TensorFlow performance by compiling its data flow graphs. In just-in-time (JIT) mode, it analyzes a TensorFlow graph at runtime, specializes it for the actual dimensions and types, and fuses multiple operations together to produce efficient native machine code for devices such as CPUs, GPUs, and custom accelerators like Google's TPU. Because fused operations need fewer kernel launches and fewer intermediate memory allocations, XLA addresses performance concerns while retaining TensorFlow's flexibility. It can also significantly reduce executable size for models in restricted-memory environments, such as mobile devices, through ahead-of-time (AOT) compilation with tools like tfcompile. At the time of the post XLA was still in its early stages, but adding a new device backend to XLA takes less implementation effort than re-implementing every TensorFlow operation for that device, which paves the way for TensorFlow to run efficiently on a broader range of hardware.
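The idea behind operation fusion can be shown with a small conceptual sketch. This is not XLA's implementation and uses no TensorFlow APIs: plain Python lists stand in for device buffers, each helper call stands in for one kernel launch, and the names `scale`, `shift`, `relu`, `unfused`, and `fused` are hypothetical, chosen only for this illustration.

```python
# Conceptual sketch only: lists stand in for device buffers, and each
# helper call stands in for a separate kernel launch.

def scale(xs, a):
    """One 'kernel': multiply every element by a, writing a new buffer."""
    return [a * x for x in xs]

def shift(xs, b):
    """One 'kernel': add b to every element, writing a new buffer."""
    return [x + b for x in xs]

def relu(xs):
    """One 'kernel': clamp negative elements to zero, writing a new buffer."""
    return [max(x, 0.0) for x in xs]

def unfused(xs, a, b):
    """Three passes over the data and two temporary buffers (t1, t2)."""
    t1 = scale(xs, a)
    t2 = shift(t1, b)
    return relu(t2)

def fused(xs, a, b):
    """One pass and no temporaries: the kind of code fusion aims to produce."""
    return [max(a * x + b, 0.0) for x in xs]

data = [-2.0, -0.5, 0.0, 1.5]
result_unfused = unfused(data, 2.0, 1.0)
result_fused = fused(data, 2.0, 1.0)
print(result_unfused == result_fused)
```

Both versions compute the same values. The fused version reads the input once and never materializes the intermediate results, which is the source of the reduced kernel launches and memory allocations described in the summary.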