Announcing TensorRT integration with TensorFlow 1.7

Post Details

Company

Google Cloud

Date Published

March 27, 2018

Author

-

Word Count

994

Company Posts That Month

14

Language

English

Source URL

developers.googleblog.com/announcing-tensorrt-integration-with-tensorflow-17

Summary

Google and NVIDIA have announced the integration of NVIDIA's TensorRT with TensorFlow 1.7, aiming to optimize deep learning models for inference on GPUs by raising throughput and reducing latency. The integration brings TensorRT's FP16 and INT8 optimizations into TensorFlow and automatically selects platform-specific kernels to maximize throughput. It also simplifies the workflow: TensorFlow hands compatible sub-graphs to TensorRT for optimization and executes the rest of the graph itself. Models can therefore be developed with TensorFlow's extensive feature set while still benefiting from TensorRT's optimizations. In tests, models such as ResNet-50 showed significant performance improvements with this integration. The new TensorFlow API applies these optimizations by transforming a frozen TensorFlow graph into a TensorRT inference graph. TensorRT also supports INT8 quantization, which speeds up computation and reduces memory requirements with minimal accuracy loss; a calibration step chooses the quantization ranges so that accuracy is preserved. On NVIDIA Volta GPUs, the integration additionally uses Tensor Cores for further gains in throughput. The release is expected to deliver high performance while keeping TensorFlow's flexibility and ease of use, and the integration will be available through the standard pip installation process once TensorFlow 1.7 is released.
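To make the INT8 calibration idea concrete, here is a minimal pure-Python sketch. It is not TensorRT's actual calibration algorithm, which the summary does not describe; it assumes the simplest possible scheme, symmetric quantization with a scale taken from the largest magnitude seen in a few calibration batches. The function names (`calibrate_scale`, `quantize`, `dequantize`) and the sample data are hypothetical and exist only for illustration.

```python
from typing import List, Sequence


def calibrate_scale(calibration_batches: Sequence[Sequence[float]]) -> float:
    """Pick a symmetric scale from the largest magnitude seen during calibration.

    Assumption: plain max-abs calibration, chosen for simplicity.
    """
    max_abs = max(abs(v) for batch in calibration_batches for v in batch)
    # Map the observed range [-max_abs, max_abs] onto the INT8 range [-127, 127].
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values: Sequence[float], scale: float) -> List[int]:
    """Round each value to the nearest INT8 step, clamping anything out of range."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values: Sequence[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate floating-point values."""
    return [q * scale for q in q_values]


# Hypothetical calibration data standing in for representative activations.
calibration_batches = [[0.5, -1.0, 0.25], [2.0, -0.75, 1.5]]
scale = calibrate_scale(calibration_batches)

# Values inside the calibrated range survive with small rounding error;
# 3.0 lies outside the range and is clamped (saturated) to the largest code.
activations = [1.0, -2.0, 0.1, 3.0]
q = quantize(activations, scale)
restored = dequantize(q, scale)
```

Each value is stored in one byte instead of four, which is where the memory saving comes from. The accuracy cost depends on how well the calibration data represents real inputs: values beyond the calibrated range saturate, as `3.0` does here.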
