TensorFlow Benchmarks and a New High-Performance Guide
Blog post from Google Cloud
In May 2017, the TensorFlow team released a collection of performance benchmarks demonstrating the speed and scalability of TensorFlow when training image classification models such as InceptionV3 and ResNet on a range of hardware configurations. The benchmarks covered NVIDIA's DGX-1 and Tesla K80 GPUs in both single-server and distributed setups, and showed near-linear performance scaling as GPUs were added, particularly with synthetic data, which takes disk I/O out of the measurement.

Alongside the benchmarks, the team introduced a High-Performance Models guide to help developers optimize their models for different platforms. The results underscored that the best configuration depends on the hardware: some models trained faster under one strategy for handling variables across GPUs, and others under another. For that reason the team emphasized comprehensive benchmarking on the target platform, and discussed plans for future tests that measure the time to converge to high accuracy.

The team acknowledged NVIDIA's support in providing hardware and technical assistance, and looked forward to further collaboration to improve TensorFlow's performance, especially on NVIDIA's forthcoming Volta architecture.
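To make "near-linear scaling" concrete, here is a minimal sketch of how speedup and scaling efficiency could be computed from measured training throughput. The function name `scaling_efficiency` and the images/sec figures are hypothetical placeholders for illustration; they are not taken from the published benchmarks.

```python
def scaling_efficiency(throughput_by_gpus):
    """Map each GPU count to (speedup, efficiency).

    throughput_by_gpus: dict of {num_gpus: images_per_second}.
    Speedup is measured against the per-GPU throughput of the
    smallest configuration; efficiency is speedup / num_gpus,
    so 1.0 means perfectly linear scaling.
    """
    base_gpus = min(throughput_by_gpus)
    per_gpu_base = throughput_by_gpus[base_gpus] / base_gpus
    results = {}
    for num_gpus, rate in sorted(throughput_by_gpus.items()):
        speedup = rate / per_gpu_base
        results[num_gpus] = (speedup, speedup / num_gpus)
    return results


# Hypothetical images/sec values, NOT results from the benchmarks.
example = {1: 100.0, 2: 196.0, 4: 384.0, 8: 760.0}

for num_gpus, (speedup, efficiency) in scaling_efficiency(example).items():
    print(f"{num_gpus} GPUs: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

An efficiency close to 100% at every GPU count is what "near-linear" means here. Comparing the same calculation for synthetic and real input data would show how much of any shortfall comes from the input pipeline rather than from the training computation itself.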