ResNet9: train to 94% CIFAR10 accuracy in 100 seconds with a single Turing GPU

Post Details

Company

Lambda

Date Published

Jan. 7, 2019

Author

Chuan Li

Word Count

668

Language

English

Hacker News Points

-

Source URL

lambda.ai/blog/resnet9-train-to-94-cifar10-accuracy-in-100-seconds

Summary

ResNet9 reaches 94% accuracy on CIFAR10 in just 75 seconds of training on a single V100 GPU. That is a substantial improvement over the previous winning entry from FastAI, which used eight times as many GPUs and took nearly twice as long to train. The speedup came from a series of modifications, including removing unnecessary layers, tuning the batch size and random number generation, and keeping batch norm in single precision. The largest single gain, an estimated 27% reduction in training time, came from optimizing the residual network architecture. There is still room for improvement: a further speedup of up to 2X may be possible if higher compute efficiency can be realized.
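
The summary quotes one gain as a percentage reduction in training time (27%) and another as a speedup factor (2X), which are not directly comparable. The sketch below shows how the two relate. The function names are hypothetical and chosen for illustration; only the 27% and 2X figures come from the summary.

```python
def reduction_to_speedup(reduction: float) -> float:
    """Convert a fractional cut in training time (0.27 for 27%) to a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def speedup_to_reduction(speedup: float) -> float:
    """Convert a speedup factor (2.0 for 2X) to a fractional cut in training time."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1")
    return 1.0 - 1.0 / speedup


# Figures taken from the summary; everything else here is illustrative.
print(f"{reduction_to_speedup(0.27):.2f}x")  # 1.37x
print(f"{speedup_to_reduction(2.0):.0%}")    # 50%
```

Read this way, the 27% reduction attributed to the architecture change corresponds to roughly a 1.37x speedup, and the potential 2X speedup would amount to halving the remaining training time.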