
FlashConv: Speeding up state space models

Blog post from Together AI

Post Details
Company: Together AI
Date Published:
Author: Dan Fu and Tri Dao
Word Count: 1,100
Language: English
Hacker News Points: -
Summary

FlashConv is a technique for speeding up state space models (SSMs) in deep learning so that they can run faster than optimized implementations of attention. SSMs are a promising alternative to attention because they scale nearly linearly with sequence length instead of quadratically, but out of the box they often run slower than attention due to low FLOP utilization on GPUs. FlashConv uses Fast Fourier Transforms (FFTs) and a fused FFT convolution to speed up the long convolutions at the core of SSMs, achieving speeds comparable to or better than attention at long sequence lengths. It also takes advantage of the tensor cores on modern GPUs to further improve performance. With these optimizations, SSMs can be used for large-scale language models, enabling faster training and inference.
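To make the core idea concrete, here is a minimal PyTorch sketch of FFT-based convolution, the building block FlashConv optimizes. This is an illustration of the general O(N log N) technique, not Together AI's fused kernel; the function name fft_conv and the shapes are assumptions for the example.

```python
import torch

def fft_conv(u, k):
    """Causal convolution of input u with kernel k via FFT.

    Sketch of the FFT convolution idea (not the fused FlashConv
    kernel): zero-pad to length 2N so the circular convolution
    computed by the FFT matches a linear (causal) convolution,
    multiply in the frequency domain, and invert. This runs in
    O(N log N) instead of the O(N^2) of direct convolution.
    """
    seqlen = u.shape[-1]
    fft_size = 2 * seqlen  # pad to avoid circular wrap-around
    u_f = torch.fft.rfft(u, n=fft_size)
    k_f = torch.fft.rfft(k, n=fft_size)
    y = torch.fft.irfft(u_f * k_f, n=fft_size)
    return y[..., :seqlen]  # keep only the causal part

# Example: a batch of 4 sequences of length 1024 with one kernel
u = torch.randn(4, 1024)
k = torch.randn(1024)
y = fft_conv(u, k)
print(y.shape)  # torch.Size([4, 1024])
```

In a naive implementation like this, each FFT, pointwise multiply, and inverse FFT is a separate GPU kernel launch with its own memory round-trips; fusing these steps and mapping the FFT onto tensor cores is where FlashConv gets its speedup.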