/plushcap/analysis/assemblyai/deep-learning-paper-recap-diffusion-and-transformer-models

Deep Learning Paper Recap - Diffusion and Transformer Models

What's this blog post about?

This week's Deep Learning Paper Reviews discuss two research papers. The first applies continuous diffusion models to controllable natural language generation (NLG), adding "embedding" and "rounding" steps so the continuous diffusion process can operate on discrete text. It outperforms existing controllable-generation methods such as PPLM and FUDGE, but slow decoding speed remains a major bottleneck. The second paper proposes representation pooling to sparsify transformer architectures, achieving sublinear time and memory complexity. On long-document summarization, the authors report a 1.8x speedup during training and a 4.5x speedup during inference, though the approach may be less useful for short input sequences.
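To make the two ideas above more concrete, here is a minimal, illustrative PyTorch sketch (not the papers' actual implementations; all names and shapes are assumptions). The first snippet shows a "rounding"-style step in the spirit of the diffusion paper: denoised continuous vectors are mapped back to discrete tokens by finding the nearest token embedding. The second shows a simplified top-k version of representation pooling: a learned scorer keeps only the k highest-scoring token representations, so downstream attention runs over a much shorter sequence (the paper itself uses a more sophisticated differentiable selection).

```python
import torch
import torch.nn as nn


def round_to_tokens(x: torch.Tensor, embedding_matrix: torch.Tensor) -> torch.Tensor:
    """Illustrative 'rounding' step: map denoised continuous vectors
    back to the ids of their nearest token embeddings.

    x: (seq_len, dim) denoised vectors; embedding_matrix: (vocab, dim).
    """
    dists = torch.cdist(x, embedding_matrix)  # distance to every token embedding
    return dists.argmin(dim=-1)               # (seq_len,) nearest-token ids


class TopKPooling(nn.Module):
    """Simplified representation pooling: score token representations and
    keep only the k highest-scoring ones before the expensive layers."""

    def __init__(self, hidden_dim: int, k: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)  # learned per-token relevance score
        self.k = k

    def forward(self, token_reprs: torch.Tensor) -> torch.Tensor:
        # token_reprs: (batch, seq_len, hidden_dim)
        scores = self.scorer(token_reprs).squeeze(-1)         # (batch, seq_len)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)   # pick k tokens
        idx = topk_idx.unsqueeze(-1).expand(-1, -1, token_reprs.size(-1))
        pooled = token_reprs.gather(1, idx)                   # (batch, k, hidden_dim)
        # Weight by the (squashed) scores so the scorer receives gradient
        return pooled * torch.sigmoid(topk_scores).unsqueeze(-1)


# Example: pool a 4096-token document down to 256 representations
pooling = TopKPooling(hidden_dim=768, k=256)
doc = torch.randn(2, 4096, 768)
print(pooling(doc).shape)  # torch.Size([2, 256, 768])
```

Because attention cost grows quadratically in sequence length, shrinking 4096 tokens to 256 before the heavy layers is what drives the kind of training and inference speedups reported for long-document summarization.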

Company
AssemblyAI

Date published
Aug. 24, 2022

Author(s)
Dillon Pulliam, Sergio Ramirez Martin

Word count
373

Hacker News points
None found.

Language
English
