Deep Learning Paper Recap - Language Models

What's this blog post about?

The paper "Prune Once For All: Sparse Pre-Trained Language Models" introduces an architecture-agnostic method of training sparse pre-trained language models, allowing for pruning only during the pre-training phase. This technique results in better compression-to-accuracy ratios and eliminates the need to reconsider the model's architecture or task when applying pruning techniques during fine-tuning. The best scores were achieved with 85% and 90% weight pruning, while Quantized Aware Training (QAT) with 85% pruning led to an even more accurate and smaller model.

Company
AssemblyAI

Date published
July 7, 2022

Author(s)
Taufiquzzaman Peyash

Word count
273

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.