Detecting and Fixing ‘Dead Neurons’ in Foundation Models
Blog post from Neptune.ai
Dead neurons in foundation models, that is, neurons whose activations remain near zero on virtually every input, waste computational resources and reduce the diversity of learned features, degrading a model's effective capacity. The problem is not new, but it has gained prominence with the rise of large foundation models: studies have found that models such as BERT, XLNet, and OPT contain large fractions of inactive neurons.

Detecting and addressing dead neurons is therefore crucial for optimizing model performance and resource usage. Visualization techniques such as activation frequency histograms and heatmaps make inactive neurons easy to spot.

To prevent or fix dead neurons, practitioners can select activation functions that are less prone to neuron inactivity, such as GELU or Swish, or apply methods like synaptic stripping, which revives inactive neurons by pruning problematic connections. Monitoring neuron health should be an integral part of the training and evaluation of foundation models, enabling improved generalization and reduced computational waste.
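The detection idea above can be sketched framework-agnostically: record post-activation values over a batch of inputs and flag neurons that are near zero on almost every sample. The function name, epsilon, and dead-fraction threshold below are illustrative assumptions, not values from the post; the same statistic is what an activation frequency histogram would visualize.

```python
import numpy as np

def dead_neuron_mask(activations, eps=1e-6, dead_threshold=0.99):
    """Flag neurons whose activations are near zero on almost every input.

    activations: (num_samples, num_neurons) array of post-activation values.
    A neuron is flagged as 'dead' if its activation magnitude falls below
    `eps` on at least `dead_threshold` of the samples. Both thresholds are
    illustrative assumptions.
    """
    near_zero_frac = np.mean(np.abs(activations) < eps, axis=0)
    return near_zero_frac >= dead_threshold, near_zero_frac

# Simulated ReLU activations for 4 neurons; neuron 2 is stuck at zero.
rng = np.random.default_rng(0)
acts = np.maximum(rng.normal(size=(1000, 4)), 0.0)
acts[:, 2] = 0.0

mask, frac = dead_neuron_mask(acts)
print(mask)  # only neuron 2 is flagged as dead
```

Plotting a histogram of `frac` across all neurons in a layer gives exactly the activation frequency view described above: a healthy layer clusters well below 1.0, while dead neurons pile up at the right edge.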
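A synaptic-stripping-style revival step can be sketched as follows. This is a simplified, assumed variant rather than the exact published algorithm: for each neuron flagged as dead, it strips the existing incoming weights and re-initializes them (plus the bias) with small values so the neuron can fire again. All names and the `scale` parameter are illustrative.

```python
import numpy as np

def revive_dead_neurons(W_in, b, dead_mask, rng=None, scale=0.01):
    """Strip and re-initialize the incoming connections of dead neurons.

    W_in: (num_inputs, num_neurons) incoming weight matrix.
    b: (num_neurons,) bias vector.
    dead_mask: boolean mask of neurons to revive.
    Returns new copies of W_in and b; live neurons are left untouched.
    """
    rng = np.random.default_rng() if rng is None else rng
    W_in, b = W_in.copy(), b.copy()
    n_in = W_in.shape[0]
    for j in np.where(dead_mask)[0]:
        W_in[:, j] = rng.normal(scale=scale, size=n_in)  # fresh small weights
        b[j] = scale  # small positive bias so a ReLU neuron can fire again
    return W_in, b

# Usage: revive neurons 1 and 3 of a 3-input, 4-neuron layer.
W = np.zeros((3, 4))
b = np.zeros(4)
dead = np.array([False, True, False, True])
W_new, b_new = revive_dead_neurons(W, b, dead, rng=np.random.default_rng(1))
```

In practice such a step would run periodically during training, driven by the same dead-neuron statistics used for detection, rather than as a one-off fix.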