Company
Date Published
Author
Conor Bronsdon
Word count
1617
Language
English
Hacker News points
None

Summary

Wasserstein Auto-Encoders (WAEs) were introduced to bridge the gap between the training stability of Variational Auto-Encoders (VAEs) and the image quality of Generative Adversarial Networks (GANs) by framing the generative objective through optimal transport theory. Experiments on the MNIST and CelebA datasets show that WAEs improve Fréchet Inception Distance (FID) by 12-33% over VAEs while training stably, though they face notable challenges in computational complexity, hyperparameter sensitivity, and kernel selection. The WAE-GAN variant delivers better sample quality but demands careful tuning, whereas WAE-MMD offers VAE-like stability without the instabilities of adversarial training. The experiments also highlight how important it is to match the aggregated posterior to the prior distribution: even slight mismatches can degrade sample quality. Despite these advances, real-world deployment of WAEs remains complex because of O(m²) computational demands, substantial hyperparameter re-tuning between datasets, and the need to choose an appropriate kernel. This underscores the need for evaluation beyond benchmark metrics before relying on generative models in practical applications.
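
To make the O(m²) point concrete, below is a minimal sketch (not the authors' code) of the MMD penalty used by WAE-MMD, assuming PyTorch and an inverse multiquadratic (IMQ) kernel; the function names and the bandwidth heuristic are illustrative. The penalty compares encoded latents against samples from the prior, and the full pairwise kernel matrices are what give the quadratic cost in the minibatch size m.

```python
import torch

def imq_kernel(x, y, scale=1.0):
    # x: (m, d), y: (n, d) -> (m, n) matrix of IMQ kernel values C / (C + ||x - y||^2).
    c = 2.0 * x.size(1) * scale          # illustrative bandwidth heuristic: C = 2 * d * scale
    sq_dists = torch.cdist(x, y) ** 2    # all pairwise squared distances -> O(m * n)
    return c / (c + sq_dists)

def mmd_penalty(z_encoded, z_prior):
    # Unbiased MMD^2 estimate between the aggregated posterior samples (z_encoded)
    # and prior samples (z_prior); both are (m, d) tensors.
    m = z_encoded.size(0)
    k_qq = imq_kernel(z_encoded, z_encoded)
    k_pp = imq_kernel(z_prior, z_prior)
    k_qp = imq_kernel(z_encoded, z_prior)
    off_diag = 1.0 - torch.eye(m, device=z_encoded.device)  # drop self-similarity terms
    mmd = ((k_qq * off_diag).sum() + (k_pp * off_diag).sum()) / (m * (m - 1)) \
          - 2.0 * k_qp.mean()
    return mmd

# Hypothetical usage: z = encoder(x); loss = recon_loss + lam * mmd_penalty(z, torch.randn_like(z))
```

Swapping this penalty for an adversarially trained discriminator on the latent space yields the WAE-GAN variant described above, which trades the stable, closed-form MMD estimate for potentially sharper samples at the cost of more delicate tuning.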