Company
-
Date Published
-
Author
Jim Lai
Word count
1364
Language
-
Hacker News points
None

Summary

DeLERP, or Decomposed Linear Interpolation, offers a novel approach to model merging by handling the direction and magnitude of neural network weights independently, addressing limitations of both traditional linear interpolation (LERP) and spherical linear interpolation (SLERP). LERP can weaken a model's representational capacity through a "norm dip", where the interpolated weights shrink in magnitude; SLERP corrects this but is computationally expensive and privileges the zero vector unnecessarily. Inspired by the work of Zheng et al., DeLERP uses Normalized Linear Interpolation (NLERP) for direction and a max-norm strategy for magnitude, giving smooth directional transitions without arbitrary geometric constraints. The max-norm strategy preserves the stronger importance signal from either model, maintaining representational capacity with minimal computational overhead. Tested with mergekit, DeLERP showed improvements in capability and alignment metrics when merging models, suggesting that preserving representational capacity can enhance both cognitive performance and safety behavior. It does not directly solve statistical issues such as variance collapse, but its combination of magnitude preservation and geometric direction interpolation may help mitigate them.
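As a rough illustration of the decomposition described above, here is a minimal NumPy sketch; the function name, the use of a single per-tensor norm, and the epsilon guard are assumptions for illustration, not mergekit's actual implementation:

```python
import numpy as np

def delerp(a: np.ndarray, b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Decomposed linear interpolation between weight tensors a and b.

    Direction is blended with normalized linear interpolation (NLERP);
    magnitude is taken as the larger of the two norms, preserving the
    stronger importance signal. Illustrative sketch only.
    """
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)

    # Unit-direction tensors; eps guards against zero-norm inputs.
    dir_a = a / max(norm_a, eps)
    dir_b = b / max(norm_b, eps)

    # NLERP: linearly interpolate the directions, then renormalize back
    # onto the unit sphere. Avoids SLERP's trigonometry and its special
    # treatment of the zero vector.
    blended = (1.0 - t) * dir_a + t * dir_b
    blended /= max(np.linalg.norm(blended), eps)

    # Max-norm strategy: keep the larger magnitude, so the merged weight
    # never suffers LERP's "norm dip".
    return blended * max(norm_a, norm_b)

# Example: a 50/50 merge of two weight tensors (hypothetical inputs).
merged = delerp(np.random.randn(4, 4), np.random.randn(4, 4), t=0.5)
```

NLERP only approximates SLERP's constant angular velocity, but for the small angles typical between related checkpoints the directional error is minor, which is consistent with the summary's claim of smooth transitions at minimal computational overhead.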