Yonatan Kahn, University of Toronto: Finite-width feature learning in deep networks with orthogonal weight initialization
Overview
- Date: starts 3 April 2025, 15:30; ends 3 April 2025, 16:30
- Venue: MV:L13, Chalmers tvärgata 3
- Language: English
Abstract (available in English only): Fully-connected deep neural networks with weights initialized from independent Gaussian distributions can be tuned to criticality, which prevents the exponential growth or decay of signals propagating through the network. However, such networks still exhibit fluctuations that grow linearly with the depth of the network, which may impair the training of networks whose width is comparable to their depth. I will present theoretical and experimental evidence that initializing the weights from the ensemble of orthogonal matrices leads to better training and generalization behavior even for deep networks, and argue that these results demonstrate the practical usefulness of finite-width perturbation theory.
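As an illustrative sketch (not part of the talk itself), the contrast between the two initialization ensembles can be seen in a deep *linear* network: orthogonal layers preserve signal norms exactly, while critically tuned Gaussian layers preserve them only on average, with depth-growing fluctuations. The function names and the QR-based Haar sampler below are our own illustrative choices, not the speaker's code.

```python
import numpy as np

def orthogonal_init(n, rng):
    # Sample from the orthogonal ensemble (Haar measure) via QR decomposition
    # of a Gaussian matrix; the sign correction makes the draw Haar-uniform.
    A = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(A)
    return Q * np.sign(np.diag(R))

def gaussian_init(n, rng):
    # Critical Gaussian initialization for a linear layer: variance 1/n keeps
    # the *expected* squared signal norm constant with depth.
    return rng.standard_normal((n, n)) / np.sqrt(n)

rng = np.random.default_rng(0)
n, depth = 64, 200

x = rng.standard_normal(n)
x /= np.linalg.norm(x)  # unit-norm input signal

x_orth, x_gauss = x.copy(), x.copy()
for _ in range(depth):
    x_orth = orthogonal_init(n, rng) @ x_orth
    x_gauss = gaussian_init(n, rng) @ x_gauss

# Orthogonal layers keep the norm at exactly 1 (up to float error);
# the Gaussian-initialized signal norm fluctuates, increasingly so with depth.
print(np.linalg.norm(x_orth))
print(np.linalg.norm(x_gauss))
```

With nonlinear activations the picture is more subtle (this is where criticality tuning and finite-width perturbation theory enter), but the linear case already shows the mechanism: the orthogonal ensemble removes the layer-by-layer norm fluctuations entirely.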