DNN architectures as random-matrix systems
Reading deep architectures (MLP-Mixer, attention, sparse MLPs) through random-matrix and Kronecker-structure lenses to expose implicit regularization.
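One way to make the Kronecker-structure reading concrete is the standard identity vec(AXB) = (Bᵀ ⊗ A) vec(X). The sketch below is illustrative only and is not taken from this page: it assumes a linear MLP-Mixer-style token-mixing map X ↦ WX and channel-mixing map X ↦ XV, and checks that each equals a wide, structured-sparse dense layer acting on vec(X). The sizes S and C, the Gaussian weights, and all helper names are hypothetical.

```python
import random

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    """Kronecker product of two nested-list matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def vec(X):
    """Column-major vectorization: stack the columns of X."""
    return [x for col in zip(*X) for x in col]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gaussian(rows, cols, rng):
    """Matrix with i.i.d. standard Gaussian entries (an assumed weight model)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

rng = random.Random(0)
S, C = 3, 4                      # hypothetical sizes: S tokens, C channels
X = gaussian(S, C, rng)          # input, tokens x channels
W = gaussian(S, S, rng)          # token-mixing weights
V = gaussian(C, C, rng)          # channel-mixing weights

# Token mixing: vec(W X) = (I_C kron W) vec(X)
token_layer = kron(identity(C), W)
token_err = max_abs_diff(matvec(token_layer, vec(X)), vec(matmul(W, X)))

# Channel mixing: vec(X V) = (V^T kron I_S) vec(X)
channel_layer = kron(transpose(V), identity(S))
channel_err = max_abs_diff(matvec(channel_layer, vec(X)), vec(matmul(X, V)))

# Fraction of nonzero entries in the equivalent (S*C) x (S*C) dense layer.
nonzero = sum(1 for row in token_layer for x in row if x != 0.0)
token_density = nonzero / (S * C) ** 2

print(token_err, channel_err, token_density)
```

Under these assumptions the equivalent token-mixing layer has only a 1/C fraction of nonzero weights, which is the sense in which a Mixer block can be read as a wide MLP with Kronecker-structured sparsity.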
Findings (1)
Connections
This topic …
uses Free Random Projection, Random matrix spectra, Asymptotic freeness