Machine learning
Subtree: 7 descendants, 6 findings, 1 note
DNN theory through random-matrix lenses — free random projection, dynamical isometry, meta-RL, training dynamics.
Members (7)
- theme: Dynamical isometry
Conditions under which signal propagation in deep networks preserves norms and gradients; spectral analysis of layer-wise Jacobians and Fisher information. A minimal isometry check is sketched after this list.
Featured: The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
- method: Orthogonal initialization
Initializing weight matrices as random orthogonal matrices to preserve singular values (illustrated in the sketch after this list).
- theme: DNN architectures as random-matrix systems
Reading deep architectures (MLP-Mixer, attention, sparse MLPs) through random-matrix and Kronecker-structure lenses to expose implicit regularization. The Kronecker-structure view is illustrated after this list.
Featured: Understanding MLP-Mixer as a Wide and Sparse MLP
- method: Free Random Projection
Random-projection method for building representations in in-context and meta-reinforcement learning.
Featured: Free Random Projection for In-Context Reinforcement Learning
- theme: Meta reinforcement learning
Learning algorithms that adapt to new tasks from limited interaction.
Featured: Free Random Projection for In-Context Reinforcement Learning
- theme: Reinforcement learning
Sequential decision-making under uncertainty; the umbrella over meta-RL adaptation and the VR-scene exploration policies in the adjacent thread.
- theme: Interpretability and training dynamics
Layer-wise interpretability via identity initialization, the implicit bias of gradient regularization, and selective forgetting / unlearning. A finite-difference sketch for the gradient-norm regularizer follows this list.
Featured: Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
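The sketches below are illustrative and not drawn from the featured papers. First, a minimal NumPy sketch of orthogonal initialization and a dynamical-isometry check: with orthogonal weights and a linear activation, the input-output Jacobian of a deep network has all singular values equal to 1, so norms and gradients pass through undistorted; an i.i.d. Gaussian baseline is included for contrast. Width, depth, and the RNG seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth = 64, 20

def random_orthogonal(n):
    # QR decomposition of a Gaussian matrix gives a random orthogonal matrix;
    # fixing the signs via R's diagonal makes the draw (approximately) Haar.
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

# Orthogonal initialization: every layer matrix has all singular values equal to 1.
weights = [random_orthogonal(width) for _ in range(depth)]

# For a deep *linear* network the input-output Jacobian is the product of the
# layer matrices; nonlinear activations would insert diagonal derivative factors.
jac_orth = np.eye(width)
jac_gauss = np.eye(width)
for W in weights:
    jac_orth = W @ jac_orth
    jac_gauss = (rng.standard_normal((width, width)) / np.sqrt(width)) @ jac_gauss

sv_orth = np.linalg.svd(jac_orth, compute_uv=False)
sv_gauss = np.linalg.svd(jac_gauss, compute_uv=False)
print("orthogonal init  :", sv_orth.min(), "...", sv_orth.max())    # all ~1.0
print("iid Gaussian init:", sv_gauss.min(), "...", sv_gauss.max())  # spectrum spreads with depth
```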
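Second, a sketch of the Kronecker-structure lens on MLP-Mixer-style layers. It only demonstrates the standard identity vec(A X B^T) = (B kron A) vec(X): a token-mixing matrix acting on the left and a channel-mixing matrix acting on the right of a token-by-channel activation matrix are jointly equivalent to one wide, Kronecker-structured weight on the flattened activations. This illustrates the lens, not the construction in the featured paper; the shapes and names (tokens, channels, W_t, W_c) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
tokens, channels = 8, 16

X = rng.standard_normal((tokens, channels))      # activations: tokens x channels
W_t = rng.standard_normal((tokens, tokens))      # token-mixing weight (acts on the left)
W_c = rng.standard_normal((channels, channels))  # channel-mixing weight (acts on the right)

# Mixer-style linear part: mix across tokens, then across channels.
mixed = W_t @ X @ W_c.T

# Wide-MLP view: one Kronecker-structured weight acting on vec(X).
# Column-major vec matches the identity vec(A @ X @ B.T) == kron(B, A) @ vec(X).
wide_weight = np.kron(W_c, W_t)                  # (tokens*channels) x (tokens*channels), but few free parameters
vec_mixed = wide_weight @ X.flatten(order="F")

assert np.allclose(vec_mixed, mixed.flatten(order="F"))
print("Kronecker-structured wide weight reproduces the two mixing steps.")
```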
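Third, a sketch related to the gradient-regularization entry: the gradient of a gradient-norm penalty R(theta) = ||grad L(theta)||^2 equals 2 H(theta) grad L(theta), and the Hessian-vector product can be approximated by a finite difference of two gradient evaluations. This shows the generic finite-difference idea only and is not claimed to be the exact scheme of the featured paper; the toy loss and step size eps are arbitrary.

```python
import numpy as np

def loss(theta):
    # Toy loss with a simple closed-form gradient and Hessian.
    return np.sum(np.sin(theta))

def grad(theta):
    return np.cos(theta)

def hess(theta):
    return np.diag(-np.sin(theta))                # used only to check the approximation

theta = np.array([0.3, -1.2, 0.7])
g = grad(theta)
eps = 1e-5

# grad ||grad L||^2 = 2 * H(theta) @ grad L(theta); the Hessian-vector product
# H @ g is approximated by (grad(theta + eps*g) - grad(theta)) / eps.
hvp_fd = (grad(theta + eps * g) - g) / eps
grad_R_fd = 2.0 * hvp_fd
grad_R_exact = 2.0 * hess(theta) @ g

print("max abs error:", np.max(np.abs(grad_R_fd - grad_R_exact)))  # small, ~O(eps)
```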
Findings — subtree (6)
Papers (6)
- Free Random Projection for In-Context Reinforcement Learning
- Understanding MLP-Mixer as a Wide and Sparse MLP
- Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
- Layer-Wise Interpretation of Deep Neural Networks Using Identity Initialization
- The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
- Selective Forgetting of Deep Networks at a Finer Level than Samples
Notes — subtree (1)
Connections
No topic connections yet.