ν-Tangent Kernels
With Akshunna S. Dogra (Imperial College)
Machine learning (ML) has been profitably leveraged across a wide variety of problems in recent years. Empirical observations show that ML models from suitable functional spaces are capable of adequately efficient learning across a wide variety of disciplines. In this work (the first in a planned sequence of three), we build the foundations for a generic perspective on ML model optimization and generalization dynamics. Specifically, we prove that under variants of gradient descent, "well-initialized" models solve sufficiently well-posed problems at rates that are determinable *a priori* or *in situ*. Notably, these results are obtained for a wider class of problems, loss functions, and models than the mean squared error loss and large-width regime that are the focus of conventional Neural Tangent Kernel (NTK) analysis. The ν-Tangent Kernel (νTK), a functional analytic object reminiscent of the NTK, emerges naturally as a key object in our analysis, and its properties control the learning dynamics.
We exemplify the power of the proposed perspective by showing that it applies to diverse practical problems solved using real ML models, such as classification tasks, data/regression fitting, differential equations, and shape observable analysis. We end with a brief discussion of the numerical evidence and of the role νTKs may play in characterizing the search phase of optimization, which produces the "well-initialized" models that are the crux of this work.
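For context (this is background, not part of the abstract, and the νTK itself is defined in the paper rather than here): the empirical neural tangent kernel that the νTK is reminiscent of is the Gram matrix of parameter gradients of the model output, K(x, x') = ⟨∇_θ f(x; θ), ∇_θ f(x'; θ)⟩, evaluated at the current parameters. A minimal JAX sketch of this standard object, using a hypothetical toy MLP:

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(2, 32, 1)):
    """Hypothetical toy MLP parameters: a list of (weight, bias) pairs."""
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def f(params, x):
    """Scalar-output network evaluated on a batch x of shape (n, d)."""
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return (x @ W + b).squeeze(-1)

def empirical_tangent_kernel(params, x1, x2):
    """Empirical NTK: K[i, j] = <grad_theta f(x1_i), grad_theta f(x2_j)>."""
    def flat_jac(x):
        # Jacobian of the batched output w.r.t. every parameter, flattened per example.
        jac = jax.jacobian(f)(params, x)
        leaves = jax.tree_util.tree_leaves(jac)
        return jnp.concatenate([l.reshape(l.shape[0], -1) for l in leaves], axis=1)
    return flat_jac(x1) @ flat_jac(x2).T

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (5, 2))
K = empirical_tangent_kernel(params, x, x)   # (5, 5) symmetric PSD Gram matrix
print(K.shape)
```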
- Speaker: Akshunna S. Dogra (Imperial College)
- Thursday 14 March 2024, 15:00–16:00
- Venue: Centre for Mathematical Sciences, MR14.
- Series: Applied and Computational Analysis; organiser: Nicolas Boulle.