From Classical Statistics to Modern Deep Learning (in Russian)
Speaker: Prof. Mikhail (Misha) Belkin, University of California, San Diego

Recent empirical successes of deep learning have exposed significant gaps in our fundamental understanding of learning and optimization mechanisms. Modern best practices for model selection are in direct contradiction to the methodologies suggested by classical analyses. Similarly, the efficiency of the SGD-based local methods used in training modern models appears at odds with standard intuitions about optimization. I will present evidence, empirical and mathematical, that necessitates revisiting classical statistical notions, such as overfitting. I will then discuss the emerging understanding of generalization, and, in particular, the double descent risk curve, which extends the classical U-shaped generalization curve beyond the point of interpolation. While our understanding has grown significantly in the last few years, a key piece of the puzzle remains: how does optimization align with statistics to form the complete mathematical picture of modern machine learning?
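The double descent curve mentioned in the abstract can be reproduced in a few lines. Below is a minimal sketch, not taken from the talk: it fits random Fourier feature models of increasing width with a minimum-norm least-squares solution (via the pseudoinverse) and prints the test error. All sample sizes, frequencies, and function names here are illustrative assumptions. As the feature count passes the interpolation threshold (number of features equals the number of training points), the test error typically rises to a peak and then descends again.

```python
# Minimal double descent sketch (illustrative assumptions throughout):
# minimum-norm least squares on random Fourier features of growing width.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression problem: noisy samples of a smooth target.
n_train, n_test, noise = 40, 500, 0.2
x_train = rng.uniform(-1, 1, n_train)
x_test = rng.uniform(-1, 1, n_test)
target = lambda x: np.sin(2 * np.pi * x)
y_train = target(x_train) + noise * rng.standard_normal(n_train)
y_test = target(x_test)

def rff(x, w, b):
    """Random Fourier feature map: one cos(w*x + b) column per feature."""
    return np.cos(np.outer(x, w) + b)

for n_features in [5, 10, 20, 40, 80, 160, 320]:
    w = rng.normal(0, 10, n_features)          # random frequencies
    b = rng.uniform(0, 2 * np.pi, n_features)  # random phases
    Phi_train = rff(x_train, w, b)
    Phi_test = rff(x_test, w, b)
    # Pseudoinverse gives the minimum-norm least-squares coefficients;
    # once n_features >= n_train the model interpolates the training data.
    coef = np.linalg.pinv(Phi_train) @ y_train
    test_mse = np.mean((Phi_test @ coef - y_test) ** 2)
    print(f"{n_features:4d} features: test MSE = {test_mse:.3f}")
```

The choice of the minimum-norm interpolant is deliberate: among all solutions that fit the training data exactly, the pseudoinverse selects the one with the smallest coefficient norm, which is the implicit bias of gradient descent on least squares and one of the mechanisms behind the second descent.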