Deep Networks and the Multiple Manifold Problem (John Wright, Columbia U.), Day 2 Talk 3
We study the multiple manifold problem, a binary classification task modeled on applications in machine vision, in which a deep fully-connected neural network is trained to separate two low-dimensional submanifolds of the unit sphere. We provide an analysis of the one-dimensional case, proving for a simple manifold configuration that when the network depth L is large relative to certain geometric and statistical properties of the data, the network width n grows as a sufficiently large polynomial in L, and the number of samples from the manifolds is polynomial in L, randomly-initialized gradient descent rapidly learns to classify the two manifolds perfectly with high probability. Our analysis demonstrates concrete benefits of depth and width in the context of a practically-motivated model problem: the depth acts as a fitting resource, with larger depths corresponding to smoother networks that can more readily separate the class manifolds, and the width acts as a statistical resource, enabling concentr
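For readers who want a concrete picture of the setup, the following is a minimal sketch of a toy instance, not the speaker's construction. It assumes two circles of latitude on the unit sphere in R^3 as the one-dimensional class manifolds, and a fully-connected ReLU network with Gaussian initialization. The heights, the width, the depth, the initialization scale, and the helper names `sample_circle`, `init_network`, and `forward` are all hypothetical choices made for illustration. It only builds the data and evaluates the randomly initialized network; no gradient descent step is included.

```python
import numpy as np

def sample_circle(n_points, height, rng):
    """Sample points uniformly from the circle z = height on the unit sphere in R^3."""
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n_points)
    radius = np.sqrt(1.0 - height ** 2)  # keeps every point at unit norm
    return np.stack(
        [radius * np.cos(theta), radius * np.sin(theta), np.full(n_points, height)],
        axis=1,
    )

def init_network(input_dim, width, depth, rng):
    """Gaussian weights for `depth` hidden ReLU layers plus one linear output layer.

    The variance scaling (2 / fan_in for hidden layers) is an assumed choice.
    """
    weights = []
    fan_in = input_dim
    for _ in range(depth):
        weights.append(rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(width, fan_in)))
        fan_in = width
    weights.append(rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(1, fan_in)))
    return weights

def forward(weights, x):
    """Evaluate the network on a batch x of shape (num_points, input_dim)."""
    h = x.T
    for w in weights[:-1]:
        h = np.maximum(w @ h, 0.0)  # ReLU hidden layer
    return (weights[-1] @ h).ravel()  # one scalar output per point

rng = np.random.default_rng(0)

# Two one-dimensional class manifolds (assumed: circles at heights +0.5 and -0.5).
x_pos = sample_circle(50, 0.5, rng)
x_neg = sample_circle(50, -0.5, rng)
X = np.concatenate([x_pos, x_neg])
y = np.concatenate([np.ones(50), -np.ones(50)])  # labels +1 / -1

# Depth L = 5 and width n = 64 are arbitrary illustrative values.
weights = init_network(input_dim=3, width=64, depth=5, rng=rng)
outputs = forward(weights, X)
```

In this sketch, `depth` plays the role of L and `width` the role of n from the abstract; training would then adjust `weights` so that the sign of `outputs` matches `y` on both circles.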