Vision Deep Learning
In computer vision, deep learning has proven useful to extract patterns from images. Deep learning uses a neural network and optimization to relate features (pixels) to a desired label. As opposed to Cascade Classifiers, deep learning does not need specialized preprocessing of the image to develop applicationspecific features. The pixels from the image are processed through multiple linear and nonlinear layers to predict an output. Deep learning generally requires many thousands of labeled examples to learn. A Convolutional Neural Network (CNN) transforms the input image with a specialized connectivity structure. It stacks multiple stages of feature extractors. The higher stages compute more global, invariant features with a classification layer at the end. Feedforward feature extraction convolves input with learned filters, transforms with nonlinearity (sigmoid, hyperbolic tangent, rectified linear units), performs spatial pooling, and finally normalizes to create a feature map. With convolution the dep
|
|