Scene understanding based on Multi-Scale Pooling of deep learning features
DongYang Li, Yue Zhou
Available Online April 2015.
- https://doi.org/10.2991/amcce-15.2015.308
- CNNs; MOP-CNN; SPP-net; Scene understanding
- Deep convolutional neural networks (CNNs) have recently shown impressive performance as a generic representation for recognition. However, the features extracted from global CNNs lack geometric invariance, which limits their robustness for classification and detection of highly variable objects. To improve the invariance of the features without degrading their discriminative power, and to speed up computation, we adopt the following two methods. First, we use the scheme called multi-scale orderless pooling (MOP-CNN), which extracts CNN activations from local patches of the image at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the results. Second, to speed up computation, we adopt SPP-net as the CNN architecture. With SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This avoids repeatedly computing the convolutional features. On the challenging SUN397 scene classification dataset, our method achieves competitive classification results.
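The orderless VLAD pooling step of MOP-CNN can be sketched as follows. This is not the paper's exact implementation: the descriptor dimension, patch count, and codebook centers below are toy assumptions, and real MOP-CNN uses CNN activations of local patches as the descriptors. The key property illustrated is that any number of local descriptors is pooled into one fixed-length vector.

```python
import numpy as np

def vlad_pool(descriptors, centers):
    """Orderless VLAD pooling: assign each local descriptor to its
    nearest codebook center, accumulate residuals per center, then
    flatten and L2-normalize into a fixed-length K*D vector."""
    # Hard assignment: index of the nearest center for each descriptor.
    dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    assign = np.argmin(dists, axis=1)
    K, D = centers.shape
    vlad = np.zeros((K, D))
    for k in range(K):
        mask = assign == k
        if mask.any():
            # Sum of residuals (descriptor - center) for this codebook cell.
            vlad[k] = (descriptors[mask] - centers[k]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Toy example: 100 random 64-D "patch activations", 8 codebook centers.
rng = np.random.default_rng(0)
desc = rng.standard_normal((100, 64))
centers = rng.standard_normal((8, 64))
v = vlad_pool(desc, centers)
print(v.shape)  # (512,) — fixed 8*64 length regardless of patch count
```

In MOP-CNN this pooling is applied per scale level, and the per-level vectors are concatenated to form the final representation.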
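The fixed-length pooling that SPP-net performs over an arbitrary region can likewise be sketched with a small numpy function. The pyramid levels (1, 2, 4) and the max-pooling choice here are illustrative assumptions; the point is that the output length depends only on the channel count and the pyramid, not on the region's spatial size.

```python
import numpy as np

def spp(feature_map, levels=(1, 2, 4)):
    """Spatial pyramid pooling sketch: max-pool a C x H x W feature map
    over an n x n grid for each pyramid level and concatenate, giving a
    vector of length C * sum(n*n) for any H, W (assumes H, W >= max level)."""
    C, H, W = feature_map.shape
    out = []
    for n in levels:
        # Bin edges that cover the map even when H, W are not divisible by n.
        hs = np.linspace(0, H, n + 1).round().astype(int)
        ws = np.linspace(0, W, n + 1).round().astype(int)
        for i in range(n):
            for j in range(n):
                cell = feature_map[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                out.append(cell.max(axis=(1, 2)))  # per-channel max in the bin
    return np.concatenate(out)

# Toy example: a 3-channel 7x9 feature map still yields 3*(1+4+16) = 63 values.
fm = np.random.default_rng(1).standard_normal((3, 7, 9))
v = spp(fm)
print(v.shape)  # (63,) — fixed length despite the odd 7x9 map
```

Because every sub-image pools to the same length, the convolutional feature maps of the full image can be computed once and cropped regions pooled cheaply, which is how SPP-net avoids recomputing convolutions per region.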
- Open Access
- This is an open access article distributed under the CC BY-NC license.
Cite this article
TY  - CONF
AU  - DongYang Li
AU  - Yue Zhou
PY  - 2015/04
DA  - 2015/04
TI  - Scene understanding based on Multi-Scale Pooling of deep learning features
BT  - 2015 International Conference on Automation, Mechanical Control and Computational Engineering
PB  - Atlantis Press
SN  - 1951-6851
UR  - https://doi.org/10.2991/amcce-15.2015.308
DO  - 10.2991/amcce-15.2015.308
ID  - Li2015/04
ER  -