
Fine-grained images share a similar global structure but differ in local appearance. Bilinear pooling models have proven effective at modeling different semantic parts and learning discriminative features for fine-grained image classification. However, existing bilinear models overlook two points: convolutional neural networks (CNNs) may lose important semantic information during forward propagation, and interactions between features from different convolutional layers can enhance feature learning and thereby improve classification performance. We therefore propose a multi-layer weight-aware bilinear pooling method that models cross-layer interactions between object-part features as the feature representation and assigns a weight to each convolutional layer, adaptively adjusting the layer outputs to highlight more discriminative features. We demonstrate the effectiveness of the method on the CUB-200-2011 and FGVC-Aircraft datasets, where it improves substantially on previous state-of-the-art approaches.
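To make the idea of weighted cross-layer bilinear pooling concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not specify how the layer weights are parameterized or normalized, so the softmax over per-layer scores, the signed square root with L2 normalization, and the function names (`softmax`, `bilinear_pool`, `normalize`, `weighted_cross_layer_pool`) are all hypothetical choices. The sketch also assumes that every layer's feature map has already been brought to the same number of spatial locations.

```python
import math
from itertools import combinations


def softmax(scores):
    """Turn raw per-layer scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bilinear_pool(x, y):
    """Average the outer product of two feature maps over spatial locations.

    x has C1 rows and y has C2 rows; each row holds one channel's values
    at the same N locations. Returns a flat list of length C1 * C2.
    """
    n = len(x[0])
    return [sum(a * b for a, b in zip(row_x, row_y)) / n
            for row_x in x for row_y in y]


def normalize(v, eps=1e-12):
    """Signed square root followed by L2 normalization (an assumed choice)."""
    rooted = [math.copysign(math.sqrt(abs(t)), t) for t in v]
    norm = math.sqrt(sum(t * t for t in rooted))
    return [t / (norm + eps) for t in rooted]


def weighted_cross_layer_pool(layers, layer_scores):
    """Scale each layer by its weight, then pool every pair of distinct layers.

    layers: list of feature maps, each a list of channel rows.
    layer_scores: one raw score per layer (learned in practice; assumed here).
    """
    weights = softmax(layer_scores)
    scaled = [[[w * v for v in row] for row in layer]
              for layer, w in zip(layers, weights)]
    descriptor = []
    for a, b in combinations(range(len(scaled)), 2):
        descriptor.extend(normalize(bilinear_pool(scaled[a], scaled[b])))
    return descriptor
```

With three layers of two channels each, the three layer pairs each contribute a 2 x 2 interaction block, so the concatenated descriptor has 12 entries. In a trained model the layer scores would be learnable parameters and the descriptor would feed a classifier; both are outside the scope of this sketch.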
