MHA-WoML: Multi-head attention and Wasserstein-OT for few-shot learning

Publisher
Springer Journals
Copyright
Copyright © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2022. Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ISSN
2192-6611
eISSN
2192-662X
DOI
10.1007/s13735-022-00254-5

Abstract

Few-shot learning aims to classify novel classes with extremely few labeled samples. Existing metric-learning-based approaches tend to employ off-the-shelf CNN models for feature extraction and conventional clustering algorithms for feature matching. These methods neglect the importance of image regions and can become trapped in over-fitting during feature clustering. In this work, we propose a novel MHA-WoML framework for few-shot learning, which adaptively focuses on semantically dominant regions and effectively relieves the over-fitting problem. Specifically, we first design a hierarchical multi-head attention (MHA) module, which consists of three functional heads (i.e., a rare head, a syntactic head, and a positional head) with masks, to extract comprehensive image features and screen out invalid features. The MHA module performs better than current transformers in few-shot recognition. Then, we incorporate optimal transport theory into the Wasserstein distance and propose a Wasserstein-OT metric learning (WoML) module for category clustering. The WoML module focuses on calculating an appropriately approximate barycenter, avoiding overly accurate sub-stage fitting that could threaten the global fit, and thus alleviates over-fitting during training. Experimental results show that our approach scores about 3% higher accuracy than current state-of-the-art methods across four benchmark datasets: MiniImageNet, TieredImageNet, CIFAR-FS, and CUB200.
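
To make the per-head masking idea concrete, below is a minimal NumPy sketch of multi-head self-attention in which each head applies its own binary mask over region positions before the softmax, so screened-out regions receive zero attention weight. The abstract only names the three heads (rare, syntactic, positional) without specifying how their masks are built, so the head roles and the random masks in this sketch are placeholders, not the authors' construction.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)   # stabilized softmax
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def masked_multi_head_attention(x, head_weights, head_masks):
        """Self-attention over n region features x (shape (n, d)), with one
        binary (n, n) mask per head: 1 = attend, 0 = screened out.

        head_weights is a list of (Wq, Wk, Wv) tuples, each matrix (d, dh).
        The per-head masks are where a rare/syntactic/positional split
        would plug in; here they are just placeholders.
        """
        outputs = []
        for (Wq, Wk, Wv), mask in zip(head_weights, head_masks):
            q, k, v = x @ Wq, x @ Wk, x @ Wv
            scores = (q @ k.T) / np.sqrt(k.shape[-1])            # scaled dot-product
            scores = np.where(mask.astype(bool), scores, -1e9)   # screen out regions
            outputs.append(softmax(scores) @ v)
        return np.concatenate(outputs, axis=-1)                  # concatenate heads

    # Toy run: 49 regions (a flattened 7x7 feature map), 3 heads of width 16.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((49, 64))
    heads = [tuple(rng.standard_normal((64, 16)) for _ in range(3)) for _ in range(3)]
    masks = [rng.integers(0, 2, size=(49, 49)) for _ in range(3)]
    out = masked_multi_head_attention(x, heads, masks)           # -> shape (49, 48)

The point of the sketch is only that masking happens per head and before normalization; in the paper's setting the masks would come from the hierarchical MHA design rather than being random.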
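
The WoML side can be sketched in the same spirit. A standard way to approximate a Wasserstein distance is entropic regularization solved with Sinkhorn iterations, where the regularization strength eps tunes how loose the approximation is. Whether the paper uses exactly this scheme is not stated in the abstract, so treat the following as a generic Sinkhorn sketch with an invented prototype/query setup, not the authors' WoML module.

    import numpy as np

    def sinkhorn_distance(a, b, cost, eps=0.1, n_iters=200):
        """Entropically regularized OT cost (Sinkhorn) between histograms.

        a, b : 1-D marginals, each summing to 1.
        cost : (len(a), len(b)) ground-cost matrix, scaled to [0, 1].
        eps  : regularization strength; larger values give a smoother,
               deliberately approximate transport plan.
        """
        K = np.exp(-cost / eps)                  # Gibbs kernel
        u = np.ones_like(a)
        for _ in range(n_iters):                 # alternate marginal scalings
            v = b / (K.T @ u)
            u = a / (K @ v)
        plan = u[:, None] * K * v[None, :]       # approximate transport plan
        return float(np.sum(plan * cost))

    # Toy episode: five class "prototypes" and one query, as 16-bin histograms.
    rng = np.random.default_rng(0)
    protos = rng.random((5, 16))
    protos /= protos.sum(axis=1, keepdims=True)
    query = rng.random(16)
    query /= query.sum()
    cost = np.abs(np.subtract.outer(np.arange(16.0), np.arange(16.0)))
    cost /= cost.max()
    scores = [sinkhorn_distance(query, p, cost) for p in protos]
    print("nearest prototype:", int(np.argmin(scores)))   # lowest OT cost wins

Larger eps blurs the transport plan, which is the knob that trades sub-stage accuracy for smoother global behavior, in the spirit of the abstract's argument about avoiding overly accurate sub-stage fitting.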

Journal

International Journal of Multimedia Information Retrieval, Springer Journals

Published: Sep 21, 2022

Keywords: Few-shot learning; Hierarchical multi-head attention; Wasserstein distance; Optimal transport theory
