2022-11-10 更新
EEG-Fest: Few-shot based Attention Network for Driver’s Vigilance Estimation with EEG Signals
Authors:Ning Ding, Ce Zhang, Azim Eskandarian
A lack of driver’s vigilance is the main cause of most vehicle crashes. Electroencephalography(EEG) has been reliable and efficient tool for drivers’ drowsiness estimation. Even though previous studies have developed accurate and robust driver’s vigilance detection algorithms, these methods are still facing challenges on following areas: (a) small sample size training, (b) anomaly signal detection, and (c) subject-independent classification. In this paper, we propose a generalized few-shot model, namely EEG-Fest, to improve aforementioned drawbacks. The EEG-Fest model can (a) classify the query sample’s drowsiness with a few samples, (b) identify whether a query sample is anomaly signals or not, and (c) achieve subject independent classification. The proposed algorithm achieves state-of-the-art results on the SEED-VIG dataset and the SADT dataset. The accuracy of the drowsy class achieves 92% and 94% for 1-shot and 5-shot support samples in the SEED-VIG dataset, and 62% and 78% for 1-shot and 5-shot support samples in the SADT dataset.
PDF Submitted to peer review journal for review
点此查看论文截图
ConsPrompt: Easily Exploiting Contrastive Samples for Few-shot Prompt Learning
Authors:Jinta Weng, Yue Hu, Zhihong Tian, Heyan Huang
Prompt learning recently become an effective linguistic tool to motivate the PLMs’ knowledge on few-shot-setting tasks. However, studies have shown the lack of robustness still exists in prompt learning, since suitable initialization of continuous prompt and expert-first manual prompt are essential in fine-tuning process. What is more, human also utilize their comparative ability to motivate their existing knowledge for distinguishing different examples. Motivated by this, we explore how to use contrastive samples to strengthen prompt learning. In detail, we first propose our model ConsPrompt combining with prompt encoding network, contrastive sampling module, and contrastive scoring module. Subsequently, two sampling strategies, similarity-based and label-based strategies, are introduced to realize differential contrastive learning. The effectiveness of proposed ConsPrompt is demonstrated in five different few-shot learning tasks and shown the similarity-based sampling strategy is more effective than label-based in combining contrastive learning. Our results also exhibits the state-of-the-art performance and robustness in different few-shot settings, which proves that the ConsPrompt could be assumed as a better knowledge probe to motivate PLMs.
PDF
点此查看论文截图
Elimination of Non-Novel Segments at Multi-Scale for Few-Shot Segmentation
Authors:Alper Kayabaşı, Gülin Tüfekci, İlkay Ulusoy
Few-shot segmentation aims to devise a generalizing model that segments query images from unseen classes during training with the guidance of a few support images whose class tally with the class of the query. There exist two domain-specific problems mentioned in the previous works, namely spatial inconsistency and bias towards seen classes. Taking the former problem into account, our method compares the support feature map with the query feature map at multi scales to become scale-agnostic. As a solution to the latter problem, a supervised model, called as base learner, is trained on available classes to accurately identify pixels belonging to seen classes. Hence, subsequent meta learner has a chance to discard areas belonging to seen classes with the help of an ensemble learning model that coordinates meta learner with the base learner. We simultaneously address these two vital problems for the first time and achieve state-of-the-art performances on both PASCAL-5i and COCO-20i datasets.
PDF Accepted to WACV 2023
点此查看论文截图
RAPS: A Novel Few-Shot Relation Extraction Pipeline with Query-Information Guided Attention and Adaptive Prototype Fusion
Authors:Yuzhe Zhang, Min Cen, Tongzhou Wu, Hong Zhang
Few-shot relation extraction (FSRE) aims at recognizing unseen relations by learning with merely a handful of annotated instances. To generalize to new relations more effectively, this paper proposes a novel pipeline for the FSRE task based on queRy-information guided Attention and adaptive Prototype fuSion, namely RAPS. Specifically, RAPS first derives the relation prototype by the query-information guided attention module, which exploits rich interactive information between the support instances and the query instances, in order to obtain more accurate initial prototype representations. Then RAPS elaborately combines the derived initial prototype with the relation information by the adaptive prototype fusion mechanism to get the integrated prototype for both train and prediction. Experiments on the benchmark dataset FewRel 1.0 show a significant improvement of our method against state-of-the-art methods.
PDF 9 pages, 2 figures
点此查看论文截图
FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments
Authors:Jishnu Jaykumar P, Yu-Wei Chao, Yu Xiang
We introduce the Few-Shot Object Learning (FewSOL) dataset for object recognition with a few images per object. We captured 336 real-world objects with 9 RGB-D images per object from different views. Object segmentation masks, object poses and object attributes are provided. In addition, synthetic images generated using 330 3D object models are used to augment the dataset. We investigated (i) few-shot object classification and (ii) joint object segmentation and few-shot classification with the state-of-the-art methods for few-shot learning and meta-learning using our dataset. The evaluation results show that there is still a large margin to be improved for few-shot object classification in robotic environments. Our dataset can be used to study a set of few-shot object recognition problems such as classification, detection and segmentation, shape reconstruction, pose estimation, keypoint correspondences and attribute recognition. The dataset and code are available at https://irvlutd.github.io/FewSOL.
PDF
点此查看论文截图
Prototypical quadruplet for few-shot class incremental learning
Authors:Sanchar Palit, Biplab Banerjee, Subhasis Chaudhuri
Many modern computer vision algorithms suffer from two major bottlenecks: scarcity of data and learning new tasks incrementally. While training the model with new batches of data the model looses it’s ability to classify the previous data judiciously which is termed as catastrophic forgetting. Conventional methods have tried to mitigate catastrophic forgetting of the previously learned data while the training at the current session has been compromised. The state-of-the-art generative replay based approaches use complicated structures such as generative adversarial network (GAN) to deal with catastrophic forgetting. Additionally, training a GAN with few samples may lead to instability. In this work, we present a novel method to deal with these two major hurdles. Our method identifies a better embedding space with an improved contrasting loss to make classification more robust. Moreover, our approach is able to retain previously acquired knowledge in the embedding space even when trained with new classes. We update previous session class prototypes while training in such a way that it is able to represent the true class mean. This is of prime importance as our classification rule is based on the nearest class mean classification strategy. We have demonstrated our results by showing that the embedding space remains intact after training the model with new classes. We showed that our method preformed better than the existing state-of-the-art algorithms in terms of accuracy across different sessions.
PDF
点此查看论文截图
Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition
Authors:Youcheng Huang, Wenqiang Lei, Jie Fu, Jiancheng Lv
Incorporating large-scale pre-trained models with the prototypical neural networks is a de-facto paradigm in few-shot named entity recognition. Existing methods, unfortunately, are not aware of the fact that embeddings from pre-trained models contain a prominently large amount of information regarding word frequencies, biasing prototypical neural networks against learning word entities. This discrepancy constrains the two models’ synergy. Thus, we propose a one-line-code normalization method to reconcile such a mismatch with empirical and theoretical grounds. Our experiments based on nine benchmark datasets show the superiority of our method over the counterpart models and are comparable to the state-of-the-art methods. In addition to the model enhancement, our work also provides an analytical viewpoint for addressing the general problems in few-shot name entity recognition or other tasks that rely on pre-trained models or prototypical neural networks.
PDF Findings of EMNLP 2022
点此查看论文截图
Simple Questions Generate Named Entity Recognition Datasets
Authors:Hyunjae Kim, Jaehyo Yoo, Seunghyun Yoon, Jinhyuk Lee, Jaewoo Kang
Recent named entity recognition (NER) models often rely on human-annotated datasets, requiring the significant engagement of professional knowledge on the target domain and entities. This research introduces an ask-to-generate approach that automatically generates NER datasets by asking questions in simple natural language to an open-domain question answering system (e.g., “Which disease?”). Despite using fewer in-domain resources, our models, solely trained on the generated datasets, largely outperform strong low-resource models by an average F1 score of 19.4 for six popular NER benchmarks. Furthermore, our models provide competitive performance with rich-resource models that additionally leverage in-domain dictionaries provided by domain experts. In few-shot NER, we outperform the previous best model by an F1 score of 5.2 on three benchmarks and achieve new state-of-the-art performance.
PDF EMNLP 2022. Code and datasets available at https://github.com/dmis-lab/GeNER
点此查看论文截图
KGTN-ens: Few-Shot Image Classification with Knowledge Graph Ensembles
Authors:Dominik Filipiak, Anna Fensel, Agata Filipowska
We propose KGTN-ens, a framework extending the recent Knowledge Graph Transfer Network (KGTN) in order to incorporate multiple knowledge graph embeddings at a small cost. We evaluate it with different combinations of embeddings in a few-shot image classification task. We also construct a new knowledge source - Wikidata embeddings - and evaluate it with KGTN and KGTN-ens. Our approach outperforms KGTN in terms of the top-5 accuracy on the ImageNet-FS dataset for the majority of tested settings.
PDF
点此查看论文截图
How Optimal is Greedy Decoding for Extractive Question Answering?
Authors:Or Castel, Ori Ram, Avia Efrat, Omer Levy
Fine-tuned language models use greedy decoding to answer reading comprehension questions with relative success. However, this approach does not ensure that the answer is a span in the given passage, nor does it guarantee that it is the most probable one. Does greedy decoding actually perform worse than an algorithm that does adhere to these properties? To study the performance and optimality of greedy decoding, we present exact-extract, a decoding algorithm that efficiently finds the most probable answer span in the context. We compare the performance of T5 with both decoding algorithms on zero-shot and few-shot extractive question answering. When no training examples are available, exact-extract significantly outperforms greedy decoding. However, greedy decoding quickly converges towards the performance of exact-extract with the introduction of a few training examples, becoming more extractive and increasingly likelier to generate the most probable span as the training set grows. We also show that self-supervised training can bias the model towards extractive behavior, increasing performance in the zero-shot setting without resorting to annotated examples. Overall, our results suggest that pretrained language models are so good at adapting to extractive question answering, that it is often enough to fine-tune on a small training set for the greedy algorithm to emulate the optimal decoding strategy.
PDF AKBC 2022 12 pages, 3 figures
点此查看论文截图
Collaboration of Pre-trained Models Makes Better Few-shot Learner
Authors:Renrui Zhang, Bohao Li, Wei Zhang, Hao Dong, Hongsheng Li, Peng Gao, Yu Qiao
Few-shot classification requires deep neural networks to learn generalized representations only from limited training images, which is challenging but significant in low-data regimes. Recently, CLIP-based methods have shown promising few-shot performance benefited from the contrastive language-image pre-training. Based on this point, we question if the large-scale pre-training can alleviate the few-shot data deficiency and also assist the representation learning by the pre-learned knowledge. In this paper, we propose CoMo, a Collaboration of pre-trained Models that incorporates diverse prior knowledge from various pre-training paradigms for better few-shot learning. Our CoMo includes: CLIP’s language-contrastive knowledge, DINO’s vision-contrastive knowledge, and DALL-E’s language-generative knowledge. Specifically, CoMo works in two aspects: few-shot data expansion and diverse knowledge ensemble. For one, we generate synthetic images via zero-shot DALL-E to enrich the few-shot training data without any manpower. For the other, we introduce a learnable Multi-Knowledge Adapter (MK-Adapter) to adaptively blend the predictions from CLIP and DINO. By such collaboration, CoMo can fully unleash the potential of different pre-training methods and unify them to perform state-of-the-art for few-shot classification. We conduct extensive experiments on 11 datasets to demonstrate the superiority and generalization ability of our approach.
PDF 10 pages, 6 figures
点此查看论文截图
Alleviating Sparsity of Open Knowledge Graphs with Ternary Contrastive Learning
Authors:Qian Li, Shafiq Joty, Daling Wang, Shi Feng, Yifei Zhang
Sparsity of formal knowledge and roughness of non-ontological construction make sparsity problem particularly prominent in Open Knowledge Graphs (OpenKGs). Due to sparse links, learning effective representation for few-shot entities becomes difficult. We hypothesize that by introducing negative samples, a contrastive learning (CL) formulation could be beneficial in such scenarios. However, existing CL methods model KG triplets as binary objects of entities ignoring the relation-guided ternary propagation patterns and they are too generic, i.e., they ignore zero-shot, few-shot and synonymity problems that appear in OpenKGs. To address this, we propose TernaryCL, a CL framework based on ternary propagation patterns among head, relation and tail. TernaryCL designs Contrastive Entity and Contrastive Relation to mine ternary discriminative features with both negative entities and relations, introduces Contrastive Self to help zero- and few-shot entities learn discriminative features, Contrastive Synonym to model synonymous entities, and Contrastive Fusion to aggregate graph features from multiple paths. Extensive experiments on benchmarks demonstrate the superiority of TernaryCL over state-of-the-art models.
PDF
点此查看论文截图
miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings
Authors:Tassilo Klein, Moin Nabi
This paper presents miCSE, a mutual information-based Contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding. The proposed approach imposes alignment between the attention pattern of different views during contrastive learning. Learning sentence embeddings with miCSE entails enforcing the syntactic consistency across augmented views for every single sentence, making contrastive self-supervised learning more sample efficient. As a result, the proposed approach shows strong performance in the few-shot learning domain. While it achieves superior results compared to state-of-the-art methods on multiple benchmarks in few-shot learning, it is comparable in the full-shot scenario. The proposed approach is conceptually simple, easy to implement and optimize, yet empirically powerful. This study opens up avenues for efficient self-supervised learning methods that are more robust than current contrastive methods for sentence embedding.
PDF
点此查看论文截图
Few-shot Image Generation with Diffusion Models
Authors:Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
Denoising diffusion probabilistic models (DDPMs) have been proven capable of synthesizing high-quality images with remarkable diversity when trained on large amounts of data. However, to our knowledge, few-shot image generation tasks have yet to be studied with DDPM-based approaches. Modern approaches are mainly built on Generative Adversarial Networks (GANs) and adapt models pre-trained on large source domains to target domains using a few available samples. In this paper, we make the first attempt to study when do DDPMs overfit and suffer severe diversity degradation as training data become scarce. Then we propose to adapt DDPMs pre-trained on large source domains to target domains using limited data. Our results show that utilizing knowledge from pre-trained DDPMs can significantly accelerate convergence and improve the quality and diversity of the generated images. Moreover, we propose a DDPM-based pairwise similarity loss to preserve the relative distances between generated samples during domain adaptation. In this way, we further improve the generation diversity of the proposed DDPM-based approaches. We demonstrate the effectiveness of our approaches qualitatively and quantitatively on a series of few-shot image generation tasks and achieve results better than current state-of-the-art GAN-based approaches in quality and diversity.
PDF
点此查看论文截图
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
Authors:Andreas Fürst, Elisabeth Rumetshofer, Johannes Lehner, Viet Tran, Fei Tang, Hubert Ramsauer, David Kreil, Michael Kopp, Günter Klambauer, Angela Bitto-Nemling, Sepp Hochreiter
CLIP yielded impressive results on zero-shot transfer learning tasks and is considered as a foundation model like BERT or GPT3. CLIP vision models that have a rich representation are pre-trained using the InfoNCE objective and natural language supervision before they are fine-tuned on particular tasks. Though CLIP excels at zero-shot transfer learning, it suffers from an explaining away problem, that is, it focuses on one or few features, while neglecting other relevant features. This problem is caused by insufficiently extracting the covariance structure in the original multi-modal data. We suggest to use modern Hopfield networks to tackle the problem of explaining away. Their retrieved embeddings have an enriched covariance structure derived from co-occurrences of features in the stored embeddings. However, modern Hopfield networks increase the saturation effect of the InfoNCE objective which hampers learning. We propose to use the InfoLOOB objective to mitigate this saturation effect. We introduce the novel “Contrastive Leave One Out Boost” (CLOOB), which uses modern Hopfield networks for covariance enrichment together with the InfoLOOB objective. In experiments we compare CLOOB to CLIP after pre-training on the Conceptual Captions and the YFCC dataset with respect to their zero-shot transfer learning performance on other datasets. CLOOB consistently outperforms CLIP at zero-shot transfer learning across all considered architectures and datasets.
PDF Published at NeurIPS 2022; Blog: https://ml-jku.github.io/cloob; GitHub: https://github.com/ml-jku/cloob
点此查看论文截图
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
Authors:Mo Yu, Yisi Sang, Kangsheng Pu, Zekai Wei, Han Wang, Jing Li, Yue Yu, Jie Zhou
When reading a story, humans can rapidly understand new fictional characters with a few observations, mainly by drawing analogy to fictional and real people they met before in their lives. This reflects the few-shot and meta-learning essence of humans’ inference of characters’ mental states, i.e., humans’ theory-of-mind (ToM), which is largely ignored in existing research. We fill this gap with a novel NLP benchmark, TOM-IN-AMC, the first assessment of models’ ability of meta-learning of ToM in a realistic narrative understanding scenario. Our benchmark consists of $\sim$1,000 parsed movie scripts for this purpose, each corresponding to a few-shot character understanding task; and requires models to mimic humans’ ability of fast digesting characters with a few starting scenes in a new movie. Our human study verified that humans can solve our problem by inferring characters’ mental states based on their previously seen movies; while the state-of-the-art metric-learning and meta-learning approaches adapted to our task lags 30% behind.
PDF
点此查看论文截图
Prompt-Based Metric Learning for Few-Shot NER
Authors:Yanru Chen, Yanan Zheng, Zhilin Yang
Few-shot named entity recognition (NER) targets generalizing to unseen labels and/or domains with few labeled examples. Existing metric learning methods compute token-level similarities between query and support sets, but are not able to fully incorporate label semantics into modeling. To address this issue, we propose a simple method to largely improve metric learning for NER: 1) multiple prompt schemas are designed to enhance label semantics; 2) we propose a novel architecture to effectively combine multiple prompt-based representations. Empirically, our method achieves new state-of-the-art (SOTA) results under 16 of the 18 considered settings, substantially outperforming the previous SOTA by an average of 8.84% and a maximum of 34.51% in relative gains of micro F1. Our code is available at https://github.com/AChen-qaq/ProML.
PDF