Few-Shot


Updated 2023-05-16

A transformer-based method for zero and few-shot biomedical named entity recognition

Authors: Miloš Košprdić, Nikola Prodanović, Adela Ljajić, Bojana Bašaragin, Nikola Milošević

Supervised named entity recognition (NER) in the biomedical domain depends on large sets of texts annotated with the given named entities, whose creation can be time-consuming and expensive. Furthermore, extracting new entities often requires additional annotation tasks and retraining the model. To address these challenges, this paper proposes a transformer-based method for zero- and few-shot NER in the biomedical domain. The method transforms the task of multi-class token classification into binary token classification (a token either contains the searched entity or does not) and pre-trains on a large number of datasets and biomedical entities, from which the method can learn semantic relations between the given and potential classes. With a fine-tuned PubMedBERT model, we achieve average F1 scores of 35.44% for zero-shot NER, 50.10% for one-shot NER, 69.94% for 10-shot NER, and 79.51% for 100-shot NER on 9 diverse evaluated biomedical entities. The results demonstrate the effectiveness of the proposed method for recognizing new entities from limited examples, with results comparable to or better than state-of-the-art zero- and few-shot NER methods.
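The key move in this abstract is recasting multi-class NER as binary token classification conditioned on the name of the searched entity. Below is a minimal sketch of that transformation; the pairing scheme, tag names, and example sentence are illustrative assumptions rather than the authors' exact preprocessing:

```python
# Minimal sketch: recast multi-class NER as binary token classification,
# conditioned on the name of the searched entity. Illustrative only; not
# the authors' exact preprocessing.

def to_binary_examples(tokens, tags, entity_classes):
    """tokens: list[str]; tags: list[str] such as 'O', 'Drug', 'Symptom'.
    Yields one (query, tokens, binary_labels) example per entity class,
    so one multi-class sentence becomes several binary ones."""
    for entity in entity_classes:
        labels = [1 if tag == entity else 0 for tag in tags]
        # The entity name acts as a textual query, letting the model learn
        # semantic relations between class names and tokens.
        yield entity, tokens, labels

tokens = ["Aspirin", "reduces", "fever", "."]
tags = ["Drug", "O", "Symptom", "O"]
for query, toks, labels in to_binary_examples(tokens, tags, ["Drug", "Symptom"]):
    print(query, list(zip(toks, labels)))
# Drug [('Aspirin', 1), ('reduces', 0), ('fever', 0), ('.', 0)]
# Symptom [('Aspirin', 0), ('reduces', 0), ('fever', 1), ('.', 0)]
```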
PDF Collaboration between Bayer Pharma R&D and Serbian Institute for Artificial Intelligence Research and Development


Evaluating Open-Domain Question Answering in the Era of Large Language Models

Authors: Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei

Lexical matching remains the de facto evaluation method for open-domain question answering (QA). Unfortunately, lexical matching fails completely when a plausible candidate answer does not appear in the list of gold answers, which is increasingly the case as we shift from extractive to generative models. The recent success of large language models (LLMs) for QA aggravates lexical matching failures since candidate answers become longer, thereby making matching with the gold answers even more challenging. Without accurate evaluation, the true progress in open-domain QA remains unknown. In this paper, we conduct a thorough analysis of various open-domain QA models, including LLMs, by manually evaluating their answers on a subset of NQ-open, a popular benchmark. Our assessments reveal that while the true performance of all models is significantly underestimated, the performance of the InstructGPT (zero-shot) LLM increases by nearly +60%, making it on par with existing top models, and the InstructGPT (few-shot) model actually achieves a new state-of-the-art on NQ-open. We also find that more than 50% of lexical matching failures are attributed to semantically equivalent answers. We further demonstrate that regex matching ranks QA models consistently with human judgments, although it still suffers from unnecessary strictness. Finally, we demonstrate that automated evaluation models are a reasonable surrogate for lexical matching in some circumstances, but not for long-form answers generated by LLMs. The automated models struggle to detect hallucinations in LLM answers and are thus unable to evaluate LLMs. At this time, there appears to be no substitute for human evaluation.
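As a concrete illustration of the metrics under discussion, here is a minimal sketch of lexical exact match versus regex matching; the normalization steps are our assumption, not necessarily the paper's released evaluation scripts:

```python
# Sketch of the two automatic metrics discussed above: strict lexical exact
# match and looser regex matching. The normalization rules here are our
# assumption; the paper's released scripts may differ.
import re
import string

def normalize(text):
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return " ".join(text.split())

def exact_match(candidate, gold_answers):
    return any(normalize(candidate) == normalize(g) for g in gold_answers)

def regex_match(candidate, gold_answers):
    # Accepts a long-form answer if any gold answer occurs in it as a phrase.
    return any(re.search(r"\b" + re.escape(normalize(g)) + r"\b",
                         normalize(candidate))
               for g in gold_answers)

print(exact_match("Ottawa", ["Ottawa"]))                  # True
print(exact_match("The capital is Ottawa.", ["Ottawa"]))  # False: EM too strict
print(regex_match("The capital is Ottawa.", ["Ottawa"]))  # True
```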
PDF ACL 2023; code and data released at https://github.com/ehsk/OpenQA-eval


Quaternion-valued Correlation Learning for Few-Shot Semantic Segmentation

Authors: Zewen Zheng, Guoheng Huang, Xiaochen Yuan, Chi-Man Pun, Hongrui Liu, Wing-Kuen Ling

Few-shot segmentation (FSS) aims to segment unseen classes given only a few annotated samples. Encouraging progress has been made in FSS by leveraging semantic features learned from base classes with sufficient training samples to represent novel classes. Correlation-based methods, however, cannot capture the interaction between the two subspace matching scores due to the inherent nature of real-valued 2D convolutions. In this paper, we introduce a quaternion perspective on correlation learning and propose a novel Quaternion-valued Correlation Learning Network (QCLNet), aiming to alleviate the computational burden of high-dimensional correlation tensors and to explore the internal latent interaction between query and support images by leveraging operations defined by the established quaternion algebra. Specifically, QCLNet is formulated as a hyper-complex valued network that represents correlation tensors in the quaternion domain, using quaternion-valued convolution to explore the external relations of the query subspace while considering the hidden relationship of the support sub-dimension in the quaternion space. Extensive experiments on the PASCAL-5i and COCO-20i datasets demonstrate that our method outperforms existing state-of-the-art methods. Our code is available at https://github.com/zwzheng98/QCLNet
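The quaternion algebra the abstract leans on can be made concrete with the Hamilton product, which is what lets the four sub-dimensions of a quaternion-valued convolution interact. A minimal sketch follows, using standard quaternion algebra rather than QCLNet's actual layer:

```python
# Sketch of the Hamilton product underlying quaternion-valued convolution:
# each quaternion feature has four real components, and one quaternion
# multiply mixes all four. Standard quaternion algebra, not QCLNet's layer.
import numpy as np

def hamilton_product(q, p):
    """q, p: arrays of shape (..., 4) holding (r, i, j, k) components."""
    r1, i1, j1, k1 = np.moveaxis(q, -1, 0)
    r2, i2, j2, k2 = np.moveaxis(p, -1, 0)
    return np.stack([
        r1*r2 - i1*i2 - j1*j2 - k1*k2,  # real part
        r1*i2 + i1*r2 + j1*k2 - k1*j2,  # i
        r1*j2 - i1*k2 + j1*r2 + k1*i2,  # j
        r1*k2 + i1*j2 - j1*i2 + k1*r2,  # k
    ], axis=-1)

# In a quaternion conv layer, weights and activations are quaternions, so the
# four sub-dimensions interact rather than being convolved independently as
# in real-valued 2D convolution.
q = np.array([1.0, 0.0, 1.0, 0.0])
p = np.array([0.5, 0.5, 0.0, 0.0])
print(hamilton_product(q, p))  # [ 0.5  0.5  0.5 -0.5]
```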
PDF for associated paper file, see https://ieeexplore.ieee.org/document/9954424?source=authoralert


Answering Complex Questions over Text by Hybrid Question Parsing and Execution

Authors: Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou

The dominant paradigm for textual question answering systems is end-to-end neural networks, which excel at answering natural language questions but fall short on complex ones. This stands in contrast to the broad adoption of semantic parsing approaches over structured data sources (e.g., relational databases, knowledge graphs), which convert natural language questions to logical forms and execute them with query engines. Towards combining the strengths of neural and symbolic methods, we propose a framework of question parsing and execution for textual QA. It comprises two central pillars: (1) We parse questions of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates deterministic rules for translating the symbolic operations with a drop-in neural reader network that answers each decomposed simple question. Hence, the proposed framework can be viewed as top-down question parsing followed by bottom-up answer backtracking. The resulting H-expressions closely guide the execution process, offering higher precision and better interpretability while still preserving the advantages of neural readers for resolving the primitive elements. Our extensive experiments on MuSiQue, 2WikiQA, HotpotQA, and NQ show that the proposed parsing and hybrid execution framework outperforms existing approaches in supervised, few-shot, and zero-shot settings, while also effectively exposing its underlying reasoning process.
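To make the H-expression idea concrete, here is a toy executor in the spirit of the abstract: primitives are simple questions handed to a neural reader, and symbolic operations are executed by deterministic rules. The single JOIN operation and the reader stub are our illustrative assumptions, not the paper's operator set:

```python
# Toy executor for an H-expression-like representation: primitives are
# simple questions answered by a neural reader; symbolic operations are
# deterministic rules. JOIN and the reader stub are illustrative only.

def fake_reader(question, context):
    # Stand-in for a drop-in neural reader network.
    return {"Who founded Acme?": "Ada",
            "Where was Ada born?": "Paris"}.get(question, "?")

def execute(h_expr, context):
    """h_expr is either a question string (primitive) or a tuple
    (op, left, right); execution proceeds bottom-up."""
    if isinstance(h_expr, str):
        return fake_reader(h_expr, context)
    op, left, right = h_expr
    if op == "JOIN":
        # Answer the left sub-question, substitute it into the right one.
        bridge = execute(left, context)
        return execute(right.replace("#1", bridge), context)
    raise ValueError(f"unknown operation: {op}")

# "Where was the founder of Acme born?" decomposed top-down:
h = ("JOIN", "Who founded Acme?", "Where was #1 born?")
print(execute(h, context=""))  # Paris
```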
PDF


Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives

Authors: Qiushi Sun, Chengcheng Han, Nuo Chen, Renyu Zhu, Jingyang Gong, Xiang Li, Ming Gao

Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks. However, tuning these models for downstream tasks usually incurs exorbitant costs or is unavailable due to commercial considerations. Recently, black-box tuning has been proposed to address this problem by optimizing task-specific prompts without accessing gradients or hidden representations. However, most existing works have not yet fully exploited the potential of gradient-free optimization in the few-shot learning scenario. In this paper, we describe BBT-RGB, a suite of straightforward and complementary techniques for enhancing the efficiency and performance of black-box optimization. Specifically, our method includes three plug-and-play components: (1) a two-stage derivative-free optimization strategy that facilitates fast convergence and mitigates overfitting; (2) automatic verbalizer construction and its novel usage under few-shot settings; (3) a better prompt initialization policy based on instruction search and auto-selected demonstrations. Extensive experiments across various natural language understanding and inference tasks demonstrate the effectiveness of our method. Our codes are publicly available at https://github.com/QiushiSun/BBT-RGB.
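The black-box setting the abstract describes can be illustrated with a bare-bones gradient-free loop: a continuous prompt vector is tuned against a score returned by an opaque model. Plain random search below stands in for BBT-RGB's actual two-stage derivative-free optimizer:

```python
# Sketch of the gradient-free setting BBT-RGB operates in: a continuous
# prompt vector is tuned against a black-box score (e.g., dev-set accuracy
# from an API) with no gradients available. Plain random search stands in
# for the paper's two-stage derivative-free optimizer.
import numpy as np

rng = np.random.default_rng(0)

def black_box_score(prompt_vec):
    # Stand-in for querying the frozen LLM: unknown, non-differentiable.
    target = np.linspace(-1, 1, prompt_vec.size)
    return -np.sum((prompt_vec - target) ** 2)

dim = 16                       # low intrinsic dimension of the prompt
best = rng.normal(size=dim)
best_score = black_box_score(best)
for step in range(500):
    cand = best + 0.1 * rng.normal(size=dim)  # local perturbation
    score = black_box_score(cand)
    if score > best_score:                    # keep only improvements
        best, best_score = cand, score
print(round(best_score, 4))
```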
PDF Work in progress


Meta-DM: Applications of Diffusion Models on Few-Shot Learning

Authors: Wentao Hu, Xiurong Jiang, Jiarun Liu, Yuqi Yang, Hui Tian

In the field of few-shot learning (FSL), extensive research has focused on improving network structures and training strategies. However, the role of data processing modules has not been fully explored. Therefore, in this paper, we propose Meta-DM, a generalized data processing module for FSL problems based on diffusion models. Meta-DM is a simple yet effective module that can be easily integrated with existing FSL methods, leading to significant performance improvements in both supervised and unsupervised settings. We provide a theoretical analysis of Meta-DM and evaluate its performance on several algorithms. Our experiments show that combining Meta-DM with certain methods achieves state-of-the-art results.
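As a sketch of where such a data processing module sits in an FSL pipeline, the snippet below expands a few-shot support set with synthetic variants before the learner runs; the diffusion sampler is a stub, and Meta-DM's actual conditioning and filtering are not reproduced:

```python
# Sketch of Meta-DM's role as a data processing module: the few-shot support
# set is expanded with synthetic samples before the FSL method runs. The
# diffusion sampler is a stub; Meta-DM's real conditioning and filtering are
# not reproduced here.
import numpy as np

rng = np.random.default_rng(0)

def diffusion_sample(image):
    # Stand-in for a diffusion model producing a variant of `image`.
    return image + 0.05 * rng.normal(size=image.shape)

def augment_support_set(support, n_extra_per_image=3):
    """support: list of (image, label). Returns the augmented episode."""
    augmented = list(support)
    for image, label in support:
        augmented += [(diffusion_sample(image), label)
                      for _ in range(n_extra_per_image)]
    return augmented

episode = [(np.zeros((8, 8)), 0), (np.ones((8, 8)), 1)]  # 1-shot, 2-way
print(len(augment_support_set(episode)))  # 8
```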
PDF


STORYWARS: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation

Authors: Yulun Du, Lydia Chilton

Collaborative stories, which are texts created through the collaborative efforts of multiple authors with different writing styles and intentions, pose unique challenges for NLP models. Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce STORYWARS, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. We design 12 task types on STORYWARS, comprising 7 understanding and 5 generation task types, deriving 101 diverse story-related tasks in total as a multi-task benchmark covering fully-supervised, few-shot, and zero-shot scenarios. Furthermore, we present our instruction-tuned model, INSTRUCTSTORY, for the story tasks, showing that instruction tuning, in addition to achieving superior results in zero-shot and few-shot scenarios, can also obtain the best performance on the fully-supervised tasks, establishing strong multi-task benchmark performance on STORYWARS.
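To illustrate what instruction tuning on story tasks might look like, here is a hypothetical conversion of one collaborative-story continuation task into an instruction example; the template wording is our assumption, not the dataset's released format:

```python
# Illustrative conversion of a STORYWARS-style task into an
# instruction-tuning example. The template wording is our assumption,
# not the dataset's released format.

def to_instruction_example(task_instruction, story_turns, target):
    prompt = (f"Instruction: {task_instruction}\n"
              + "\n".join(f"Turn {i+1}: {t}" for i, t in enumerate(story_turns))
              + "\nResponse:")
    return {"input": prompt, "output": target}

ex = to_instruction_example(
    "Write the next turn of this collaborative story.",
    ["A door creaked open in the empty library.",
     "Mira stepped inside, flashlight flickering."],
    "The shelves rearranged themselves behind her.")
print(ex["input"])
```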
PDF ACL 2023


PLIP: Language-Image Pre-training for Person Representation Learning

Authors: Jialong Zuo, Changqian Yu, Nong Sang, Changxin Gao

Pre-training has emerged as an effective technique for learning powerful person representations. Most existing methods have shown that pre-training on large-scale pure-vision datasets like ImageNet and LUPerson achieves remarkable performance. However, relying solely on visual information, without robust explicit indicators, makes it challenging for these methods to learn discriminative person representations. Drawing inspiration from the intrinsic fine-grained attribute indicators in person descriptions, we explore introducing the language modality into person representation learning. To this end, we propose a novel language-image pre-training framework for person representation learning, termed PLIP. To explicitly build fine-grained cross-modal associations, we specifically design three pretext tasks, i.e., semantic-fused image colorization, visual-fused attribute prediction, and vision-language matching. In addition, due to the lack of an appropriate dataset, we present a large-scale person dataset named SYNTH-PEDES, where the Stylish Pedestrian Attributes-union Captioning method is proposed to synthesize diverse textual descriptions. We pre-train PLIP on SYNTH-PEDES and evaluate our model across downstream tasks such as text-based Re-ID, image-based Re-ID, and person attribute recognition. Extensive experiments demonstrate that our model not only significantly improves over existing methods on all these tasks, but also shows great ability in few-shot and domain generalization settings. The code, dataset, and weights will be released at https://github.com/Zplusdragon/PLIP
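Of the three pretext tasks, vision-language matching is the most standard; a CLIP-style symmetric contrastive loss of the kind it suggests is sketched below. PLIP's exact formulation may differ:

```python
# Sketch of a vision-language matching objective of the kind PLIP's third
# pretext task describes: a symmetric contrastive loss over matched person
# images and descriptions (CLIP-style; PLIP's exact loss may differ).
import numpy as np

def matching_loss(img_emb, txt_emb, temperature=0.07):
    """img_emb, txt_emb: (batch, dim); row i of each is a matched pair."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature         # (batch, batch) similarities
    labels = np.arange(len(logits))            # diagonal pairs match
    log_p_i2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_t2i = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    i2t = -log_p_i2t[labels, labels].mean()    # image -> text direction
    t2i = -log_p_t2i[labels, labels].mean()    # text -> image direction
    return (i2t + t2i) / 2

rng = np.random.default_rng(0)
print(matching_loss(rng.normal(size=(4, 32)), rng.normal(size=(4, 32))))
```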
PDF


FeatFSDA: Towards Few-shot Domain Adaptation for Video-based Activity Recognition

Authors: Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

Domain adaptation is essential for activity recognition, as common spatiotemporal architectures risk overfitting due to the increased number of parameters arising from the temporal dimension. Unsupervised domain adaptation methods have been extensively studied, yet they require large-scale unlabeled data from the target domain. In this work, we address few-shot domain adaptation for video-based activity recognition (FSDA-AR), which leverages a very small amount of labeled target videos to achieve effective adaptation. This setting is attractive and promising for applications, as it requires recording and labeling only a few examples, or even a single example, per class in the target domain, which often includes activities that are rare yet crucial to recognize. We construct FSDA-AR benchmarks using five established datasets: UCF101, HMDB51, EPIC-KITCHEN, Sims4Action, and Toyota Smart Home. Our results demonstrate that FSDA-AR performs comparably to unsupervised domain adaptation with significantly fewer (yet labeled) target examples. We further propose a novel approach, FeatFSDA, to better leverage the few labeled target-domain samples as knowledge guidance. FeatFSDA incorporates a latent space semantic adjacency loss, a domain prototypical similarity loss, and a graph-attentive-network-based edge dropout technique. Our approach achieves state-of-the-art performance on all datasets within our FSDA-AR benchmark. To encourage future research on few-shot domain adaptation for video-based activity recognition, we will release our benchmarks and code at https://github.com/KPeng9510/FeatFSDA.
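Of the three components, the domain prototypical similarity loss is the easiest to sketch: compute per-class prototypes in each domain and pull matching classes together. The distance and weighting below are our assumptions, not the paper's exact loss:

```python
# Sketch of a domain prototypical similarity loss in the spirit of FeatFSDA:
# class prototypes are computed in each domain, and same-class prototypes are
# pulled together across domains. Distance choice and weighting are assumed.
import numpy as np

def prototypes(features, labels, n_classes):
    """features: (n, dim); labels: (n,). Mean feature per class."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(n_classes)])

def proto_similarity_loss(src_feat, src_lab, tgt_feat, tgt_lab, n_classes):
    p_src = prototypes(src_feat, src_lab, n_classes)
    p_tgt = prototypes(tgt_feat, tgt_lab, n_classes)
    # Mean squared distance between same-class prototypes across domains.
    return np.mean(np.sum((p_src - p_tgt) ** 2, axis=1))

rng = np.random.default_rng(0)
src = rng.normal(size=(40, 16)); src_y = np.tile(np.arange(4), 10)
tgt = rng.normal(size=(8, 16));  tgt_y = np.repeat(np.arange(4), 2)  # few-shot
print(proto_similarity_loss(src, src_y, tgt, tgt_y, n_classes=4))
```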
PDF Benchmarks and code will be released at https://github.com/KPeng9510/FeatFSDA


Author: 木子已
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when republishing!