2022-10-21 更新
QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation
Authors:Zhenrui Yue, Huimin Zeng, Bernhard Kratzwald, Stefan Feuerriegel, Dong Wang
Question answering (QA) has recently shown impressive results for answering questions from customized domains. Yet, a common challenge is to adapt QA models to an unseen target domain. In this paper, we propose a novel self-supervised framework called QADA for QA domain adaptation. QADA introduces a novel data augmentation pipeline used to augment training QA samples. Different from existing methods, we enrich the samples via hidden space augmentation. For questions, we introduce multi-hop synonyms and sample augmented token embeddings with Dirichlet distributions. For contexts, we develop an augmentation method which learns to drop context spans via a custom attentive sampling strategy. Additionally, contrastive learning is integrated in the proposed self-supervised adaptation framework QADA. Unlike existing approaches, we generate pseudo labels and propose to train the model via a novel attention-based contrastive adaptation method. The attention weights are used to build informative features for discrepancy estimation that helps the QA model separate answers and generalize across source and target domains. To the best of our knowledge, our work is the first to leverage hidden space augmentation and attention-based contrastive adaptation for self-supervised domain adaptation in QA. Our evaluation shows that QADA achieves considerable improvements on multiple target datasets over state-of-the-art baselines in QA domain adaptation.
PDF Accepted to EMNLP 2022
点此查看论文截图
Using Language to Extend to Unseen Domains
Authors:Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell, Joseph E. Gonzalez, Aditi Raghunathan, Anja Rohrbach
It is expensive to collect training data for every possible domain that a vision model may encounter when deployed. We instead consider how simply verbalizing the training domain (e.g. “photos of birds”) as well as domains we want to extend to but do not have data for (e.g. “paintings of birds”) can improve robustness. Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain, while preserving task relevant information. Without using any images from the unseen test domain, we show that over the extended domain containing both training and unseen test domains, LADS outperforms standard fine-tuning and ensemble approaches over a suite of four benchmarks targeting domain adaptation and dataset bias
PDF
点此查看论文截图
Creating a Forensic Database of Shoeprints from Online Shoe Tread Photos
Authors:Samia Shafique, Bailey Kong, Shu Kong, Charless C. Fowlkes
Shoe tread impressions are one of the most common types of evidence left at crime scenes. However, the utility of such evidence is limited by the lack of databases of footwear prints that cover the large and growing number of distinct shoe models. Moreover, the database is preferred to contain the 3D shape, or depth, of shoe-tread photos so as to allow for extracting shoeprints to match a query (crime-scene) print. We propose to address this gap by leveraging shoe-tread photos collected by online retailers. The core challenge is to predict depth maps for these photos. As they do not have ground-truth 3D shapes allowing for training depth predictors, we exploit synthetic data that does. We develop a method termed ShoeRinsics that learns to predict depth by leveraging a mix of fully supervised synthetic data and unsupervised retail image data. In particular, we find domain adaptation and intrinsic image decomposition techniques effectively mitigate the synthetic-real domain gap and yield significantly better depth prediction. To validate our method, we introduce 2 validation sets consisting of shoe-tread image and print pairs and define a benchmarking protocol to quantify the quality of predicted depth. On this benchmark, ShoeRinsics outperforms existing methods of depth prediction and synthetic-to-real domain adaptation.
PDF published in WACV 2023; 8 pages including 11 figures and 3 tables; contains reference and appendix
点此查看论文截图
RAIS: Robust and Accurate Interactive Segmentation via Continual Learning
Authors:Yuying Hao, Yi Liu, Juncai Peng, Haoyi Xiong, Guowei Chen, Shiyu Tang, Zeyu Chen, Baohua Lai
Interactive image segmentation aims at segmenting a target region through a way of human-computer interaction. Recent works based on deep learning have achieved excellent performance, while most of them focus on improving the accuracy of the training set and ignore potential improvement on the test set. In the inference phase, they tend to have a good performance on similar domains to the training set, and lack adaptability to domain shift, so they require more user efforts to obtain satisfactory results. In this work, we propose RAIS, a robust and accurate architecture for interactive segmentation with continuous learning, where the model can learn from both train and test data sets. For efficient learning on the test set, we propose a novel optimization strategy to update global and local parameters with a basic segmentation module and adaptation module, respectively. Moreover, we perform extensive experiments on several benchmarks that show our method can handle data distribution shifts and achieves SOTA performance compared with recent interactive segmentation methods. Besides, our method also shows its robustness in the datasets of remote sensing and medical imaging where the data domains are completely different between training and testing.
PDF 8 pages
点此查看论文截图
ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler
Authors:Jiaxin Zhang, Yashar Moshfeghi
Numerical reasoning over text is a challenging task of Artificial Intelligence (AI), requiring reading comprehension and numerical reasoning abilities. Previous approaches use numerical reasoning programs to represent the reasoning process. However, most works do not separate the generation of operators and operands, which are key components of a numerical reasoning program, thus limiting their ability to generate such programs for complicated tasks. In this paper, we introduce the numEricaL reASoning with adapTive symbolIc Compiler (ELASTIC) model, which is constituted of the RoBERTa as the Encoder and a Compiler with four modules: Reasoning Manager, Operator Generator, Operands Generator, and Memory Register. ELASTIC is robust when conducting complicated reasoning. Also, it is domain agnostic by supporting the expansion of diverse operators without caring about the number of operands it contains. Experiments show that ELASTIC achieves 68.96 and 65.21 of execution accuracy and program accuracy on the FinQA dataset and 83.00 program accuracy on the MathQA dataset, outperforming previous state-of-the-art models significantly.
PDF Accepted to NeurIPS 2022
点此查看论文截图
CiteSum: Citation Text-guided Scientific Extreme Summarization and Domain Adaptation with Limited Supervision
Authors:Yuning Mao, Ming Zhong, Jiawei Han
Scientific extreme summarization (TLDR) aims to form ultra-short summaries of scientific papers. Previous efforts on curating scientific TLDR datasets failed to scale up due to the heavy human annotation and domain expertise required. In this paper, we propose a simple yet effective approach to automatically extracting TLDR summaries for scientific papers from their citation texts. Based on the proposed approach, we create a new benchmark CiteSum without human annotation, which is around 30 times larger than the previous human-curated dataset SciTLDR. We conduct a comprehensive analysis of CiteSum, examining its data characteristics and establishing strong baselines. We further demonstrate the usefulness of CiteSum by adapting models pre-trained on CiteSum (named CITES) to new tasks and domains with limited supervision. For scientific extreme summarization, CITES outperforms most fully-supervised methods on SciTLDR without any fine-tuning and obtains state-of-the-art results with only 128 examples. For news extreme summarization, CITES achieves significant gains on XSum over its base model (not pre-trained on CiteSum), e.g., +7.2 ROUGE-1 zero-shot performance and state-of-the-art few-shot performance. For news headline generation, CITES performs the best among unsupervised and zero-shot methods on Gigaword. Our dataset and code can be found at https://github.com/morningmoni/CiteSum.
PDF EMNLP 2022. TLDR: By pretraining on (automatically extracted) citation sentences in scientific papers, we achieve SOTA on SciTLDR, XSum, and Gigaword in zero-shot and (or) few-shot settings
点此查看论文截图
Few-shot Transferable Robust Representation Learning via Bilevel Attacks
Authors:Minseon Kim, Hyeonjeong Ha, Sung Ju Hwang
Existing adversarial learning methods for enhancing the robustness of deep neural networks assume the availability of a large amount of data from which we can generate adversarial examples. However, in an adversarial meta-learning setting, the model needs to train with only a few adversarial examples to learn a robust model for unseen tasks, which is a very difficult goal to achieve. Further, learning transferable robust representations for unseen domains is a difficult problem even with a large amount of data. To tackle such a challenge, we propose a novel adversarial self-supervised meta-learning framework with bilevel attacks which aims to learn robust representations that can generalize across tasks and domains. Specifically, in the inner loop, we update the parameters of the given encoder by taking inner gradient steps using two different sets of augmented samples, and generate adversarial examples for each view by maximizing the instance classification loss. Then, in the outer loop, we meta-learn the encoder parameter to maximize the agreement between the two adversarial examples, which enables it to learn robust representations. We experimentally validate the effectiveness of our approach on unseen domain adaptation tasks, on which it achieves impressive performance. Specifically, our method significantly outperforms the state-of-the-art meta-adversarial learning methods on few-shot learning tasks, as well as self-supervised learning baselines in standard learning settings with large-scale datasets.
PDF *Equal contribution. Author ordering determined by coin flip
点此查看论文截图
Dialogue-adaptive Language Model Pre-training From Quality Estimation
Authors:Junlong Li, Zhuosheng Zhang, Hai Zhao
Pre-trained language models (PrLMs) have achieved great success on a wide range of natural language processing tasks by virtue of the universal language representation ability obtained by self-supervised learning on a large corpus. These models are pre-trained on standard plain texts with general language model (LM) training objectives, which would be insufficient to model dialogue-exclusive attributes like specificity and informativeness reflected in these tasks that are not explicitly captured by the pre-trained universal language representations. In this work, we propose dialogue-adaptive pre-training objectives (DAPO) derived from quality estimation to simulate dialogue-specific features, namely coherence, specificity, and informativeness. As the foundation for model pre-training, we synthesize a new dialogue corpus and build our training set with two unsupervised methods: 1) coherence-oriented context corruption, including utterance ordering, insertion, and replacement, to help the model capture the coherence inside the dialogue contexts; and 2) specificity-oriented automatic rescoring, which encourages the model to measure the quality of the synthesized data for dialogue-adaptive pre-training by considering specificity and informativeness. Experimental results on widely used open-domain response selection and quality estimation benchmarks show that DAPO significantly improves the baseline models and achieves state-of-the-art performance on the MuTual leaderboard, verifying the effectiveness of estimating quality evaluation factors into pre-training.
PDF Accepted by Neurocomputing
点此查看论文截图
OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping
Authors:Junshi Xia, Naoto Yokoya, Bruno Adriano, Clifford Broni-Bediako
We introduce OpenEarthMap, a benchmark dataset, for global high-resolution land cover mapping. OpenEarthMap consists of 2.2 million segments of 5000 aerial and satellite images covering 97 regions from 44 countries across 6 continents, with manually annotated 8-class land cover labels at a 0.25—0.5m ground sampling distance. Semantic segmentation models trained on the OpenEarthMap generalize worldwide and can be used as off-the-shelf models in a variety of applications. We evaluate the performance of state-of-the-art methods for unsupervised domain adaptation and present challenging problem settings suitable for further technical development. We also investigate lightweight models using automated neural architecture search for limited computational resources and fast mapping. The dataset is available at https://open-earth-map.org.
PDF Accepted by WACV 2023
点此查看论文截图
Knowledge-based and Data-driven Reasoning and Learning for Ad Hoc Teamwork
Authors:Hasra Dodampegama, Mohan Sridharan
We present an architecture for ad hoc teamwork, which refers to collaboration in a team of agents without prior coordination. State of the art methods for this problem often include a data-driven component that uses a long history of prior observations to model the behaviour of other agents (or agent types) and to determine the ad hoc agent’s behaviour. In many practical domains, it is challenging to find large training datasets, and necessary to understand and incrementally extend the existing models to account for changes in team composition or domain attributes. Our architecture combines the principles of knowledge-based and data-driven reasoning and learning. Specifically, we enable an ad hoc agent to perform non-monotonic logical reasoning with prior commonsense domain knowledge and incrementally-updated simple predictive models of other agents’ behaviour. We use the benchmark simulated multi-agent collaboration domain Fort Attack to demonstrate that our architecture supports adaptation to unforeseen changes, incremental learning and revision of models of other agents’ behaviour from limited samples, transparency in the ad hoc agent’s decision making, and better performance than a data-driven baseline.
PDF Presented at the AI-HRI Symposium at AAAI Fall Symposium Series(FSS), 2022 (arXiv:cs/2209.14292)