Domain Adaptation


2022-11-12 更新

The Enforced Transfer: An Instance-Based Divide-and-Conquer Unsupervised Domain Adaptation Algorithm

Authors:Ye Gao, Brian Baucom, Karen Rose, Kristina Gordon, Hongning Wang, John Stankovic

Existing Domain Adaptation (DA) algorithms train target models to classify all samples in the target domain, but it fails to recognize the possibility that, within the target domain, some samples are closer to the source domain and thus should be classified by source domain models. In this paper, we develop a novel unsupervised DA algorithm, the Enforced Transfer, which employs an out-of-distribution detection algorithm to decide which model (i.e., source domain or target domain) to apply on the testing instance, i.e., divide-and-conquer. Instead of choosing the models at the instance-level, we make the choice of models at the layers of deep models. On three types of DA tasks, we outperform the state-of-the-art algorithms.
PDF

点此查看论文截图

Scalable Modular Synthetic Data Generation for Advancing Aerial Autonomy

Authors:Mehrnaz Sabet, Praveen Palanisamy, Sakshi Mishra

Harnessing the benefits of drones for urban innovation at scale requires reliable aerial autonomy. One major barrier to advancing aerial autonomy has been collecting large-scale aerial datasets for training machine learning models. Due to costly and time-consuming real-world data collection through deploying drones, there has been an increasing shift towards using synthetic data for training models in drone applications. However, to increase generalizability of trained policies on synthetic data, incorporating domain randomization into the data generation workflow for addressing the sim-to-real problem becomes crucial. Current synthetic data generation tools either lack domain randomization or rely heavily on manual workload or real samples for configuring and generating diverse realistic simulation scenes. These dependencies limit scalability of the data generation workflow. Accordingly, there is a major challenge in balancing generalizability and scalability in synthetic data generation. To address these gaps, we introduce a modular scalable data generation workflow tailored to aerial autonomy applications. To generate realistic configurations of simulation scenes while increasing diversity, we present an adaptive layered domain randomization approach that creates a type-agnostic distribution space for assets over the base map of the environments before pose generation for drone trajectory. We leverage high-level scene structures to automatically place assets in valid configurations and then extend the diversity through obstacle generation and global parameter randomization. We demonstrate the effectiveness of our method in automatically generating diverse configurations and datasets and show its potential for downstream performance optimization. Our work contributes to generating enhanced benchmark datasets for training models that can generalize better to real-world situations.
PDF 22 pages, 10 figures

点此查看论文截图

Syntax-Guided Domain Adaptation for Aspect-based Sentiment Analysis

Authors:Anguo Dong, Cuiyun Gao, Yan Jia, Qing Liao, Xuan Wang, Lei Wang, Jing Xiao

Aspect-based sentiment analysis (ABSA) aims at extracting opinionated aspect terms in review texts and determining their sentiment polarities, which is widely studied in both academia and industry. As a fine-grained classification task, the annotation cost is extremely high. Domain adaptation is a popular solution to alleviate the data deficiency issue in new domains by transferring common knowledge across domains. Most cross-domain ABSA studies are based on structure correspondence learning (SCL), and use pivot features to construct auxiliary tasks for narrowing down the gap between domains. However, their pivot-based auxiliary tasks can only transfer knowledge of aspect terms but not sentiment, limiting the performance of existing models. In this work, we propose a novel Syntax-guided Domain Adaptation Model, named SDAM, for more effective cross-domain ABSA. SDAM exploits syntactic structure similarities for building pseudo training instances, during which aspect terms of target domain are explicitly related to sentiment polarities. Besides, we propose a syntax-based BERT mask language model for further capturing domain-invariant features. Finally, to alleviate the sentiment inconsistency issue in multi-gram aspect terms, we introduce a span-based joint aspect term and sentiment analysis module into the cross-domain End2End ABSA. Experiments on five benchmark datasets show that our model consistently outperforms the state-of-the-art baselines with respect to Micro-F1 metric for the cross-domain End2End ABSA task.
PDF

点此查看论文截图

Estimating Soft Labels for Out-of-Domain Intent Detection

Authors:Hao Lang, Yinhe Zheng, Jian Sun, Fei Huang, Luo Si, Yongbin Li

Out-of-Domain (OOD) intent detection is important for practical dialog systems. To alleviate the issue of lacking OOD training samples, some works propose synthesizing pseudo OOD samples and directly assigning one-hot OOD labels to these pseudo samples. However, these one-hot labels introduce noises to the training process because some hard pseudo OOD samples may coincide with In-Domain (IND) intents. In this paper, we propose an adaptive soft pseudo labeling (ASoul) method that can estimate soft labels for pseudo OOD samples when training OOD detectors. Semantic connections between pseudo OOD samples and IND intents are captured using an embedding graph. A co-training framework is further introduced to produce resulting soft labels following the smoothness assumption, i.e., close samples are likely to have similar labels. Extensive experiments on three benchmark datasets show that ASoul consistently improves the OOD detection performance and outperforms various competitive baselines.
PDF EMNLP2022 Main Track Long Paper (Oral presentation)

点此查看论文截图

文章作者: 木子已
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 木子已 !
  目录