Domain Adaptation


Updated 2023-06-04

Maximal Domain Independent Representations Improve Transfer Learning

Authors: Adrian Shuai Li, Elisa Bertino, Xuan-Hong Dang, Ankush Singla, Yuhai Tu, Mark N Wegman

Domain adaptation (DA) adapts a training dataset from a source domain for use in a learning task in a target domain in combination with data available at the target. One popular approach for DA is to create a domain-independent representation (DIRep) learned by a generator from all input samples and then train a classifier on top of it using all labeled samples. A domain discriminator is added to train the generator adversarially to exclude domain specific features from the DIRep. However, this approach tends to generate insufficient information for accurate classification learning. In this paper, we present a novel approach that integrates the adversarial model with a variational autoencoder. In addition to the DIRep, we introduce a domain-dependent representation (DDRep) such that information from both DIRep and DDRep is sufficient to reconstruct samples from both domains. We further penalize the size of the DDRep to drive as much information as possible to the DIRep, which maximizes the accuracy of the classifier in labeling samples in both domains. We empirically evaluate our model using synthetic datasets and demonstrate that spurious class-related features introduced in the source domain are successfully absorbed by the DDRep. This leaves a rich and clean DIRep for accurate transfer learning in the target domain. We further demonstrate its superior performance against other algorithms for a number of common image datasets. We also show we can take advantage of pretrained models.
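
The abstract describes the training signal only in words, so the following is a minimal sketch of how the pieces could combine, not the authors' actual loss. It assumes the "size" of the DDRep is measured as the usual VAE KL divergence to a standard normal prior; the function names and the weights `w_recon`, `w_size`, and `w_adv` are hypothetical.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for a single sample."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def generator_objective(cls_loss, recon_loss, disc_loss, ddrep_mu, ddrep_log_var,
                        w_recon=1.0, w_size=1.0, w_adv=1.0):
    """Scalar the generator would minimise (assumed form, for illustration only).

    cls_loss   : classifier loss on labeled samples, computed from the DIRep
    recon_loss : reconstruction loss from DIRep + DDRep together
    disc_loss  : domain discriminator loss on the DIRep; the generator wants
                 this to be high, hence the minus sign
    """
    # Assumption: penalising the KL term pushes information out of the DDRep.
    size_penalty = kl_to_standard_normal(ddrep_mu, ddrep_log_var)
    return cls_loss + w_recon * recon_loss + w_size * size_penalty - w_adv * disc_loss
```

Raising `w_size` in this sketch makes the DDRep more expensive to use, which is one way to read the abstract's goal of driving as much information as possible into the DIRep.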

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

Authors: Qingyue Wang, Liang Ding, Yanan Cao, Yibing Zhan, Zheng Lin, Shi Wang, Dacheng Tao, Li Guo

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to handle a variety of task-oriented dialogue domains without the cost of collecting in-domain data. Existing works mainly study common data- or model-level augmentation methods to enhance generalization but fail to effectively decouple the semantics of samples, limiting the zero-shot performance of DST. In this paper, we present a simple and effective “divide, conquer and combine” solution, which explicitly disentangles the semantics of seen data and improves performance and robustness with the mixture-of-experts mechanism. Specifically, we divide the seen data into semantically independent subsets and train corresponding experts; unseen samples are then mapped to these experts and inferred through our designed mixture-of-experts ensemble inference. Extensive experiments on MultiWOZ2.1 upon the T5-Adapter show our schema significantly and consistently improves the zero-shot performance, achieving the SOTA on settings without external knowledge, with only 10M trainable parameters.
Accepted to ACL 2023
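
The "combine" step can be pictured with a small sketch. The abstract does not say how unseen samples are mapped to experts, so this assumes each expert owns a centroid in some embedding space and is weighted by a softmax over negative squared distance. All names here (`expert_weights`, `ensemble_predict`, `temperature`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def expert_weights(sample_embedding, centroids, temperature=1.0):
    """Weight each expert by how close the sample is to that expert's subset."""
    scores = [-squared_distance(sample_embedding, c) / temperature for c in centroids]
    return softmax(scores)


def ensemble_predict(sample_embedding, centroids, expert_probs):
    """Mix the experts' label distributions using the weights above.

    expert_probs[i] is expert i's probability distribution over labels
    for this sample.
    """
    weights = expert_weights(sample_embedding, centroids)
    n_labels = len(expert_probs[0])
    return [sum(w * probs[k] for w, probs in zip(weights, expert_probs))
            for k in range(n_labels)]
```

A sample that sits on one centroid is answered almost entirely by that expert, while a sample halfway between two centroids receives an even blend of both experts' predictions.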

Neuronal Cell Type Classification using Deep Learning

Authors: Ofek Ophir, Orit Shefi, Ofir Lindenbaum

The brain is likely the most complex organ, given the variety of functions it controls, the number of cells it comprises, and their corresponding diversity. Studying and identifying neurons, the brain’s primary building blocks, is a crucial milestone and essential for understanding brain function in health and disease. Recent developments in machine learning have provided advanced abilities for classifying neurons. However, these methods remain black boxes with no explainability and reasoning. This paper aims to provide a robust and explainable deep-learning framework to classify neurons based on their electrophysiological activity. Our analysis is performed on data provided by the Allen Cell Types database containing a survey of biological features derived from single-cell recordings of mice and humans. First, we classify neuronal cell types in the mouse data to identify excitatory and inhibitory neurons. Then, neurons are categorized into their broad types in humans using domain adaptation from mouse data. Lastly, neurons are classified into sub-types based on transgenic mouse lines using deep neural networks in an explainable fashion. We show state-of-the-art results in dendrite-type classification of excitatory vs. inhibitory neurons and in transgenic mouse line classification. The model is also inherently interpretable, revealing the correlations between neuronal types and their electrophysiological properties.

FACT: Federated Adversarial Cross Training

Authors: Stefan Schrod, Jonas Lippl, Andreas Schäfer, Michael Altenbuchinger

Federated Learning (FL) facilitates distributed model development to aggregate multiple confidential data sources. The information transfer among clients can be compromised by distributional differences, i.e., by non-i.i.d. data. A particularly challenging scenario is the federated model adaptation to a target client without access to annotated data. We propose Federated Adversarial Cross Training (FACT), which uses the implicit domain differences between source clients to identify domain shifts in the target domain. In each round of FL, FACT cross initializes a pair of source clients to generate domain specialized representations which are then used as a direct adversary to learn a domain invariant data representation. We empirically show that FACT outperforms state-of-the-art federated, non-federated and source-free domain adaptation models on three popular multi-source-single-target benchmarks, and state-of-the-art Unsupervised Domain Adaptation (UDA) models on single-source-single-target experiments. We further study FACT’s behavior with respect to communication restrictions and the number of participating clients.

Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection

Authors: Erik Arakelyan, Arnav Arora, Isabelle Augenstein

Stance Detection is concerned with identifying the attitudes expressed by an author towards a target of interest. This task spans a variety of domains ranging from social media opinion identification to detecting the stance for a legal claim. However, the framing of the task varies within these domains, in terms of the data collection protocol, the label dictionary and the number of available annotations. Furthermore, these stance annotations are significantly imbalanced on a per-topic and inter-topic basis. These factors make multi-domain stance detection a challenging task, requiring standardization and domain adaptation. To overcome this challenge, we propose Topic Efficient StancE Detection (TESTED), consisting of a topic-guided diversity sampling technique and a contrastive objective that is used for fine-tuning a stance classifier. We evaluate the method on an existing benchmark of 16 datasets with in-domain (all topics seen) and out-of-domain (unseen topics) experiments. The results show that our method outperforms the state-of-the-art with an average increase of 3.5 F1 points in-domain, and is more generalizable with an average increase of 10.2 F1 points on out-of-domain evaluation while using ≤10% of the training data. We show that our sampling technique mitigates both inter- and per-topic class imbalances. Finally, our analysis demonstrates that the contrastive learning objective allows the model a more pronounced segmentation of samples with varying labels.
ACL 2023 (Oral)
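
To make the idea of sampling a small, topic-balanced training subset concrete, here is a simplified stand-in rather than the TESTED procedure itself. It assumes every example already carries a topic label and simply draws round-robin across topics until a budget is reached. The function name and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def topic_balanced_sample(examples, topic_of, budget, seed=0):
    """Pick up to `budget` examples, cycling over topics so none dominates.

    examples : list of arbitrary items
    topic_of : function mapping an item to its topic label (assumed given)
    budget   : maximum number of items to return
    """
    rng = random.Random(seed)
    by_topic = defaultdict(list)
    for ex in examples:
        by_topic[topic_of(ex)].append(ex)

    # Shuffle within each topic so the draw is random but reproducible.
    pools = []
    for topic in sorted(by_topic):
        pool = by_topic[topic][:]
        rng.shuffle(pool)
        pools.append(pool)

    picked = []
    while len(picked) < budget and any(pools):
        for pool in pools:
            if pool and len(picked) < budget:
                picked.append(pool.pop())
    return picked
```

With 90 examples from one topic and 10 from another, a budget of 10 yields five from each topic, which is the kind of imbalance mitigation the abstract refers to. The actual method additionally accounts for diversity within topics, which this sketch does not attempt.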

Knowledge-based Reasoning and Learning under Partial Observability in Ad Hoc Teamwork

Authors: Hasra Dodampegama, Mohan Sridharan

Ad hoc teamwork refers to the problem of enabling an agent to collaborate with teammates without prior coordination. Data-driven methods represent the state of the art in ad hoc teamwork. They use a large labeled dataset of prior observations to model the behavior of other agent types and to determine the ad hoc agent’s behavior. These methods are computationally expensive, lack transparency, and make it difficult to adapt to previously unseen changes, e.g., in team composition. Our recent work introduced an architecture that determined an ad hoc agent’s behavior based on non-monotonic logical reasoning with prior commonsense domain knowledge and predictive models of other agents’ behavior that were learned from limited examples. In this paper, we substantially expand the architecture’s capabilities to support: (a) online selection, adaptation, and learning of the models that predict the other agents’ behavior; and (b) collaboration with teammates in the presence of partial observability and limited communication. We illustrate and experimentally evaluate the capabilities of our architecture in two simulated multiagent benchmark domains for ad hoc teamwork: Fort Attack and Half Field Offense. We show that the performance of our architecture is comparable to or better than that of state-of-the-art data-driven baselines in both simple and complex scenarios, particularly in the presence of limited training data, partial observability, and changes in team composition.
17 pages, 3 figures

MOSAIC: Masked Optimisation with Selective Attention for Image Reconstruction

Authors: Pamuditha Somarathne, Tharindu Wickremasinghe, Amashi Niwarthana, A. Thieshanthan, Chamira U. S. Edussooriya, Dushan N. Wadduwage

Compressive sensing (CS) reconstructs images from sub-Nyquist measurements by solving a sparsity-regularized inverse problem. Traditional CS solvers use iterative optimizers with hand-crafted sparsifiers, while early data-driven methods directly learn an inverse mapping from the low-dimensional measurement space to the original image space. The latter outperforms the former, but is restricted to a pre-defined measurement domain. More recently, deep unrolling methods combine traditional proximal gradient methods and data-driven approaches to iteratively refine an image approximation. To achieve higher accuracy, it has also been suggested to learn both the sampling matrix and the choice of measurement vectors adaptively. Contrary to the current trend, in this work we hypothesize that a general inverse mapping from a random set of compressed measurements to the image domain exists for a given measurement basis, and can be learned. Such a model is single-shot, non-restrictive and does not parametrize the sampling process. To this end, we propose MOSAIC, a novel compressive sensing framework to reconstruct images given any random selection of measurements, sampled using a fixed basis. Motivated by the uneven distribution of information across measurements, MOSAIC incorporates an embedding technique to efficiently apply attention mechanisms on an encoded sequence of measurements, while dispensing with the need for unrolled deep networks. A range of experiments validate our proposed architecture as a promising alternative to existing CS reconstruction methods, achieving state-of-the-art reconstruction accuracy on standard datasets.
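
The measurement setting in the abstract (a random selection of measurements taken with a fixed basis) can be illustrated with a short sketch. It covers only the sampling side, not MOSAIC's reconstruction network, and the function name and arguments are hypothetical.

```python
import random


def measure(image_vector, basis_rows, num_measurements, seed=0):
    """Take a random subset of linear measurements from a fixed basis.

    image_vector     : flattened image as a list of numbers
    basis_rows       : fixed measurement basis, one row per possible measurement
    num_measurements : how many rows to sample (fewer than len(basis_rows)
                       in the compressive setting)
    Returns the chosen row indices and the corresponding inner products.
    """
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(basis_rows)), num_measurements))
    values = [sum(b * x for b, x in zip(basis_rows[i], image_vector))
              for i in indices]
    return indices, values
```

A reconstruction model in this setting would receive both `indices` and `values`, since the same basis can produce a different subset of measurements on every call.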

Post author: 木子已
Copyright notice: Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit 木子已 when reposting.