Domain Adaptation

Updated 2022-11-19

PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain Adaptative Semantic Segmentation

Authors: Mu Chen, Zhedong Zheng, Yi Yang, Tat-Seng Chua

Unsupervised Domain Adaptation (UDA) aims to enhance the generalization of the learned model to other domains. The domain-invariant knowledge is transferred from the model trained on labeled source domain, e.g., video game, to unlabeled target domains, e.g., real-world scenarios, saving annotation expenses. Existing UDA methods for semantic segmentation usually focus on minimizing the inter-domain discrepancy of various levels, e.g., pixels, features, and predictions, for extracting domain-invariant knowledge. However, the primary intra-domain knowledge, such as context correlation inside an image, remains underexplored. In an attempt to fill this gap, we propose a unified pixel- and patch-wise self-supervised learning framework, called PiPa, for domain adaptive semantic segmentation that facilitates intra-image pixel-wise correlations and patch-wise semantic consistency against different contexts. The proposed framework exploits the inherent structures of intra-domain images, which: (1) explicitly encourages learning the discriminative pixel-wise features with intra-class compactness and inter-class separability, and (2) motivates the robust feature learning of the identical patch against different contexts or fluctuations. Extensive experiments verify the effectiveness of the proposed method, which obtains competitive accuracy on the two widely-used UDA benchmarks, i.e., 75.6 mIoU on GTA to Cityscapes and 68.2 mIoU on Synthia to Cityscapes. Moreover, our method is compatible with other UDA approaches to further improve the performance without introducing extra parameters.
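
The idea of pixel-wise features with intra-class compactness and inter-class separability can be illustrated with a generic supervised contrastive loss over pixel embeddings. This is only a sketch under assumptions: the abstract does not give the loss, so the cosine similarity, the temperature value, and the function name `pixel_contrastive_loss` are illustrative choices rather than the authors' formulation.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def pixel_contrastive_loss(
    features: List[Sequence[float]],
    labels: List[int],
    temperature: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Generic supervised contrastive loss over pixel embeddings.

    Same-class pixels are pulled together and different-class pixels are
    pushed apart. Pixels without a same-class partner are skipped.
    """
    per_pixel = []
    for i, anchor in enumerate(features):
        positives = [j for j, lab in enumerate(labels) if j != i and lab == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity between the anchor and every other pixel.
        logits = {
            j: cosine(anchor, f) / temperature
            for j, f in enumerate(features)
            if j != i
        }
        log_denominator = math.log(sum(math.exp(z) for z in logits.values()))
        per_pixel.append(
            -sum(logits[j] - log_denominator for j in positives) / len(positives)
        )
    return sum(per_pixel) / len(per_pixel) if per_pixel else 0.0


labels = [0, 0, 1, 1]
# Embeddings that cluster by class versus embeddings that mix the classes.
clustered = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
loss_clustered = pixel_contrastive_loss(clustered, labels)
loss_mixed = pixel_contrastive_loss(mixed, labels)
```

In this toy case the clustered embeddings give a much smaller loss than the mixed ones, which is the behavior a compactness and separability objective is meant to reward.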


Model-agnostic Multi-Domain Learning with Domain-Specific Adapters for Action Recognition

Authors: Kazuki Omi, Jun Kimata, Toru Tamaki

In this paper, we propose a multi-domain learning model for action recognition. The proposed method inserts domain-specific adapters between the domain-independent layers of a backbone network. Unlike a multi-head network that switches only the classification heads, our model also switches the adapters, which facilitates learning feature representations universal to multiple domains. Unlike prior works, the proposed method is model-agnostic and does not assume a particular model structure. Experimental results on three popular action recognition datasets (HMDB51, UCF101, and Kinetics-400) demonstrate that the proposed method is more effective than a multi-head architecture and more efficient than separately training models for each domain.
PDF IEICE Transactions on Information and Systems, Vol. E105-D, No. 12, Dec. 2022
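
The switching scheme described in this abstract (shared backbone layers, with an adapter after each layer and a head chosen by domain) might look roughly like the toy sketch below. Everything here is a stand-in: the scalar `shared_layer`, the residual `Adapter`, the single-score head, and the numeric values are hypothetical and are not the authors' architecture.

```python
from typing import Dict, List

Vector = List[float]


def shared_layer(x: Vector, scale: float) -> Vector:
    # Stand-in for a domain-independent backbone layer: scale, then ReLU.
    return [max(0.0, scale * v) for v in x]


class Adapter:
    # Hypothetical residual adapter computing x + gain * x.
    # A real adapter would be a small learned module.
    def __init__(self, gain: float) -> None:
        self.gain = gain

    def __call__(self, x: Vector) -> Vector:
        return [v + self.gain * v for v in x]


class MultiDomainModel:
    def __init__(
        self,
        backbone_scales: List[float],
        adapter_gains: Dict[str, List[float]],
        head_weights: Dict[str, Vector],
    ) -> None:
        self.backbone_scales = backbone_scales  # shared by all domains
        self.adapters = {
            domain: [Adapter(g) for g in gains]
            for domain, gains in adapter_gains.items()
        }
        self.heads = head_weights  # one linear head per domain

    def forward(self, x: Vector, domain: str) -> float:
        # The same backbone layers run for every domain; only the adapters
        # and the final head are selected by the domain name.
        for scale, adapter in zip(self.backbone_scales, self.adapters[domain]):
            x = adapter(shared_layer(x, scale))
        return sum(w * v for w, v in zip(self.heads[domain], x))


model = MultiDomainModel(
    backbone_scales=[1.0, 2.0],
    adapter_gains={"hmdb51": [0.0, 0.0], "ucf101": [1.0, 0.0]},
    head_weights={"hmdb51": [1.0, 1.0], "ucf101": [1.0, 1.0]},
)
score_hmdb = model.forward([1.0, -1.0], "hmdb51")
score_ucf = model.forward([1.0, -1.0], "ucf101")
```

The two heads are deliberately identical here, so the different scores for the same input come from the adapters alone. That is the difference from a multi-head network, where only the head would change.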


Cognitive Simplification Operations Improve Text Simplification

Authors: Eytan Chamovitz, Omri Abend

Text Simplification (TS) is the task of converting a text into a form that is easier to read while maintaining the meaning of the original text. A sub-task of TS is Cognitive Simplification (CS), converting text to a form that is readily understood by people with cognitive disabilities without rendering it childish or simplistic. This sub-task has yet to be explored with neural methods in NLP, and resources for it are scarcely available. In this paper, we present a method for incorporating knowledge from the cognitive accessibility domain into a TS model, by introducing an inductive bias regarding what simplification operations to use. We show that by adding this inductive bias to a TS-trained model, it is able to adapt better to CS without ever seeing CS data, and outperform a baseline model on a traditional TS benchmark. In addition, we provide a novel test dataset for CS, and analyze the differences between CS corpora and existing TS corpora, in terms of how simplification operations are applied.
PDF 25 pages, 7 figures, 8 tables, uses emnlp2022.sty, to be published in CoNLL 2022


Cross-Mode Knowledge Adaptation for Bike Sharing Demand Prediction using Domain-Adversarial Graph Neural Networks

Authors: Yuebing Liang, Guan Huang, Zhan Zhao

For bike sharing systems, demand prediction is crucial to ensure the timely re-balancing of available bikes according to predicted demand. Existing methods for bike sharing demand prediction are mostly based on its own historical demand variation, essentially regarding it as a closed system and neglecting the interaction between different transportation modes. This interaction is particularly important for bike sharing because it is often used to complement travel through other modes (e.g., public transit). Despite some recent progress, no existing method is capable of leveraging spatiotemporal information from multiple modes while explicitly considering the distribution discrepancy between them, which can easily lead to negative transfer. To address these challenges, this study proposes a domain-adversarial multi-relational graph neural network (DA-MRGNN) for bike sharing demand prediction with multimodal historical data as input. A temporal adversarial adaptation network is introduced to extract shareable features from demand patterns of different modes. To capture correlations between spatial units across modes, we adapt a multi-relational graph neural network (MRGNN) considering both cross-mode similarity and difference. In addition, an explainable GNN technique is developed to understand how our proposed model makes predictions. Extensive experiments are conducted using real-world bike sharing, subway and ride-hailing data from New York City. The results demonstrate the superior performance of our proposed approach compared to existing methods and the effectiveness of different model components.
PDF arXiv admin note: substantial text overlap with arXiv:2203.10961


Cross-domain Federated Adaptive Prompt Tuning for CLIP

Authors: Shangchao Su, Mingzhao Yang, Bin Li, Xiangyang Xue

Federated learning (FL) allows multiple parties to collaboratively train a global model without disclosing their data. Existing research often requires all model parameters to participate in the training procedure. However, with the advent of powerful pre-trained models, it becomes possible to achieve higher performance with fewer learnable parameters in FL. In this paper, we propose a federated adaptive prompt tuning algorithm, FedAPT, for cross-domain federated image classification scenarios with the vision-language pre-trained model CLIP, which brings its strong representation ability to FL. Compared with direct federated prompt tuning, our core idea is to adaptively unlock specific domain knowledge for each test sample in order to provide them with personalized prompts. To implement this idea, we design an adaptive prompt tuning module, which consists of a global prompt, an adaptive network, and some keys. The server randomly generates a set of keys and assigns a unique key to each client. Then all clients cooperatively train the global adaptive network and global prompt with the local datasets and the frozen keys. Ultimately, the global aggregation model can assign a personalized prompt to CLIP based on the domain features of each test sample. We perform extensive experiments on two multi-domain image classification datasets. The results show that FedAPT can achieve better performance with less than 10% of the number of parameters of the fully trained model, and the global model can perform well in different client domains simultaneously.


Secure Domain Adaptation with Multiple Sources

Authors: Serban Stan, Mohammad Rostami

Multi-source unsupervised domain adaptation (MUDA) is a framework to address the challenge of annotated data scarcity in a target domain via transferring knowledge from multiple annotated source domains. When the source domains are distributed, data privacy and security can become significant concerns and protocols may limit data sharing, yet existing MUDA methods overlook these constraints. We develop an algorithm to address MUDA when source domain data cannot be shared with the target or across the source domains. Our method is based on aligning the distributions of source and target domains indirectly via estimating the source feature embeddings and predicting over a confidence based combination of domain specific model predictions. We provide theoretical analysis to support our approach and conduct empirical experiments to demonstrate that our algorithm is effective.
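
A confidence-based combination of domain-specific predictions could be sketched as follows. The abstract does not say how confidence is measured, so this example assumes that each source model's confidence is its maximum class probability and that weights are the normalized confidences. The function name `combine_predictions` is hypothetical.

```python
from typing import List, Tuple


def combine_predictions(
    probs_per_model: List[List[float]],
) -> Tuple[List[float], List[float]]:
    """Mix per-source-model class probabilities for one target sample.

    Assumption: a model's confidence is its highest class probability,
    and the mixing weights are the confidences normalized to sum to 1.
    """
    confidences = [max(p) for p in probs_per_model]
    total = sum(confidences)
    weights = [c / total for c in confidences]
    num_classes = len(probs_per_model[0])
    combined = [
        sum(w * p[k] for w, p in zip(weights, probs_per_model))
        for k in range(num_classes)
    ]
    return combined, weights


# Two source models scoring the same target sample over two classes.
source_probs = [[0.9, 0.1], [0.5, 0.5]]
combined, weights = combine_predictions(source_probs)
predicted_class = combined.index(max(combined))
```

Only model outputs are combined in this sketch, and no source data is touched, which fits the setting where source data cannot be shared. The more confident model gets the larger weight.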


VeLO: Training Versatile Learned Optimizers by Scaling Up

Authors: Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. Meta-trained with approximately four thousand TPU-months of compute on a wide variety of optimization tasks, our optimizer not only exhibits compelling performance, but optimizes in interesting and unexpected ways. It requires no hyperparameter tuning, instead automatically adapting to the specifics of the problem being optimized. We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive optimizer benchmark suite with baselines at
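
To make the interface of an optimizer that "ingests gradients and outputs parameter updates" concrete, here is a toy sketch. It is not VeLO: the weights are hand-picked instead of meta-trained, and the choice of input features (gradient and a momentum average), the constants, and the name `learned_update` are all assumptions for illustration.

```python
import math
from typing import Tuple

# Hypothetical, hand-picked weights. A real learned optimizer would
# obtain these by meta-training over many optimization tasks.
W_HIDDEN = [[1.0, 0.0], [0.0, 1.0]]  # maps [gradient, momentum] to 2 hidden units
W_OUT = [0.5, 0.5]                   # maps hidden units to a scalar update
STEP_SCALE = 0.1
MOMENTUM_DECAY = 0.9


def learned_update(grad: float, momentum: float) -> Tuple[float, float]:
    """Tiny per-parameter network: gradient features in, update out."""
    momentum = MOMENTUM_DECAY * momentum + (1.0 - MOMENTUM_DECAY) * grad
    inputs = [grad, momentum]
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in W_HIDDEN
    ]
    update = -STEP_SCALE * sum(w * h for w, h in zip(W_OUT, hidden))
    return update, momentum


def loss(x: float) -> float:
    return x * x


def grad_loss(x: float) -> float:
    return 2.0 * x


# Apply the update network to a one-parameter quadratic problem.
x, momentum = 3.0, 0.0
initial_loss = loss(x)
for _ in range(20):
    update, momentum = learned_update(grad_loss(x), momentum)
    x += update
final_loss = loss(x)
```

The caller only supplies gradients and applies whatever update comes back. The update network here does not take a learning rate as input, and its fixed constants stand in for what meta-training would learn.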


Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Authors: Rui Liu, Yahong Han, Yaowei Wang, Qi Tian

Domain adaptive object detection (DAOD) aims to improve the generalization ability of detectors when the training and test data are from different domains. Considering the significant domain gap, some typical methods, e.g., CycleGAN-based methods, adopt an intermediate domain to bridge the source and target domains progressively. However, the CycleGAN-based intermediate domain lacks pixel- or instance-level supervision for object detection, which leads to semantic differences. To address this problem, in this paper, we introduce a Frequency Spectrum Augmentation Consistency (FSAC) framework with four different low-frequency filter operations. In this way, we can obtain a series of augmented data as the intermediate domain. Concretely, we propose a two-stage optimization framework. In the first stage, we utilize all the original and augmented source data to train an object detector. In the second stage, augmented source and target data with pseudo labels are adopted to perform self-training for prediction consistency. A teacher model optimized using Mean Teacher is used to further revise the pseudo labels. In the experiments, we evaluate our method on single- and compound-target DAOD separately, which demonstrates the effectiveness of our method.
PDF for further study
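
In the Mean Teacher scheme mentioned in this abstract, the teacher's weights are usually kept as an exponential moving average (EMA) of the student's weights. A minimal sketch of that update follows. The flat list of floats standing in for model parameters, the decay value, and the name `ema_update` are illustrative assumptions, and the frequency-filter augmentation itself is not shown.

```python
from typing import List


def ema_update(
    teacher: List[float], student: List[float], alpha: float = 0.99
) -> List[float]:
    """Return new teacher weights: alpha * teacher + (1 - alpha) * student."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


# Toy parameters: the teacher moves slowly toward a fixed student.
teacher_weights = [1.0, 0.0]
student_weights = [0.0, 1.0]
one_step = ema_update(teacher_weights, student_weights, alpha=0.9)

many_steps = teacher_weights
for _ in range(100):
    many_steps = ema_update(many_steps, student_weights, alpha=0.9)
```

Because the teacher changes slowly, its predictions tend to be more stable than the student's, which is why it is used to refine pseudo labels during self-training.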


Renewing Iterative Self-labeling Domain Adaptation with Application to the Spine Motion Prediction

Authors: Gecheng Chen, Yu Zhou, Xudong Zhang, Rui Tuo

The area of transfer learning comprises supervised machine learning methods that address the case where the training and testing data have different input feature spaces or distributions. In this work, we propose a novel transfer learning algorithm called Renewing Iterative Self-labeling Domain Adaptation (Re-ISDA).


Author: 木子已
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 when reposting.