Authors:Yipeng Gao, Lingxiao Yang, Yunmu Huang, Song Xie, Shiyong Li, Wei-shi Zheng
Under the domain shift, cross-domain few-shot object detection aims to adapt object detectors in the target domain with a few annotated target data. There exists two significant challenges: (1) Highly insufficient target domain data; (2) Potential over-adaptation and misleading caused by inappropriately amplified target samples without any restriction. To address these challenges, we propose an adaptive method consisting of two parts. First, we propose an adaptive optimization strategy to select augmented data similar to target samples rather than blindly increasing the amount. Specifically, we filter the augmented candidates which significantly deviate from the target feature distribution in the very beginning. Second, to further relieve the data limitation, we propose the multi-level domain-aware data augmentation to increase the diversity and rationality of augmented data, which exploits the cross-image foreground-background mixture. Experiments show that the proposed method achieves state-of-the-art performance on multiple benchmarks.
PDF Accepted in ECCV 2022
AirFi: Empowering WiFi-based Passive Human Gesture Recognition to Unseen Environment via Domain Generalization
Authors:Dazhuo Wang, Jianfei Yang, Wei Cui, Lihua Xie, Sumei Sun
WiFi-based smart human sensing technology enabled by Channel State Information (CSI) has received great attention in recent years. However, CSI-based sensing systems suffer from performance degradation when deployed in different environments. Existing works solve this problem by domain adaptation using massive unlabeled high-quality data from the new environment, which is usually unavailable in practice. In this paper, we propose a novel augmented environment-invariant robust WiFi gesture recognition system named AirFi that deals with the issue of environment dependency from a new perspective. The AirFi is a novel domain generalization framework that learns the critical part of CSI regardless of different environments and generalizes the model to unseen scenarios, which does not require collecting any data for adaptation to the new environment. AirFi extracts the common features from several training environment settings and minimizes the distribution differences among them. The feature is further augmented to be more robust to environments. Moreover, the system can be further improved by few-shot learning techniques. Compared to state-of-the-art methods, AirFi is able to work in different environment settings without acquiring any CSI data from the new environment. The experimental results demonstrate that our system remains robust in the new environment and outperforms the compared systems.
PDF 11 pages, 3 figures, paper under review
Authors:Hui Su, Xiao Zhou, Houjing Yu, Yuwen Chen, Zilin Zhu, Yang Yu, Jie Zhou
Large Language Models pre-trained with self-supervised learning have demonstrated impressive zero-shot generalization capabilities on a wide spectrum of tasks. In this work, we present WeLM: a well-read pre-trained language model for Chinese that is able to seamlessly perform different types of tasks with zero or few-shot demonstrations. WeLM is trained with 10B parameters by “reading” a curated high-quality corpus covering a wide range of topics. We show that WeLM is equipped with broad knowledge on various domains and languages. On 18 monolingual (Chinese) tasks, WeLM can significantly outperform existing pre-trained models with similar sizes and match the performance of models up to 25 times larger. WeLM also exhibits strong capabilities in multi-lingual and code-switching understanding, outperforming existing multilingual language models pre-trained on 30 languages. Furthermore, We collected human-written prompts for a large set of supervised datasets in Chinese and fine-tuned WeLM with multi-prompted training. The resulting model can attain strong generalization on unseen types of tasks and outperform the unsupervised WeLM in zero-shot learning. Finally, we demonstrate that WeLM has basic skills at explaining and calibrating the decisions from itself, which can be promising directions for future research. Our models can be applied from https://welm.weixin.qq.com/docs/api/.
Authors:Lewis Tunstall, Nils Reimers, Unso Eun Seo Jo, Luke Bates, Daniel Korat, Moshe Wasserblat, Oren Pereg
Recent few-shot methods, such as parameter-efficient fine-tuning (PEFT) and pattern exploiting training (PET), have achieved impressive results in label-scarce settings. However, they are difficult to employ since they are subject to high variability from manually crafted prompts, and typically require billion-parameter language models to achieve high accuracy. To address these shortcomings, we propose SetFit (Sentence Transformer Fine-tuning), an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers (ST). SetFit works by first fine-tuning a pretrained ST on a small number of text pairs, in a contrastive Siamese manner. The resulting model is then used to generate rich text embeddings, which are used to train a classification head. This simple framework requires no prompts or verbalizers, and achieves high accuracy with orders of magnitude less parameters than existing techniques. Our experiments show that SetFit obtains comparable results with PEFT and PET techniques, while being an order of magnitude faster to train. We also show that SetFit can be applied in multilingual settings by simply switching the ST body. Our code is available at https://github.com/huggingface/setfit and our datasets at https://huggingface.co/setfit .