2022-10-18 更新
Prediction Calibration for Generalized Few-shot Semantic Segmentation
Authors:Zhihe Lu, Sen He, Da Li, Yi-Zhe Song, Tao Xiang
Generalized Few-shot Semantic Segmentation (GFSS) aims to segment each image pixel into either base classes with abundant training examples or novel classes with only a handful of (e.g., 1-5) training images per class. Compared to the widely studied Few-shot Semantic Segmentation FSS, which is limited to segmenting novel classes only, GFSS is much under-studied despite being more practical. Existing approach to GFSS is based on classifier parameter fusion whereby a newly trained novel class classifier and a pre-trained base class classifier are combined to form a new classifier. As the training data is dominated by base classes, this approach is inevitably biased towards the base classes. In this work, we propose a novel Prediction Calibration Network PCN to address this problem. Instead of fusing the classifier parameters, we fuse the scores produced separately by the base and novel classifiers. To ensure that the fused scores are not biased to either the base or novel classes, a new Transformer-based calibration module is introduced. It is known that the lower-level features are useful of detecting edge information in an input image than higher-level features. Thus, we build a cross-attention module that guides the classifier’s final prediction using the fused multi-level features. However, transformers are computationally demanding. Crucially, to make the proposed cross-attention module training tractable at the pixel level, this module is designed based on feature-score cross-covariance and episodically trained to be generalizable at inference time. Extensive experiments on PASCAL-$5^{i}$ and COCO-$20^{i}$ show that our PCN outperforms the state-the-the-art alternatives by large margins.
PDF Technical Report
点此查看论文截图
Weak-shot Semantic Segmentation by Transferring Semantic Affinity and Boundary
Authors:Siyuan Zhou, Li Niu, Jianlou Si, Chen Qian, Liqing Zhang
Weakly-supervised semantic segmentation (WSSS) with image-level labels has been widely studied to relieve the annotation burden of the traditional segmentation task. In this paper, we show that existing fully-annotated base categories can help segment objects of novel categories with only image-level labels, even if base categories and novel categories have no overlap. We refer to this task as weak-shot semantic segmentation, which could also be treated as WSSS with auxiliary fully-annotated categories. Recent advanced WSSS methods usually obtain class activation maps (CAMs) and refine them by affinity propagation. Based on the observation that semantic affinity and boundary are class-agnostic, we propose a method under the WSSS framework to transfer semantic affinity and boundary from base to novel categories. As a result, we find that pixel-level annotation of base categories can facilitate affinity learning and propagation, leading to higher-quality CAMs of novel categories. Extensive experiments on PASCAL VOC 2012 dataset prove that our method significantly outperforms WSSS baselines on novel categories.
PDF 29 pages, 8 figures
点此查看论文截图
Heterogeneous Feature Distillation Network for SAR Image Semantic Segmentation
Authors:Gao Mengyu, Dong Qiulei
Semantic segmentation for SAR (Synthetic Aperture Radar) images has attracted increasing attention in the remote sensing community recently, due to SAR’s all-time and all-weather imaging capability. However, SAR images are generally more difficult to be segmented than their EO (Electro-Optical) counterparts, since speckle noises and layovers are inevitably involved in SAR images. To address this problem, we investigate how to introduce EO features to assist the training of a SAR-segmentation model, and propose a heterogeneous feature distillation network for segmenting SAR images, called HFD-Net, where a SAR-segmentation student model gains knowledge from a pre-trained EO-segmentation teacher model. In the proposed HFD-Net, both the student and teacher models employ an identical architecture but different parameter configurations, and a heterogeneous feature distillation model is explored for transferring latent EO features from the teacher model to the student model and then enhancing the ability of the student model for SAR image segmentation. In addition, a heterogeneous feature alignment module is explored to aggregate multi-scale features for segmentation in each of the student model and teacher model. Extensive experimental results on two public datasets demonstrate that the proposed HFD-Net outperforms seven state-of-the-art SAR image semantic segmentation methods.
PDF