# 2022-09-06 更新

### Confidence Estimation for Object Detection in Document Images

Authors:Mélodie Boillet, Christopher Kermorvant, Thierry Paquet

Deep neural networks are becoming increasingly powerful and large and always require more labelled data to be trained. However, since annotating data is time-consuming, it is now necessary to develop systems that show good performance while learning on a limited amount of data. These data must be correctly chosen to obtain models that are still efficient. For this, the systems must be able to determine which data should be annotated to achieve the best results. In this paper, we propose four estimators to estimate the confidence of object detection predictions. The first two are based on Monte Carlo dropout, the third one on descriptive statistics and the last one on the detector posterior probabilities. In the active learning framework, the three first estimators show a significant improvement in performance for the detection of document physical pages and text lines compared to a random selection of images. We also show that the proposed estimator based on descriptive statistics can replace MC dropout, reducing the computational cost without compromising the performances.
PDF

### Generalised Co-Salient Object Detection

Authors:Jiawei Liu, Jing Zhang, Ruikai Cui, Kaihao Zhang, Nick Barnes

We propose a new setting that relaxes the assumption in the conventional CoSOD setting by allowing the presence of \enquote{noisy images} which do not share the common salient object. We call this new setting Generalised Co-Salient Object Detection (GCoSOD). We propose a novel random sampling based Generalised CoSOD Training (GCT) strategy to distill the awareness of inter-image absence of co-salient object into CoSOD models. It employs a Diverse Sampling Self-Supervised Learning (D$\text{S}^{3}$L) that, in addition to the provided supervised co-salient label, introduces additional self-supervised labels for images (being null that no co-salient object is present). Further, the random sampling process inherent in GCT enables the generation of a high-quality uncertainty map highlighting potential false-positive predictions at instance level. To evaluate the performance of CoSOD models under the GCoSOD setting, we propose two new testing datasets, namely CoCA-Common and CoCA-Zero, where a common salient object is partially present in the former and completely absent in the latter. Extensive experiments demonstrate that our proposed method significantly improves the performance of CoSOD models in terms of the performance under the GCoSOD setting as well as the model calibration degrees.
PDF

### Disentangle and Remerge: Interventional Knowledge Distillation for Few-Shot Object Detection from A Conditional Causal Perspective

Authors:Jiangmeng Li, Yanan Zhang, Wenwen Qiang, Lingyu Si, Chengbo Jiao, Xiaohui Hu, Changwen Zheng, Fuchun Sun

Few-shot learning models learn representations with limited human annotations, and such a learning paradigm demonstrates practicability in various tasks, e.g., image classification, object detection, etc. However, few-shot object detection methods suffer from an intrinsic defect that the limited training data makes the model cannot sufficiently explore semantic information. To tackle this, we introduce knowledge distillation to the few-shot object detection learning paradigm. We further run a motivating experiment, which demonstrates that in the process of knowledge distillation the empirical error of the teacher model degenerates the prediction performance of the few-shot object detection model, as the student. To understand the reasons behind this phenomenon, we revisit the learning paradigm of knowledge distillation on the few-shot object detection task from the causal theoretic standpoint, and accordingly, develop a Structural Causal Model. Following the theoretical guidance, we propose a backdoor adjustment-based knowledge distillation method for the few-shot object detection task, namely Disentangle and Remerge (D&R), to perform conditional causal intervention toward the corresponding Structural Causal Model. Theoretically, we provide an extended definition, i.e., general backdoor path, for the backdoor criterion, which can expand the theoretical application boundary of the backdoor criterion in specific cases. Empirically, the experiments on multiple benchmark datasets demonstrate that D&R can yield significant performance boosts in few-shot object detection.
PDF

### Improvised Aerial Object Detection approach for YOLOv3 Using Weighted Luminance

Authors:Sai Ganesh CS, Aouthithiye Barathwaj SR Y, R. Swethaa S, R. Azhagumurugan

Remote sensing is the image acquisition of a target without having physical contact with it. Nowadays remote sensing data is widely preferred due to its reduced image acquisition period. The remote sensing of ground targets is more challenging because of the various factors that affect the propagation of light through different mediums from a satellite acquisition. Several Convolutional Neural Network-based algorithms are being implemented in the field of remote sensing. Supervised learning is a machine learning technique where the data is labelled according to their classes prior to the training. In order to detect and classify the targets more accurately, YOLOv3, an algorithm based on bounding and anchor boxes is adopted. In order to handle the various effects of light travelling through the atmosphere, Grayscale based YOLOv3 configuration is introduced. For better prediction and for solving the Rayleigh scattering effect, RGB based grayscale algorithms are proposed. The acquired images are analysed and trained with the grayscale based YOLO3 algorithm for target detection. The results show that the grayscale-based method can sense the target more accurately and effectively than the traditional YOLOv3 approach.
PDF 14 pages, 4 figures, Journal Expert Systems with Applications

### Instance-Aware Observer Network for Out-of-Distribution Object Segmentation

Authors:Victor Besnier, Andrei Bursuc, David Picard, Alexandre Briot

Recent works on predictive uncertainty estimation have shown promising results on Out-Of-Distribution (OOD) detection for semantic segmentation. However, these methods struggle to precisely locate the point of interest in the image, i.e, the anomaly. This limitation is due to the difficulty of finegrained prediction at the pixel level. To address this issue, we build upon the recent ObsNet approach by providing object instance knowledge to the observer. We extend ObsNet by harnessing an instance-wise mask prediction. We use an additional, class agnostic, object detector to filter and aggregate observer predictions. Finally, we predict an unique anomaly score for each instance in the image. We show that our proposed method accurately disentangles in-distribution objects from OOD objects on three datasets.
PDF

目录