Object Detection performance variation on compressed satellite image datasets with iquaflow
Authors:Pau Gallés, Katalin Takats, Javier Marin
Much work has gone into maximizing the performance of predictive models on images. Fewer studies examine how resilient these models are when they are trained on image datasets whose original quality has been altered. Yet this problem is common in industry. Earth observation satellites, which capture many images, are a good example. An orbiting satellite has limited energy and limited connection time with the ground, and both must be used carefully. One way to mitigate this is to compress the images on board before downloading them. The compression level can be tuned to the intended use of the image and the requirements of the application. We present iquaflow, a new software tool designed to study how image quality and model performance vary when an image dataset is altered. We also present a showcase study of oriented object detection models applied to the public DOTA dataset (Xia et al., CVPR 2018) at different compression levels. The study identifies the optimal compression point and demonstrates the usefulness of iquaflow.
Motion-based Post-Processing: Using Kalman Filter to Exclude Similar Targets in Underwater Object Tracking
Authors:Yunfeng Li, Bo Wang, Ye Li, Wei Huo, Zhuoyan Liu
A visual tracker consists of a network and a post-processing stage. Despite the color distortion and low contrast of underwater images, advanced trackers remain very competitive in underwater object tracking because deep learning enables their networks to discriminate the appearance features of the target. However, underwater object tracking faces a second problem. Underwater targets such as fish and dolphins usually appear in groups, and creatures of the same species usually have similar appearance features, so it is challenging for the network alone to distinguish such subtle differences. Existing detection-based post-processing only reflects the detection result of a single frame and cannot locate the real target among similar targets. In this paper, we propose a new motion-based post-processing strategy that uses a Kalman filter (KF) to maintain the motion information of the target and exclude similar targets nearby. Specifically, we combine the KF-predicted box with the candidate boxes in the response map and their confidences to compute a candidate location score, which identifies the real target. Our method neither changes the network structure nor requires additional training of the tracker, so it can be quickly applied to other tracking fields that face the similar-target problem. We applied our method to SOTA trackers and demonstrated its effectiveness on UOT100 and UTB180. For OSTrack on similar-target subsequences, our method improves the AUC by more than 3% on average, and the precision and normalized precision by more than 3.5% on average. The results show that our method is broadly compatible in handling similar-target problems and can enhance tracker performance in combination with other methods. More details can be found at https://github.com/LiYunfengLYF/KF_in_underwater_trackers.
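The motion-based selection described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the constant-velocity motion model, the noise values q and r, the (cx, cy, w, h) box format, and the score "confidence times IoU with the KF-predicted box" are all assumptions, since the abstract does not give the exact candidate location score. The names AxisKalman, BoxKalman, iou, and select_target are hypothetical.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate (state: position, velocity)."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.p, self.v = float(pos), 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # assumed process / measurement noise

    def predict(self):
        (a, b), (c, d) = self.P
        self.p += self.v                   # x' = F x with F = [[1, 1], [0, 1]]
        # P' = F P F^T + q * I, written out for the 2x2 case
        self.P = [[a + b + c + d + self.q, b + d], [c + d, d + self.q]]
        return self.p

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                     # innovation covariance (H = [1, 0])
        k0, k1 = a / s, c / s              # Kalman gain
        y = z - self.p                     # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


class BoxKalman:
    """Tracks the box center with two AxisKalman filters; size is copied from the last box."""

    def __init__(self, box):
        self.fx, self.fy = AxisKalman(box[0]), AxisKalman(box[1])
        self.w, self.h = box[2], box[3]

    def predict(self):
        return (self.fx.predict(), self.fy.predict(), self.w, self.h)

    def update(self, box):
        self.fx.update(box[0])
        self.fy.update(box[1])
        self.w, self.h = box[2], box[3]


def iou(a, b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    iw = max(0.0, min(a[0] + a[2] / 2, b[0] + b[2] / 2) - max(a[0] - a[2] / 2, b[0] - b[2] / 2))
    ih = max(0.0, min(a[1] + a[3] / 2, b[1] + b[3] / 2) - max(a[1] - a[3] / 2, b[1] - b[3] / 2))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def select_target(pred_box, candidates):
    """Return the index of the candidate (box, confidence) with the highest assumed score."""
    scores = [conf * iou(pred_box, box) for box, conf in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

In use, the tracker would call predict() once per frame, score the candidate boxes against the predicted box, and call update() with the selected box. A high-confidence distractor far from the predicted position then receives a low score even if its appearance matches the target.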
EARL: An Elliptical Distribution aided Adaptive Rotation Label Assignment for Oriented Object Detection in Remote Sensing Images
Authors:Jian Guan, Mingjie Xie, Youtian Lin, Guangjun He, Pengming Feng
Label assignment is often employed in recent convolutional neural network (CNN) based detectors to determine positive and negative samples during training. However, we note that current label assignment strategies rarely take full account of the characteristics of targets in remote sensing images, such as large variations in orientation, aspect ratio and scale, which leads to insufficient sampling. In this paper, we propose an Elliptical Distribution aided Adaptive Rotation Label Assignment (EARL) that selects higher-quality positive samples in oriented detectors and yields better performance. Concretely, to avoid inadequate sampling of targets with extreme scales, an adaptive scale sampling (ADS) strategy dynamically selects samples on different feature levels according to the scales of the targets. To enhance ADS, positive samples are selected following a dynamic elliptical distribution (DED), which further exploits the orientation and shape properties of the targets. Moreover, a spatial distance weighting (SDW) module is introduced to mitigate the influence of low-quality samples on detection performance. Extensive experiments on popular remote sensing datasets, such as DOTA and HRSC2016, demonstrate the effectiveness and superiority of the proposed EARL. Without bells and whistles, and integrated into a simple structure, it achieves an mAP of 72.87 on the DOTA dataset, which outperforms current state-of-the-art anchor-free detectors and is comparable to anchor-based methods. The source code will be available at https://github.com/Justlovesmile/EARL
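To illustrate the idea of elliptical sampling, the sketch below keeps only the candidate points that fall inside an ellipse inscribed in an oriented box. It is a simplified, static stand-in for the DED described above: the abstract does not specify how the ellipse changes dynamically or how SDW computes its weights, so the (cx, cy, w, h, theta) box format, the shrink parameter, and the function names are assumptions made for illustration.

```python
import math


def elliptical_distance(point, box):
    """Normalized elliptical distance of a point from the center of an oriented box.

    box = (cx, cy, w, h, theta), theta in radians, w measured along the box's own x-axis.
    Values <= 1 lie inside the ellipse inscribed in the box.
    """
    cx, cy, w, h, theta = box
    dx, dy = point[0] - cx, point[1] - cy
    # Rotate the offset into the box's local frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    return math.sqrt((u / (w / 2)) ** 2 + (v / (h / 2)) ** 2)


def select_positive_points(points, box, shrink=1.0):
    """Keep points inside the inscribed ellipse, optionally shrunk by a factor in (0, 1]."""
    return [p for p in points if elliptical_distance(p, box) <= shrink]
```

Compared with sampling every point inside the rotated rectangle, the ellipse drops points near the corners, which tend to lie on background for elongated, rotated objects. The normalized distance could also serve as a basis for down-weighting samples far from the center, in the spirit of SDW.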
DR-WLC: Dimensionality Reduction cognition for object detection and pose estimation by Watching, Learning and Checking
Authors:Yu Gao, Xi Xu, Tianji Jiang, Siyuan Chen, Yi Yang, Yufeng Yue, Mengyin Fu
Object detection and pose estimation are difficult tasks in robotics and autonomous driving. Existing object detection and pose estimation methods mostly train on data of the same dimensionality as the task. For example, 2D object detection usually requires a large amount of costly 2D annotation data. Using higher-dimensional information to supervise lower-dimensional tasks is a feasible way to reduce dataset size. In this work, we propose DR-WLC, a dimensionality reduction cognitive model that performs object detection and pose estimation at the same time. The model requires only 3D models of the objects and unlabeled environment images (with or without objects) for training. In addition, a bounding box generation strategy is proposed to link the 3D model to the 2D object detection task. Experiments show that our method can accomplish these tasks without any manual annotations and is easy to deploy in practical applications. Source code is at https://github.com/IN2-ViAUn/DR-WLC.
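One common way to relate a 3D model to a 2D detection box is to project the model's vertices through a pinhole camera and take the extent of the projected points. The sketch below shows that idea only; the abstract does not describe the authors' bounding box generation strategy, so the pinhole model, the parameter names (fx, fy, cx, cy), and the function project_bbox are assumptions.

```python
def project_bbox(vertices, rotation, translation, fx, fy, cx, cy):
    """Project 3D model vertices into the image and return (u_min, v_min, u_max, v_max).

    vertices: iterable of (x, y, z) points in the object frame.
    rotation: 3x3 nested list, translation: 3-vector (object pose in the camera frame).
    Returns None when no vertex lies in front of the camera.
    """
    us, vs = [], []
    for vertex in vertices:
        # Camera-frame coordinates: R @ vertex + t
        x, y, z = (
            sum(r * c for r, c in zip(row, vertex)) + t
            for row, t in zip(rotation, translation)
        )
        if z <= 0:  # behind the camera, cannot be projected
            continue
        us.append(fx * x / z + cx)
        vs.append(fy * y / z + cy)
    if not us:
        return None
    return (min(us), min(vs), max(us), max(vs))
```

Because the pose used to render or place the model is known, a box obtained this way needs no manual labeling, which is consistent with the annotation-free training described above.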
Towards Spatial Equilibrium Object Detection
Authors:Zhaohui Zheng, Yuming Chen, Qibin Hou, Xiang Li, Ming-Ming Cheng
Semantic objects are unevenly distributed over images. In this paper, we study the spatial disequilibrium problem of modern object detectors and propose to quantify this "spatial bias" by measuring detection performance over zones. Surprisingly, our analysis shows that the spatial imbalance of objects has a great impact on detection performance, limiting the robustness of detection applications. This motivates us to design a more general measurement, termed Spatial equilibrium Precision (SP), to better characterize the performance of object detectors. Furthermore, we present a spatial equilibrium label assignment (SELA) that alleviates the spatial disequilibrium problem by injecting a prior spatial weight into the optimization process of detectors. Extensive experiments on PASCAL VOC, MS COCO, and 3 application datasets of face mask, fruit, and helmet images demonstrate the advantages of our method. Our findings challenge the conventional understanding of object detectors and show that spatial equilibrium is indispensable. We hope these discoveries will stimulate the community to rethink what makes an excellent object detector. All the source code, evaluation protocols, and tutorials are publicly available at https://github.com/Zzh-tju/ZoneEval
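Measuring performance over zones can be sketched as follows. The abstract does not define the zones or the SP formula, so this example assumes concentric rectangular rings indexed by how far an object's center lies from the nearest image border, and it reports a simple per-zone recall rather than SP. The function names and the default of five zones are hypothetical.

```python
def zone_index(cx, cy, img_w, img_h, n_zones=5):
    """Map a box center to a ring index: 0 is the outermost ring, n_zones - 1 the center."""
    # Normalized distance to the nearest border, in [0, 0.5].
    d = min(cx / img_w, 1 - cx / img_w, cy / img_h, 1 - cy / img_h)
    d = max(0.0, d)  # clamp centers that fall outside the image
    return min(int(d * 2 * n_zones), n_zones - 1)


def per_zone_recall(ground_truths, img_w, img_h, n_zones=5):
    """Compute recall per zone.

    ground_truths: iterable of (cx, cy, detected) where detected is True if the
    object was matched by a detection. Zones with no objects yield None.
    """
    hits = [0] * n_zones
    totals = [0] * n_zones
    for cx, cy, detected in ground_truths:
        z = zone_index(cx, cy, img_w, img_h, n_zones)
        totals[z] += 1
        hits[z] += 1 if detected else 0
    return [h / t if t else None for h, t in zip(hits, totals)]
```

Comparing the per-zone values makes a spatial bias visible: a detector with uniform behavior would score similarly in every ring, while a large gap between the outer and central rings indicates the disequilibrium discussed above.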