2022-08-18 更新
Object Discovery via Contrastive Learning for Weakly Supervised Object Detection
Authors:Jinhwan Seo, Wonho Bae, Danica J. Sutherland, Junhyug Noh, Daijin Kim
Weakly Supervised Object Detection (WSOD) is a task that detects objects in an image using a model trained only on image-level annotations. Current state-of-the-art models benefit from self-supervised instance-level supervision, but since weak supervision does not include count or location information, the most common ``argmax’’ labeling method often ignores many instances of objects. To alleviate this issue, we propose a novel multiple instance labeling method called object discovery. We further introduce a new contrastive loss under weak supervision where no instance-level information is available for sampling, called weakly supervised contrastive loss (WSCL). WSCL aims to construct a credible similarity threshold for object discovery by leveraging consistent features for embedding vectors in the same class. As a result, we achieve new state-of-the-art results on MS-COCO 2014 and 2017 as well as PASCAL VOC 2012, and competitive results on PASCAL VOC 2007.
PDF Accepted at ECCV 2022. For project page, see https://jinhseo.github.io/research/wsod.html For code, see https://github.com/jinhseo/OD-WSCL
点此查看论文截图
Hierarchical Attention Network for Few-Shot Object Detection via Meta-Contrastive Learning
Authors:Dongwoo Park, Jong-Min Lee
Few-shot object detection (FSOD) aims to classify and detect few images of novel categories. Existing meta-learning methods insufficiently exploit features between support and query images owing to structural limitations. We propose a hierarchical attention network with sequentially large receptive fields to fully exploit the query and support images. In addition, meta-learning does not distinguish the categories well because it determines whether the support and query images match. In other words, metric-based learning for classification is ineffective because it does not work directly. Thus, we propose a contrastive learning method called meta-contrastive learning, which directly helps achieve the purpose of the meta-learning strategy. Finally, we establish a new state-of-the-art network, by realizing significant margins. Our method brings 2.3, 1.0, 1.3, 3.4 and 2.4% AP improvements for 1-30 shots object detection on COCO dataset. Our code is available at: https://github.com/infinity7428/hANMCL
PDF
点此查看论文截图
Category-Level Pose Retrieval with Contrastive Features Learnt with Occlusion Augmentation
Authors:Georgios Kouros, Shubham Shrivastava, Cédric Picron, Sushruth Nagesh, Punarjay Chakravarty, Tinne Tuytelaars
Pose estimation is usually tackled as either a bin classification problem or as a regression problem. In both cases, the idea is to directly predict the pose of an object. This is a non-trivial task because of appearance variations of similar poses and similarities between different poses. Instead, we follow the key idea that it is easier to compare two poses than to estimate them. Render-and-compare approaches have been employed to that end, however, they tend to be unstable, computationally expensive, and slow for real-time applications. We propose doing category-level pose estimation by learning an alignment metric using a contrastive loss with a dynamic margin and a continuous pose-label space. For efficient inference, we use a simple real-time image retrieval scheme with a reference set of renderings projected to an embedding space. To achieve robustness to real-world conditions, we employ synthetic occlusions, bounding box perturbations, and appearance augmentations. Our approach achieves state-of-the-art performance on PASCAL3D and OccludedPASCAL3D, as well as high-quality results on KITTI3D.
PDF 14 pages, 6 Figures,