2023-03-21 Update
Remote Task-oriented Grasp Area Teaching By Non-Experts through Interactive Segmentation and Few-Shot Learning
Authors: Furkan Kaynar, Sudarshan Rajagopalan, Shaobo Zhou, Eckehard Steinbach
A robot operating in unstructured environments must be able to discriminate between different grasping styles depending on the prospective manipulation task. A system that can learn from remote non-expert demonstrations can substantially extend the cognitive skills of a robot for task-oriented grasping. We propose a novel two-step framework toward this aim. The first step is grasp area estimation by segmentation: we receive grasp area demonstrations for a new task via interactive segmentation, and learn from these few demonstrations to estimate the required grasp area in an unseen scene for the given task. The second step is autonomous grasp estimation within the segmented region. To train the segmentation network for few-shot learning, we built a grasp area segmentation (GAS) dataset with 10,089 images grouped into 1,121 segmentation tasks, and we employ an efficient meta-learning algorithm to train for few-shot adaptation. Experimental evaluation shows that our method successfully detects the correct grasp area on the respective objects in unseen test scenes and effectively enables remote teaching of new grasp strategies by non-experts.
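The few-shot adaptation step described above can be illustrated with a minimal sketch: fine-tuning a per-pixel classifier head on a handful of segmentation demonstrations. This is an assumption-laden toy (a linear head trained with logistic loss stands in for the paper's segmentation network and meta-learning algorithm; all names and shapes are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapt_head(demos, w, lr=0.5, steps=20):
    """Fine-tune a linear per-pixel classifier on a few demonstrations.

    Each demo is (pixel_features, binary_grasp_mask) with shapes (N, D), (N,).
    A stand-in for few-shot adaptation of a full segmentation network.
    """
    for _ in range(steps):
        grad = np.zeros_like(w)
        for feats, mask in demos:
            pred = sigmoid(feats @ w)          # per-pixel grasp probability
            grad += feats.T @ (pred - mask) / len(mask)
        w = w - lr * grad / len(demos)         # gradient step on logistic loss
    return w

# Toy demonstration: pixels with feature[0] > 0 belong to the grasp area.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 3))
mask = (feats[:, 0] > 0).astype(float)
w = adapt_head([(feats, mask)], np.zeros(3))
acc = np.mean((sigmoid(feats @ w) > 0.5) == (mask > 0.5))
```

In the paper's setting the demonstrations come from non-experts via interactive segmentation, and the adapted model must generalize to unseen scenes rather than the demo images themselves.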
PDF Presented at the AAAI Workshop on Artificial Intelligence for User-Centric Assistance in at-Home Tasks (2023)
Click here to view paper screenshots
Identification of Novel Classes for Improving Few-Shot Object Detection
Authors: Zeyu Shangguan, Mohammad Rostami
Conventional training of deep neural networks requires a large number of annotated images, which are laborious and time-consuming to collect, particularly for rare objects. Few-shot object detection (FSOD) methods offer a remedy by realizing robust object detection using only a few training samples per class. An unexplored challenge for FSOD is that instances of unlabeled novel classes that do not belong to the fixed set of training classes appear in the background. These objects behave like label noise, degrading FSOD performance. We develop a semi-supervised algorithm to detect these unlabeled novel objects and then utilize them as positive samples during training to improve FSOD performance. Specifically, we propose a hierarchical ternary classification region proposal network (HTRPN) to localize potential unlabeled novel objects and assign them new objectness labels. Our improved hierarchical sampling strategy for the region proposal network (RPN) also boosts the detector's perception of large objects. Our experimental results indicate that our method is effective and outperforms existing state-of-the-art (SOTA) FSOD methods.
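The core idea of a ternary objectness label can be sketched as a simple labeling rule for region proposals: foreground if a proposal overlaps an annotated box, "potential novel object" if it has no matching annotation but high predicted objectness, else background. The thresholds and function names below are assumptions for illustration, not the paper's actual HTRPN implementation:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def ternary_label(proposal, gt_boxes, objectness, fg_iou=0.5, novel_score=0.7):
    """0 = background, 1 = labeled foreground, 2 = potential novel object."""
    best = max((iou(proposal, g) for g in gt_boxes), default=0.0)
    if best >= fg_iou:
        return 1        # overlaps an annotated base-class box
    if objectness >= novel_score:
        return 2        # confident object with no matching annotation
    return 0            # low overlap, low confidence: treat as background
```

Proposals labeled 2 would then be fed back as positive samples rather than being penalized as background, which is how such objects stop acting like label noise.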
PDF
Click here to view paper screenshots
Augmenting and Aligning Snippets for Few-Shot Video Domain Adaptation
Authors: Yuecong Xu, Jianfei Yang, Yunjiao Zhou, Zhenghua Chen, Min Wu, Xiaoli Li
For video models to be transferred and applied seamlessly across video tasks in varied environments, Video Unsupervised Domain Adaptation (VUDA) has been introduced to improve the robustness and transferability of video models. However, current VUDA methods rely on a vast amount of high-quality unlabeled target data, which may not be available in real-world cases. We thus consider a more realistic Few-Shot Video-based Domain Adaptation (FSVDA) scenario, where we adapt video models with only a few target video samples. While a few methods have touched upon Few-Shot Domain Adaptation (FSDA) for images and FSVDA, they rely primarily on spatial augmentation to expand the target domain, with alignment performed statistically at the instance level. However, videos carry richer temporal and semantic information, which should be fully exploited when augmenting target domains and performing alignment in FSVDA. We propose SSA2lign, a novel approach that addresses FSVDA at the snippet level: the target domain is expanded through simple snippet-level augmentation, followed by attentive alignment of snippets both semantically and statistically, where semantic alignment is conducted from multiple perspectives. Empirical results demonstrate state-of-the-art performance of SSA2lign across multiple cross-domain action recognition benchmarks.
PDF 15 pages, 9 tables, 5 figures
Click here to view paper screenshots
Decomposed Prototype Learning for Few-Shot Scene Graph Generation
Authors: Xingchen Li, Long Chen, Guikun Chen, Yinfu Feng, Yi Yang, Jun Xiao
Today’s scene graph generation (SGG) models typically require abundant manual annotations to learn new predicate types, making them difficult to apply in real-world settings with a long-tailed distribution of predicates. In this paper, we focus on a promising new SGG task: few-shot SGG (FSSGG), which requires models to quickly transfer previous knowledge and recognize novel predicates with only a few examples. Although many advanced approaches have achieved great success on few-shot learning (FSL) tasks, straightforwardly extending them to FSSGG is ineffective due to two intrinsic characteristics of predicate concepts: 1) each predicate category commonly has multiple semantic meanings under different contexts, and 2) the visual appearance of relation triplets with the same predicate differs greatly across subject-object pairs. Both issues make it hard to model predicate categories with the conventional latent representations used by state-of-the-art FSL methods. To this end, we propose a novel Decomposed Prototype Learning (DPL) approach. Specifically, we first construct a decomposable prototype space to capture the intrinsic visual patterns of subjects and objects for predicates, and enhance their feature representations with these decomposed prototypes. Then, we devise a metric learner that assigns adaptive weights to each support sample based on the relevance of its subject-object pair. We further re-split the VG dataset and compare DPL with various FSL methods to benchmark this task. Extensive results show that DPL achieves excellent performance on both base and novel categories.
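The adaptive support weighting can be sketched as follows: each support sample contributes to the predicate prototype in proportion to how relevant its subject-object pair is to the query pair. Here the "metric learner" is replaced by a plain cosine-similarity softmax, so this is an illustrative assumption rather than DPL's actual learned metric:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def weighted_prototype(support_feats, support_pairs, query_pair, tau=0.1):
    """Average support features, weighted by subject-object pair relevance.

    support_feats: (K, D) predicate features of K support samples.
    support_pairs: K embeddings of the samples' subject-object pairs.
    query_pair:    embedding of the query's subject-object pair.
    """
    sims = np.array([cosine(p, query_pair) for p in support_pairs])
    w = np.exp(sims / tau)
    w /= w.sum()                        # softmax over support samples
    return (w[:, None] * support_feats).sum(axis=0)
```

Support samples whose subject-object pairs resemble the query's dominate the prototype, which addresses the observation that the same predicate looks very different under different subject-object pairs.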
PDF