2022-12-01 Update
Video Swin Transformers for Egocentric Video Understanding @ Ego4D Challenges 2022
Authors: Maria Escobar, Laura Daza, Cristina González, Jordi Pont-Tuset, Pablo Arbeláez
We implemented the Video Swin Transformer as a base architecture for the Point-of-No-Return (PNR) temporal localization and Object State Change Classification (OSCC) tasks. Our method achieved competitive performance on both challenges.
PDF
Click here to view paper screenshots
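As a rough illustration of the setup described above, here is a minimal PyTorch sketch of a shared video backbone with one head per task: a clip-level binary head for OSCC and a per-frame head whose argmax localizes the PNR frame. The backbone argument, feature dimension, and head design are assumptions for illustration only, not the authors' implementation.

```python
import torch.nn as nn

class EgoTaskHeads(nn.Module):
    """Sketch: two task heads on a shared video backbone.

    `backbone` is assumed to map a clip tensor of shape (B, C, T, H, W)
    to per-frame features of shape (B, T, D), e.g. a Video Swin
    Transformer with its classification head removed (an assumption).
    """

    def __init__(self, backbone, feat_dim=768):
        super().__init__()
        self.backbone = backbone
        # OSCC: one binary logit per clip (did a state change occur?)
        self.oscc_head = nn.Linear(feat_dim, 1)
        # PNR: one logit per frame; the argmax over time localizes
        # the point-of-no-return frame within the clip
        self.pnr_head = nn.Linear(feat_dim, 1)

    def forward(self, clip):
        feats = self.backbone(clip)                     # (B, T, D)
        oscc_logit = self.oscc_head(feats.mean(dim=1))  # (B, 1), pooled over time
        pnr_logits = self.pnr_head(feats).squeeze(-1)   # (B, T)
        return oscc_logit, pnr_logits

# Usage (hypothetical): train OSCC with BCE-with-logits on the clip label
# and PNR with cross-entropy over the T frame logits; at inference,
# pnr_frame = pnr_logits.argmax(dim=1).
```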
Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding
Authors: Stephen Su, Samuel Kwong, Qingyu Zhao, De-An Huang, Juan Carlos Niebles, Ehsan Adeli
There has been increasing interest in multi-task learning for video understanding in recent years. In this work, we propose a generalized notion of multi-task learning that incorporates both auxiliary tasks, which the model should perform well on, and adversarial tasks, which the model should not perform well on. We employ Necessary Condition Analysis (NCA) as a data-driven approach for deciding which category each task should fall into. Our proposed framework, Adversarial Multi-Task Neural Networks (AMT), penalizes adversarial tasks, determined by NCA to be scene recognition in the Holistic Video Understanding (HVU) dataset, to improve action recognition. This upends the common assumption in multi-task learning that the model should always be encouraged to do well on all tasks. At the same time, AMT retains all the benefits of multi-task learning as a generalization of existing methods, using object recognition as an auxiliary task to aid action recognition. We introduce two challenging Scene-Invariant test splits of HVU, in which the model is evaluated on action-scene co-occurrences not encountered during training. We show that our approach improves accuracy by ~3% and encourages the model to attend to action features rather than scene features that are merely correlated with the action.
PDF
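To make the adversarial-task idea concrete, here is a minimal sketch of a generalized multi-task objective in the spirit of AMT: the main (action) and auxiliary (object) losses are added, while the adversarial (scene) loss is subtracted so the shared representation is discouraged from encoding scene. The function name, loss weights, and the negated-loss formulation are illustrative assumptions, not the paper's exact objective (which could, e.g., rely on gradient reversal instead).

```python
import torch.nn.functional as F

def amt_style_loss(action_logits, action_y,
                   object_logits, object_y,
                   scene_logits, scene_y,
                   aux_w=0.5, adv_w=0.1):
    """Sketch of a generalized multi-task objective (assumed form).

    The main task (action) and auxiliary task (object) are rewarded;
    the adversarial task (scene) is penalized via a negative weight,
    pushing shared features away from scene-predictive directions.
    """
    l_action = F.cross_entropy(action_logits, action_y)  # main task
    l_object = F.cross_entropy(object_logits, object_y)  # auxiliary task
    l_scene = F.cross_entropy(scene_logits, scene_y)     # adversarial task
    return l_action + aux_w * l_object - adv_w * l_scene
```

Note that in a scheme like this the adversarial weight must stay small (a subtracted cross-entropy is unbounded below); in practice, adversarial setups often instead train the scene head normally and reverse the gradient only into the shared backbone.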