Video Understanding

Published: 2022-05-03

Updated 2022-05-03

Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities

Authors: Fadime Sener, Dibyadip Chatterjee, Daniel Shelepov, Kun He, Dipika Singhania, Robert Wang, Angela Yao

Assembly101 is a new procedural activity dataset featuring 4321 videos of people assembling and disassembling 101 “take-apart” toy vehicles. Participants work without fixed instructions, and the sequences feature rich and natural variations in action ordering, mistakes, and corrections. Assembly101 is the first multi-view action dataset, with simultaneous static (8) and egocentric (4) recordings. Sequences are annotated with more than 100K coarse and 1M fine-grained action segments, and 18M 3D hand poses. We benchmark on three action understanding tasks: recognition, anticipation and temporal segmentation. Additionally, we propose a novel task of detecting mistakes. The unique recording format and rich set of annotations allow us to investigate generalization to new toys, cross-view transfer, long-tailed distributions, and pose vs. appearance. We envision that Assembly101 will serve as a new challenge to investigate various activity understanding problems.
CVPR 2022, https://assembly-101.github.io/


木子已

https://ipaper.today/2022/05/03/2022-05-03-shi-pin-li-jie/

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 木子已 as the source when reposting.

