Overview
Number of checkpoints: 179
Number of configs: 177
Number of papers: 33
ALGORITHM: 28
BACKBONE: 1
DATASET: 3
OTHERS: 1
For supported datasets, see datasets overview.
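The overview totals can be cross-checked against the per-task counts listed in the sections of this page. The following is a minimal sketch, with every number copied by hand from this page; the dictionary keys and the `sum_counts` helper are illustrative names, not part of any MMAction2 API.

```python
# Per-task counts copied by hand from this page (hypothetical key names).
per_task = {
    "spatio_temporal_action_detection": {"checkpoints": 25, "configs": 27, "papers": 5},
    "action_localization": {"checkpoints": 2, "configs": 2, "papers": 3},
    "action_recognition": {"checkpoints": 114, "configs": 111, "papers": 22},
    "skeleton_based_action_recognition": {"checkpoints": 38, "configs": 37, "papers": 4},
}

# Totals as stated in the Overview section.
overview = {"checkpoints": 179, "configs": 177, "papers": 33}


def sum_counts(tables):
    """Add up each count (checkpoints, configs, papers) across all tasks."""
    totals = {}
    for counts in tables.values():
        for key, value in counts.items():
            totals[key] = totals.get(key, 0) + value
    return totals


totals = sum_counts(per_task)
for key in overview:
    print(f"{key}: per-task sum = {totals[key]}, overview = {overview[key]}")
```

The checkpoint and config sums match the overview exactly. The per-task paper counts add up to one more than the overview figure, which is presumably because a paper listed under more than one task (for example, SlowFast appears under both detection and recognition) is counted only once in the overview.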
Spatio Temporal Action Detection Models
Number of checkpoints: 25
Number of configs: 27
Number of papers: 5
[ALGORITHM] AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions (-> -> ->)
[ALGORITHM] SlowFast Networks for Video Recognition (->)
[ALGORITHM] The AVA-Kinetics Localized Human Actions Video Dataset (->)
[DATASET] AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions (-> -> ->)
[DATASET] The AVA-Kinetics Localized Human Actions Video Dataset (->)
Action Localization Models
Number of checkpoints: 2
Number of configs: 2
Number of papers: 3
[ALGORITHM] BMN: Boundary-Matching Network for Temporal Action Proposal Generation (->)
[ALGORITHM] BSN: Boundary Sensitive Network for Temporal Action Proposal Generation (->)
[DATASET] CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2017 (->)
Action Recognition Models
Number of checkpoints: 114
Number of configs: 111
Number of papers: 22
[ALGORITHM] A Closer Look at Spatiotemporal Convolutions for Action Recognition (->)
[ALGORITHM] Audiovisual SlowFast Networks for Video Recognition (->)
[ALGORITHM] Is Space-Time Attention All You Need for Video Understanding? (->)
[ALGORITHM] Learning Spatiotemporal Features With 3D Convolutional Networks (->)
[ALGORITHM] MViTv2: Improved Multiscale Vision Transformers for Classification and Detection (->)
[ALGORITHM] Non-Local Neural Networks (-> -> ->)
[ALGORITHM] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset (->)
[ALGORITHM] SlowFast Networks for Video Recognition (-> -> ->)
[ALGORITHM] TAM: Temporal Adaptive Module for Video Recognition (->)
[ALGORITHM] Temporal Interlacing Network (->)
[ALGORITHM] Temporal Pyramid Network for Action Recognition (->)
[ALGORITHM] Temporal Relational Reasoning in Videos (->)
[ALGORITHM] Temporal Segment Networks: Towards Good Practices for Deep Action Recognition (->)
[ALGORITHM] TSM: Temporal Shift Module for Efficient Video Understanding (->)
[ALGORITHM] UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning (->)
[ALGORITHM] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs With Video UniFormer (->)
[ALGORITHM] Video Classification With Channel-Separated Convolutional Networks (->)
[ALGORITHM] Video Swin Transformer (->)
[ALGORITHM] VideoMAE (->)
[ALGORITHM] X3D: Expanding Architectures for Efficient Video Recognition (->)
[BACKBONE] Non-Local Neural Networks (-> -> ->)
[OTHERS] Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition (->)
Skeleton-based Action Recognition Models
Number of checkpoints: 38
Number of configs: 37
Number of papers: 4
[ALGORITHM] PYSKL: Towards Good Practices for Skeleton Action Recognition (->)
[ALGORITHM] Revisiting Skeleton-Based Action Recognition (->)
[ALGORITHM] Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition (->)
[ALGORITHM] Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition (->)