Overview

Number of checkpoints: 179
Number of configs: 177
Number of papers: 33
- ALGORITHM: 28
- BACKBONE: 1
- DATASET: 3
- OTHERS: 1

For supported datasets, see the datasets overview.

Spatio Temporal Action Detection Models

Number of checkpoints: 25
Number of configs: 27
Number of papers: 5
- [ALGORITHM] AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions
- [ALGORITHM] SlowFast Networks for Video Recognition
- [ALGORITHM] The AVA-Kinetics Localized Human Actions Video Dataset
- [DATASET] AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions
- [DATASET] The AVA-Kinetics Localized Human Actions Video Dataset

Action Localization Models

Number of checkpoints: 2
Number of configs: 2
Number of papers: 3
- [ALGORITHM] BMN: Boundary-Matching Network for Temporal Action Proposal Generation
- [ALGORITHM] BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
- [DATASET] CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2017

Action Recognition Models

Number of checkpoints: 114
Number of configs: 111
Number of papers: 22
- [ALGORITHM] A Closer Look at Spatiotemporal Convolutions for Action Recognition
- [ALGORITHM] Audiovisual SlowFast Networks for Video Recognition
- [ALGORITHM] Is Space-Time Attention All You Need for Video Understanding?
- [ALGORITHM] Learning Spatiotemporal Features With 3D Convolutional Networks
- [ALGORITHM] MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
- [ALGORITHM] Non-Local Neural Networks
- [ALGORITHM] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
- [ALGORITHM] SlowFast Networks for Video Recognition
- [ALGORITHM] TAM: Temporal Adaptive Module for Video Recognition
- [ALGORITHM] Temporal Interlacing Network
- [ALGORITHM] Temporal Pyramid Network for Action Recognition
- [ALGORITHM] Temporal Relational Reasoning in Videos
- [ALGORITHM] Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
- [ALGORITHM] TSM: Temporal Shift Module for Efficient Video Understanding
- [ALGORITHM] UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning
- [ALGORITHM] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs With Video UniFormer
- [ALGORITHM] Video Classification With Channel-Separated Convolutional Networks
- [ALGORITHM] Video Swin Transformer
- [ALGORITHM] VideoMAE: Masked Autoencoders Are Data-Efficient Learners for Self-Supervised Video Pre-Training
- [ALGORITHM] X3D: Expanding Architectures for Efficient Video Recognition
- [BACKBONE] Non-Local Neural Networks
- [OTHERS] Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition

Skeleton-based Action Recognition Models

Number of checkpoints: 38
Number of configs: 37
Number of papers: 4
- [ALGORITHM] PYSKL: Towards Good Practices for Skeleton Action Recognition
- [ALGORITHM] Revisiting Skeleton-Based Action Recognition
- [ALGORITHM] Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
- [ALGORITHM] Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition
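
The per-task counts above can be cross-checked against the totals in the Overview. The sketch below is only an illustration: the dictionary `task_counts` and the helper `column_totals` are hypothetical names, and the numbers are copied by hand from the four sections on this page. Checkpoints and configs sum to the overall totals (179 and 177). The per-task paper counts sum to 34 rather than 33, which is presumably because "SlowFast Networks for Video Recognition" is listed as an ALGORITHM under both spatio-temporal action detection and action recognition but counted once overall.

```python
# Per-task counts copied from the sections above: (checkpoints, configs, papers).
# The names below are illustrative and not part of any library.
task_counts = {
    "Spatio Temporal Action Detection": (25, 27, 5),
    "Action Localization": (2, 2, 3),
    "Action Recognition": (114, 111, 22),
    "Skeleton-based Action Recognition": (38, 37, 4),
}


def column_totals(counts):
    """Sum checkpoints, configs, and papers across all tasks."""
    checkpoints = sum(row[0] for row in counts.values())
    configs = sum(row[1] for row in counts.values())
    papers = sum(row[2] for row in counts.values())
    return checkpoints, configs, papers


checkpoints, configs, papers = column_totals(task_counts)
print(checkpoints, configs, papers)  # 179 177 34
```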
