
Beyond Action Recognition: A Survey on Advanced Video Understanding

August 31, 2022 · 2 min read

Vision AI Engineer

The goal of this post is to investigate action detection datasets such as AVA Action and AVA-Kinetics, as well as the Homage dataset.

Contents

Explore ActivityNet Challenges

I did some initial research to grasp current trends by examining various datasets and tasks related to video understanding.
Findings: I decided to look into AVA-Kinetics and Homage.

Incidentally, after SlowFast won the ActivityNet Challenge in 2019, Chinese groups have consistently taken 1st place in 2020, 2021, and 2022.

This year's challenge (2022) focuses on active speaker detection, which differs from the other tasks because it also uses audio.
1st place: Project Page, Code to be released.
2nd place by Intel: Paper

About AVA-Kinetics

Spatio-Temporal Action Localization Task
State-of-the-art (SOTA) references: paperswithcode, List of AVA Challenge Winners
"RM" outperformed ACAR-Net (CVPR 2021 paper), the winning model of AVA Action (ActivityNet Challenge 2020).
- RM: This model won the AVA-Kinetics 2021 Challenge. A paper is available, but no GitHub code has been released.
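
For readers new to the task: spatio-temporal action localization asks for a person box plus an action label at a given keyframe, and predictions are usually scored by how well the predicted box overlaps a ground-truth box. Below is a minimal sketch of that matching step. The `ActionBox` type, its field names, and the helper functions are hypothetical (they are not taken from the AVA tooling), and the 0.5 IoU threshold is assumed here as the value commonly used for AVA-style frame-level mAP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionBox:
    """One person box with one action label at one keyframe (hypothetical type)."""
    video_id: str
    timestamp: float  # keyframe time in seconds
    x1: float         # box corners, assumed normalized to [0, 1]
    y1: float
    x2: float
    y2: float
    action: str


def iou(a: ActionBox, b: ActionBox) -> float:
    """Intersection over union of two boxes given by their corners."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: ActionBox, gt: ActionBox, threshold: float = 0.5) -> bool:
    """True when keyframe and label agree and the boxes overlap enough.

    The default threshold of 0.5 is an assumption, not read from any config.
    """
    return (
        pred.video_id == gt.video_id
        and pred.timestamp == gt.timestamp
        and pred.action == gt.action
        and iou(pred, gt) >= threshold
    )
```

A full evaluation would additionally rank predictions by confidence and compute per-class average precision; this sketch only covers the box-matching criterion.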

About Homage

About Model

Searching for SOTA on paperswithcode is not useful here, as no notable models are reported for this dataset.
The official GitHub page provides detailed information about the dataset: github
Let's check the challenge results instead: 2021 Results
The 2022 results seem to be unavailable, as the challenge ended in June. Challenge Page on CodaLab

About Data

Dataset Paper link

My notes...

my blog
During the survey, I came across this: MIGS (BMVC 2021) GitHub.
- Although it doesn't use the Homage dataset, it is listed on paperswithcode.

My notes...