
Beyond Action Recognition: A Survey on Advanced Video Understanding

· 2 min read
SeulGi Hong

The goal of this post is to investigate action detection datasets such as AVA Action and AVA-Kinetics, as well as the Homage dataset.

Contents

  1. Explore ActivityNet Challenges
  • I did some initial research to grasp current trends, examining various datasets and tasks related to video understanding.
  • Findings: I decided to look into AVA-Kinetics and Homage.

As an aside, since SlowFast won the ActivityNet Challenge in 2019, Chinese groups have consistently taken 1st place in 2020, 2021, and 2022.

  • This year's challenge (2022) focuses on active speaker detection, which differs from the other tasks because it uses audio.
  • 1st place: Project Page, Code to be released.
  • 2nd place by Intel: Paper
  2. About AVA-Kinetics
  • Spatio-Temporal Action Localization Task
  • Research on state-of-the-art (SOTA): paperswithcode, List of AVA Challenge Winners
  • "RM" outperformed ACAR-Net (CVPR 2021 paper), the winning model of AVA Action (ActivityNet Challenge 2020).
    • RM: This model won the AVA-Kinetics 2021 Challenge (paper), but no GitHub code is available.
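
To make the spatio-temporal action localization task above more concrete, here is a minimal sketch of how its labels can be organized: each person box at a keyframe carries one or more action labels. It assumes an AVA-style CSV layout (video id, keyframe timestamp, normalized box corners, action id, person id). The rows and the `group_annotations` helper are hypothetical and only for illustration, not taken from the actual dataset or any official tooling.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in an assumed AVA-style column order:
# video_id, keyframe timestamp (s), x1, y1, x2, y2 (normalized), action_id, person_id
SAMPLE_CSV = """\
vid_a,0902,0.10,0.20,0.50,0.90,12,0
vid_a,0902,0.10,0.20,0.50,0.90,80,0
vid_a,0902,0.55,0.15,0.95,0.88,12,1
vid_a,0903,0.12,0.21,0.52,0.91,12,0
"""


def group_annotations(csv_text):
    """Collect the set of action ids for each (video, timestamp, person, box)."""
    grouped = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        video_id, timestamp = row[0], int(row[1])
        box = tuple(float(v) for v in row[2:6])
        action_id, person_id = int(row[6]), int(row[7])
        grouped[(video_id, timestamp, person_id, box)].add(action_id)
    return dict(grouped)


annotations = group_annotations(SAMPLE_CSV)
for key, actions in sorted(annotations.items()):
    # One line per person box at a keyframe, with all of its action labels.
    print(key, sorted(actions))
```

The point of the grouping is that one person at one keyframe can have several simultaneous actions, so a model for this task has to predict both where each person is and a multi-label set of actions for that box.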


  3. About Homage

About Model

  • Searching for SOTA on paperswithcode seems pointless, as no notable models are reported there.
  • The official GitHub page provides detailed information about the dataset: github
  • Let's check the challenge results. 2021 Results
  • The 2022 results seem unavailable as the challenge ended in June. Challenge Page on CodaLab

About Data

My notes...

  • my blog
  • During the survey, I came across this: MIGS (BMVC 2021) GitHub.
    • Although it doesn't use the Homage dataset, it is listed on paperswithcode.