Publication

Temporal Segment Networks for Action Recognition in Videos

Journal contribution - Journal article

We present a general and flexible video-level framework for learning action models in videos. This method, called temporal segment network (TSN), aims to model long-range temporal structure with a new segment-based sampling and aggregation scheme. This unique design enables the TSN framework to efficiently learn action models by using the whole video. The learned models can be easily deployed for action recognition in both trimmed and untrimmed videos with simple average pooling and multi-scale temporal window integration, respectively. We also study a series of good practices for the implementation of the TSN framework given limited training samples. Our approach obtains state-of-the-art performance on five challenging action recognition benchmarks: HMDB51 (71.0 percent), UCF101 (94.9 percent), THUMOS14 (80.1 percent), ActivityNet v1.2 (89.6 percent), and Kinetics400 (75.7 percent). In addition, using the proposed RGB difference as a simple motion representation, our method can still achieve competitive accuracy on UCF101 (91.0 percent) while running at 340 FPS. Furthermore, based on the proposed TSN framework, we won the video classification track at the ActivityNet challenge 2016 among 24 teams.
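The segment-based sampling and average-pooling aggregation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the video is split into equal-length segments and that one frame index is drawn at random from each segment; the abstract itself does not spell out these details. The function names `sample_segment_indices` and `average_consensus` are hypothetical and are not taken from the authors' code.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of num_segments equal-length segments.

    Assumption: segments are contiguous, non-overlapping, and cover the video.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments        # first frame of segment k
        end = (k + 1) * num_frames // num_segments    # exclusive upper bound
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(segment_scores):
    """Average per-segment class scores into one video-level score vector."""
    num_segments = len(segment_scores)
    num_classes = len(segment_scores[0])
    return [
        sum(scores[c] for scores in segment_scores) / num_segments
        for c in range(num_classes)
    ]
```

In this sketch, each sampled index would be fed to a per-frame classifier (not shown), and `average_consensus` would combine the resulting score vectors into a single prediction for the whole video.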

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

ISSN: 0162-8828

Issue: 11

Volume: 41

Pages: 2740 - 2755

Year of publication: 2019

Institutional Repository URL: https://lirias.kuleuven.be/2242199
DOI: https://doi.org/10.1109/tpami.2018.2868668
Scopus Id: 2-s2.0-85052804139
PubMed Id: 30183621
WoS Id: 000489838200013

