Publication

Clip-level feature aggregation: a key factor for video-based person re-identification

Book Contribution - Book Chapter Conference Contribution

In video-based person re-identification, features of persons in the query and gallery sets are compared to find the best match. Most existing methods aggregate frame-level features with a temporal method to generate clip-level features rather than sequence-level representations. In this paper, we propose a new method that aggregates the clip-level features to obtain sequence-level representations of persons. The method consists of two parts: Average Aggregation Strategy (AAS) and Raw Feature Utilization (RFU). AAS makes use of all frames in a video sequence to generate a better representation of a person, while RFU investigates how the batch normalization operation influences feature representations in person re-identification. The experimental results demonstrate that our method improves the accuracy of existing models. In particular, we achieve 87.7% rank-1 accuracy and 82.3% mAP on the MARS dataset without any post-processing procedure, which outperforms the existing state of the art.
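The idea of aggregating clip-level features into one sequence-level representation can be illustrated with a minimal sketch. The abstract does not give the details of AAS, so this assumes the aggregation is a plain average over the features of all clips in a sequence. The per-clip temporal model is also replaced here by a simple mean over frames. The function names, the clip length of 4, and the cosine similarity used for matching are all hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np


def split_into_clips(frame_features, clip_len):
    """Split a (num_frames, dim) array into consecutive clips of at most clip_len frames."""
    return [frame_features[i:i + clip_len]
            for i in range(0, len(frame_features), clip_len)]


def clip_feature(clip):
    """Stand-in for a temporal model: mean of the frame-level features in one clip."""
    return clip.mean(axis=0)


def sequence_feature(frame_features, clip_len=4):
    """Assumed averaging step: mean of the clip-level features over all clips."""
    clips = split_into_clips(frame_features, clip_len)
    clip_feats = np.stack([clip_feature(c) for c in clips])
    return clip_feats.mean(axis=0)


def cosine_similarity(a, b):
    """Similarity between a query and a gallery sequence-level feature."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Under these assumptions, every frame of a sequence contributes to the final descriptor, so a query sequence and a gallery sequence of different lengths can still be compared through one vector each.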
Book: Advanced concepts for intelligent vision systems - ACIVS 2020
Volume: 12002
Pages: 179 - 191
ISBN: 9783030406059
Publication year: 2020
Accessibility: Open