Abstract
We introduce an efficient method for learning action categories without the need for feature estimation. The approach starts from low-level values, in a style similar to successful CNN methods. However, rather than extracting general image features, we learn to predict specific video representations from raw video data. The benefit of this approach is that, at the same computational cost, it can predict 2D video representations as well as motion-based 3D ones. The proposed model relies on discriminative WaldBoost, which we extend to a multiclass formulation for the purpose of learning video representations. The suitability of the proposed approach and its time efficiency are evaluated on the UCF11 action recognition dataset.
Original language | English |
---|---|
Title of host publication | 2016 IEEE International Conference on Image Processing (ICIP) |
Subtitle of host publication | Proceedings |
Place of Publication | Piscataway |
Publisher | IEEE |
Pages | 196-200 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-4673-9961-6 |
ISBN (Print) | 978-1-4673-9962-3 |
Publication status | Published - 2016 |
Event | 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, United States. Duration: 25 Sept 2016 → 28 Sept 2016 |
Conference
Conference | 2016 IEEE International Conference on Image Processing (ICIP) |
---|---|
Country/Territory | United States |
City | Phoenix |
Period | 25/09/16 → 28/09/16 |
Keywords
- Multiclass Waldboost
- video representations
- action recognition
- feature learning