Recognising Actions for Instructional Training using Pose Information: A Comparative Evaluation.
Citation:
Bruton, S., Lacey, G., "Recognising Actions for Instructional Training using Pose Information: A Comparative Evaluation", 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 25-27 Feb 2019, Prague, Czech Republic
Abstract:
Humans perform many complex tasks involving the manipulation of multiple objects. Recognition of the constituent actions of these tasks can be used to drive instructional training systems. The identities and poses of the objects used during such tasks are salient for the purposes of recognition. In this work, 3D object detection and registration techniques are used to identify and track objects involved in the everyday task of preparing a cup of tea. The pose information serves as input to an action classification system that uses Long Short-Term Memory (LSTM) recurrent neural networks as part of a deep architecture. An advantage of this approach is that it can represent the complex dynamics of object and human poses at hierarchical levels without the need to design specific spatio-temporal features. By using such compact features, we demonstrate the feasibility of applying the hyperparameter optimisation technique of Tree-Parzen Estimators to identify optimal hyperparameters as well as network architectures. A recognition accuracy of 83% shows that this approach is viable for similar pervasive computing applications where prior scene knowledge exists.
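To illustrate the kind of model the abstract describes, here is a minimal numpy sketch of a single LSTM cell consuming a sequence of per-frame pose vectors and summarising it in a hidden state; all dimensions, weights, and names are illustrative assumptions, not the paper's actual architecture (a real system would stack cells and add a softmax classifier over the action labels).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: x is the pose feature vector for one frame;
    h and c are the hidden and cell states carried across frames."""
    z = W @ x + U @ h + b            # stacked gate pre-activations
    H = h.shape[0]
    i = sigmoid(z[0:H])              # input gate
    f = sigmoid(z[H:2 * H])          # forget gate
    o = sigmoid(z[2 * H:3 * H])      # output gate
    g = np.tanh(z[3 * H:4 * H])      # candidate cell update
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Illustrative dimensions: a 6-D object pose (3-D position + orientation)
# per frame, hidden size 8, a 20-frame sequence of random "poses".
rng = np.random.default_rng(0)
D, H = 6, 8
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)

h = np.zeros(H)
c = np.zeros(H)
for frame in rng.standard_normal((20, D)):
    h, c = lstm_step(frame, h, c, W, U, b)
# The final hidden state h is a fixed-size summary of the pose sequence,
# which a classification layer would map to action labels.
```

In a hyperparameter search with Tree-Parzen Estimators, quantities such as the hidden size, the number of stacked cells, and the learning rate would be the variables being optimised.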
Sponsor / Grant Number:
IRCSET (OK)
Author's Homepage:
http://people.tcd.ie/gjlacey
Author: Lacey, Gerard
Other Titles:
14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Type of material:
Conference Paper
Availability:
Full text available
Keywords:
Action Recognition, Deep Learning, Pose Estimation