dc.contributor.advisor | Lacey, Gerard | |
dc.contributor.author | Bruton, Seán | |
dc.date.accessioned | 2021-03-01T15:36:10Z | |
dc.date.available | 2021-03-01T15:36:10Z | |
dc.date.issued | 2021 | en |
dc.date.submitted | 2021 | |
dc.identifier.citation | Bruton, Seán, Recognising the fine-grained actions of a goal-directed activity from multi-modal images, Trinity College Dublin. School of Computer Science & Statistics, 2021 | en |
dc.identifier.other | Y | en |
dc.description | APPROVED | en |
dc.description.abstract | The ability to understand and respond to human activities can form the basis of many pervasive computing applications. Recognising the constituent actions of an activity can lead to a more detailed understanding of the activity and provide opportunities to develop applications for monitoring, training and assistance. We address the specific problem of recognising the fine-grained actions of a fixed-setting goal-directed activity from RGB-D videos.
We design a novel convolutional neural network architecture, WeaveNet, for fine-grained action recognition from multiple image types. A spatio-temporal fusion method, Densely-Fused Action Images, is also presented for use in combination with WeaveNet. This combined architecture achieves an accuracy of 82.7% at a mid-level granularity on a benchmark dataset, an improvement of 9% over existing methods.
We contribute a system for recording the fine-grained actions involved in human-object interaction tasks, including clinical skills. The system is novel in its ability to record actions synchronously from multiple viewpoints using RGB-D cameras.
We present a dataset of clinical skill performances for the skill of venepuncture, comprising 60 performances across 20 subjects and totalling over 15 hours of footage. The multi-modal, multi-camera characteristics of this dataset make it amenable to many fine-grained action recognition techniques.
Together, the fine-grained action recognition technique, the system for recording human-object interactions, and the dataset of clinical skill performances make a significant contribution towards the development of next-generation pervasive computing applications. | en |
dc.language.iso | en | en |
dc.publisher | Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science | en |
dc.rights | Y | en |
dc.subject | Action recognition | en |
dc.subject | Convolutional neural networks | en |
dc.subject | Multi-modal images | en |
dc.title | Recognising the fine-grained actions of a goal-directed activity from multi-modal images | en |
dc.type | Thesis | en |
dc.type.supercollection | thesis_dissertations | en |
dc.type.supercollection | refereed_publications | en |
dc.type.qualificationlevel | Doctoral | en |
dc.identifier.peoplefinderurl | https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:BRUTONS | en |
dc.identifier.rssinternalid | 224598 | en |
dc.rights.ecaccessrights | openAccess | |
dc.contributor.sponsor | Irish Research Council (IRC) | en |
dc.identifier.uri | http://hdl.handle.net/2262/95437 | |