dc.contributor.advisor | Lacey, Gerard | |
dc.contributor.author | Bruton, Seán | |
dc.date.accessioned | 2021-03-01T15:36:10Z | |
dc.date.available | 2021-03-01T15:36:10Z | |
dc.date.issued | 2021 | en |
dc.date.submitted | 2021 | |
dc.identifier.citation | Bruton, Seán, Recognising the fine-grained actions of a goal-directed activity from multi-modal images, Trinity College Dublin. School of Computer Science & Statistics, 2021 | en |
dc.identifier.other | Y | en |
dc.description | APPROVED | en |
dc.description.abstract | The ability to understand and respond to human activities can form the basis of many pervasive computing applications. Recognising the constituent actions of an activity can lead to a more detailed understanding of the activity and provide opportunities to develop applications for monitoring, training and assistance. We address the specific problem of recognising the fine-grained actions of a fixed-setting goal-directed activity from RGB-D videos.
We design a novel convolutional neural network architecture, WeaveNet, for fine-grained action recognition from multiple image types. A spatio-temporal fusion method, Densely-Fused Action Images, is also presented for use in combination with WeaveNet. This combined architecture achieves an accuracy of 82.7% at a mid-level granularity on a benchmark dataset, an improvement of 9% over existing methods.
We contribute a system for recording the fine-grained actions involved in human-object interaction tasks, including clinical skills. The system is novel in its ability to record actions synchronously from multiple viewpoints using RGB-D cameras.
We present a dataset of clinical skill performances for the skill of venepuncture, comprising 60 performances across 20 subjects and totalling over 15 hours of footage. The multi-modal, multi-camera characteristics of this dataset make it amenable to many fine-grained action recognition techniques.
Together, the fine-grained action recognition technique, the system for recording human-object interactions, and the dataset of clinical skill performances make a significant contribution towards the development of next-generation pervasive computing applications. | en |
dc.language.iso | en | en |
dc.publisher | Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science | en |
dc.rights | Y | en |
dc.subject | Action recognition | en |
dc.subject | Convolutional neural networks | en |
dc.subject | Multi-modal images | en |
dc.title | Recognising the fine-grained actions of a goal-directed activity from multi-modal images | en |
dc.type | Thesis | en |
dc.type.supercollection | thesis_dissertations | en |
dc.type.supercollection | refereed_publications | en |
dc.type.qualificationlevel | Doctoral | en |
dc.identifier.peoplefinderurl | https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:BRUTONS | en |
dc.identifier.rssinternalid | 224598 | en |
dc.rights.ecaccessrights | openAccess | |
dc.contributor.sponsor | Irish Research Council (IRC) | en |
dc.identifier.uri | http://hdl.handle.net/2262/95437 | |