dc.contributor.author: Ahmad, Khurshid
dc.contributor.author: Vogel, Carl
dc.date.accessioned: 2023-10-30T12:19:10Z
dc.date.available: 2023-10-30T12:19:10Z
dc.date.issued: 2023
dc.date.submitted: 2023 [en]
dc.identifier.citation: Carl Vogel & Khurshid Ahmad, Agreement and disagreement between major emotion recognition systems, Knowledge-Based Systems, 2023, 276, 110759 [en]
dc.identifier.other: Y
dc.description: PUBLISHED [en]
dc.description.abstract: The evaluation of systems that claim to recognize emotions expressed by human beings is a contested and complex task: the early pioneers in this field gave the impression that these systems would eventually recognize a flash of anger, suppressed glee/happiness, momentary disgust or contempt, lurking fear, or sadness in someone's face or voice (Picard and Klein, 2002; Schuller et al., 2011). Emotion recognition systems are trained on 'labelled' databases: collections of video/audio recordings comprising images and voices of humans enacting one emotional state. Machine learning programmes then regress the pixel distributions or waveforms against the labels. The system is then said to have learnt how to recognize and interpret human emotions, and is rated using information science metrics. These systems are adopted by the world at large for applications ranging from autistic-spectrum communications to teaching and learning, and onwards to covert surveillance. The training databases depend upon human emotions recorded in ideal conditions: faces looking at the camera and centrally located, voices articulated through noise-cancelling microphones. Yet there are reports that the posed training data sets, which are racially skewed and gender-unbalanced, do not prepare these systems to cope with data-in-the-wild, and that expression-unrelated variations (such as illumination, head pose, and identity bias (Li and Deng, 2020)) can also impact their performance. Deployments of these systems tend to adopt one or another system, apply it to data collected outside laboratory conditions, and use the resulting classifications in subsequent processing. We have devised a testing method that helps to quantify the similarities and differences between facial emotion recognition (FER) systems and speech emotion recognition (SER) systems. We report on the development of a database comprising the videos and soundtracks of 64 politicians and 7 government spokespersons (25 F, 46 M; 34 White Europeans, 19 East Asians, and 18 South Asians), ranging in age from 32 to 85 years; each of the 71 has on average three 180 s videos, a total of 16.66 h of data. We have compared the performance of two FERs (Emotient and Affectiva) and two SERs (OpenSmile and Vokaturi) on our data by analysing the emotions reported by these systems on a frame-by-frame basis. We have analysed directly observable head movements, indirectly observable muscle movements of the face, and muscle movements in the vocal tract. There was marked disagreement in the emotions recognized, and the differences were greater for women than for men, and greater for South and East Asians than for White Europeans. Levels of agreement and disagreement on both high-level features (i.e. emotion labels) and lower-level features (e.g. Euler angles of head movement) are shown. We show that inter-system disagreement may also be used as an effective response variable in reasoning about the data features that influence disagreement. We argue that the reliability of subsequent processing in approaches that adopt these systems may be enhanced by restricting action to cases where the systems agree within a given tolerance level. This paper may be considered a foray into the greater debate about so-called algorithmic (un)fairness and data bias in the development and deployment of machine learning systems, of which FERs and SERs are a good exemplar. [en]
dc.format.extent: 110759 [en]
dc.language.iso: en [en]
dc.language.iso: zh [en]
dc.relation.ispartofseries: Knowledge-Based Systems
dc.relation.ispartofseries: 276
dc.rights: Y [en]
dc.subject: Emotion processing [en]
dc.subject: Emotion recognition systems [en]
dc.subject: Multi-modal communication data [en]
dc.title: Agreement and disagreement between major emotion recognition systems [en]
dc.type: Journal Article [en]
dc.type.supercollection: scholarly_publications [en]
dc.type.supercollection: refereed_publications [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/kahmad
dc.identifier.rssinternalid: 259738
dc.identifier.doi: https://doi.org/10.1016/j.knosys.2023.110759
dc.rights.ecaccessrights: openAccess
dc.subject.TCDTheme: Neuroscience [en]
dc.subject.TCDTag: Emotion processing; Emotion recognition systems; Multi-modal communication data; machine learning; AI [en]
dc.identifier.rssuri: https://doi.org/10.1016/j.knosys.2023.110759
dc.identifier.orcid_id: 0000-0003-0234-5355
dc.status.accessible: N [en]
dc.identifier.uri: http://hdl.handle.net/2262/104074
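
Note: the abstract above proposes restricting downstream processing to frames where emotion recognition systems agree. The following is a minimal Python sketch of that idea, assuming two hypothetical per-frame label sequences (fer_a, fer_b) standing in for outputs of systems such as Emotient and Affectiva; the function names and toy data are illustrative and not taken from the paper, which also compares richer, lower-level features under a tolerance level rather than exact label equality.

    from collections import Counter

    def cohen_kappa(labels_a, labels_b):
        # Chance-corrected agreement between two per-frame label sequences.
        n = len(labels_a)
        observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        counts_a, counts_b = Counter(labels_a), Counter(labels_b)
        expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n ** 2
        return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

    def agreed_frames(labels_a, labels_b):
        # Indices of frames where both systems report the same emotion;
        # subsequent processing would act only on these disputed-free frames.
        return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a == b]

    fer_a = ["anger", "neutral", "joy", "neutral", "fear"]     # hypothetical system 1
    fer_b = ["anger", "sadness", "joy", "neutral", "neutral"]  # hypothetical system 2
    print(cohen_kappa(fer_a, fer_b))    # ~0.47: moderate chance-corrected agreement
    print(agreed_frames(fer_a, fer_b))  # [0, 2, 3]

For continuous lower-level features such as the Euler angles of head movement, the exact-equality test in agreed_frames would be replaced by a numeric tolerance check on the difference between the two systems' estimates.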

