dc.contributor.author: Ahmad, Khurshid
dc.contributor.author: Vogel, Carl
dc.date.accessioned: 2023-10-30T12:19:10Z
dc.date.available: 2023-10-30T12:19:10Z
dc.date.issued: 2023
dc.date.submitted: 2023 [en]
dc.identifier.citation: Carl Vogel & Khurshid Ahmad, Agreement and disagreement between major emotion recognition systems, Knowledge-Based Systems, 2023, 276, 110759 [en]
dc.identifier.other: Y
dc.description: PUBLISHED [en]
dc.description.abstract: The evaluation of systems that claim to recognize emotions expressed by human beings is a contested and complex task: the early pioneers in this field gave the impression that these systems would eventually recognize a flash of anger, suppressed glee/happiness, momentary disgust or contempt, lurking fear, or sadness in someone's face or voice (Picard and Klein, 2002; Schuller et al., 2011). Emotion recognition systems are trained on 'labelled' databases: collections of video/audio recordings comprising images and voices of humans enacting one emotional state. Machine learning programmes then regress the pixel distributions or waveforms against the labels. The system is then said to have learnt how to recognize and interpret human emotions, and is rated using information science metrics. These systems are adopted by the world at large for applications ranging from autistic-spectrum communications to teaching and learning, and onwards to covert surveillance. The training databases depend upon human emotions recorded in ideal conditions: faces looking at the camera and centrally located, voices articulated through noise-cancelling microphones. Yet there are reports that the posed training data sets, which are racially skewed and gender-unbalanced, do not prepare these systems to cope with data-in-the-wild, and that expression-unrelated variations (such as illumination, head pose, and identity bias (Li and Deng, 2020)) can also impact their performance. Deployments of these systems tend to adopt one or another system, apply it to data collected outside laboratory conditions, and use the resulting classifications in subsequent processing. We have devised a testing method that helps to quantify the similarities and differences between facial emotion recognition (FER) systems and speech emotion recognition (SER) systems. We report on the development of a database comprising the videos and soundtracks of 64 politicians and 7 government spokespersons (25 F, 46 M; 34 White Europeans, 19 East Asians, and 18 South Asians), ranging in age from 32 to 85 years; each of the 71 has on average three 180 s videos, a total of 16.66 h of data. We have compared the performance of two FERs (Emotient and Affectiva) and two SERs (OpenSmile and Vokaturi) on our data by analysing the emotions reported by these systems on a frame-by-frame basis. We have analysed directly observable head movements, indirectly observable muscle movements of the face, and muscle movements in the vocal tract. There was marked disagreement in the emotions recognized, and the differences were greater for women than for men, and greater for South and East Asians than for White Europeans. Levels of agreement and disagreement on both high-level features (i.e. emotion labels) and lower-level features (e.g. Euler angles of head movement) are shown. We show that inter-system disagreement may also be used as an effective response variable in reasoning about the data features that influence disagreement. We argue that the reliability of subsequent processing in approaches that adopt these systems may be enhanced by restricting action to cases where the systems agree within a given tolerance level. This paper may be considered a foray into the greater debate about so-called algorithmic (un)fairness and data bias in the development and deployment of machine learning systems, of which FERs and SERs are a good exemplar. [en]
dc.format.extent: 110759 [en]
dc.language.iso: en [en]
dc.language.iso: zh [en]
dc.relation.ispartofseries: Knowledge-Based Systems
dc.relation.ispartofseries: 276
dc.rights: Y [en]
dc.subject: Emotion processing [en]
dc.subject: Emotion recognition systems [en]
dc.subject: Multi-modal communication data [en]
dc.title: Agreement and disagreement between major emotion recognition systems [en]
dc.type: Journal Article [en]
dc.type.supercollection: scholarly_publications [en]
dc.type.supercollection: refereed_publications [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/kahmad
dc.identifier.rssinternalid: 259738
dc.identifier.doi: https://doi.org/10.1016/j.knosys.2023.110759
dc.rights.ecaccessrights: openAccess
dc.subject.TCDTheme: Neuroscience [en]
dc.subject.TCDTag: Emotion processing; Emotion recognition systems; Multi-modal communication data; machine learning; AI [en]
dc.identifier.rssuri: https://doi.org/10.1016/j.knosys.2023.110759
dc.identifier.orcid_id: 0000-0003-0234-5355
dc.status.accessible: N [en]
dc.identifier.uri: http://hdl.handle.net/2262/104074
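
Note: the abstract above proposes restricting downstream processing to frames where emotion recognition systems agree. The following is a minimal Python sketch of that idea, assuming two hypothetical per-frame label sequences (fer_a, fer_b) standing in for outputs of systems such as Emotient and Affectiva; the function names and toy data are illustrative and not taken from the paper, which also compares richer, lower-level features under a tolerance level rather than exact label equality.

    from collections import Counter

    def cohen_kappa(labels_a, labels_b):
        # Chance-corrected agreement between two per-frame label sequences.
        n = len(labels_a)
        observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        counts_a, counts_b = Counter(labels_a), Counter(labels_b)
        expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / n ** 2
        return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

    def agreed_frames(labels_a, labels_b):
        # Indices of frames where both systems report the same emotion;
        # subsequent processing would act only on these disputed-free frames.
        return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a == b]

    fer_a = ["anger", "neutral", "joy", "neutral", "fear"]     # hypothetical system 1
    fer_b = ["anger", "sadness", "joy", "neutral", "neutral"]  # hypothetical system 2
    print(cohen_kappa(fer_a, fer_b))    # ~0.47: moderate chance-corrected agreement
    print(agreed_frames(fer_a, fer_b))  # [0, 2, 3]

For continuous lower-level features such as the Euler angles of head movement, the exact-equality test in agreed_frames would be replaced by a numeric tolerance check on the difference between the two systems' estimates.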

