Show simple item record

dc.contributor.author	Graham, Yvette	en
dc.date.accessioned	2021-03-29T09:23:04Z
dc.date.available	2021-03-29T09:23:04Z
dc.date.created	16/11/20	en
dc.date.issued	2020	en
dc.date.submitted	2020	en
dc.identifier.citation	Yvette Graham, Christian Federmann, Maria Eskevich, Barry Haddow, Assessing Human-Parity in Machine Translation on the Segment Level, Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP - Findings), Virtual, 16/11/20, Association for Computational Linguistics, 2020, 4199 - 4207	en
dc.identifier.other	Y	en
dc.description	PUBLISHED	en
dc.description	Virtual	en
dc.description.abstract	Recent machine translation shared tasks have shown top-performing systems to tie or in some cases even outperform human translation. Such conclusions about system and human performance are, however, based on estimates aggregated from scores collected over large test sets of translations and unfortunately leave some remaining questions unanswered. For instance, simply because a system significantly outperforms the human translator on average may not necessarily mean that it has done so for every translation in the test set. Firstly, are there remaining source segments present in evaluation test sets that cause significant challenges for top-performing systems and can such challenging segments go unnoticed due to the opacity of current human evaluation procedures? To provide insight into these issues we carefully inspect the outputs of top-performing systems in the most recent WMT-19 news translation shared task for all language pairs in which a system either tied or outperformed human translation. Our analysis provides a new method of identifying the remaining segments for which either machine or human perform poorly. For example, in our close inspection of WMT-19 English to German and German to English we discover the segments that disjointly proved a challenge for human and machine. For English to Russian, there were no segments included in our sample of translations that caused a significant challenge for the human translator, while we again identify the set of segments that caused issues for the top-performing system.	en
dc.format.extent	4199	en
dc.format.extent	4207	en
dc.language.iso	en	en
dc.publisher	Association for Computational Linguistics	en
dc.rights	Y	en
dc.title	Assessing Human-Parity in Machine Translation on the Segment Level	en
dc.title.alternative	Findings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP - Findings)	en
dc.type	Conference Paper	en
dc.type.supercollection	scholarly_publications	en
dc.type.supercollection	refereed_publications	en
dc.identifier.peoplefinderurl	http://people.tcd.ie/ygraham	en
dc.identifier.rssinternalid	226534	en
dc.identifier.doi	http://dx.doi.org/10.18653/v1/2020.findings-emnlp.375	en
dc.rights.ecaccessrights	openAccess
dc.subject.TCDTheme	International Integration	en
dc.subject.TCDTag	ARTIFICIAL INTELLIGENCE	en
dc.subject.TCDTag	Machine Translation	en
dc.subject.TCDTag	Natural Language Processing	en
dc.identifier.rssuri	https://www.aclweb.org/anthology/2020.findings-emnlp.375.pdf	en
dc.identifier.orcid_id	0000-0001-6741-4855	en
dc.subject.darat_thematic	Communication	en
dc.status.accessible	N	en
dc.contributor.sponsor	SFI stipend	en
dc.contributor.sponsorGrantNumber	13/RC/2106	en
dc.identifier.uri	http://hdl.handle.net/2262/95918

