Show simple item record

dc.contributor.author: Maldonado Guerra, Alfredo
dc.contributor.author: QasemiZadeh, Behrang
dc.contributor.editor: S. Markantonatou, C. Ramisch, A. Savary, V. Vincze
dc.identifier.isbn: 978-3-96110-123-8
dc.date.accessioned: 2019-12-19T15:16:34Z
dc.date.available: 2019-12-19T15:16:34Z
dc.date.issued: 2018
dc.date.submitted: 2018
dc.identifier.citation: Maldonado, A. & QasemiZadeh, B., 'Analysis and Insights from the PARSEME Shared Task dataset', in S. Markantonatou, C. Ramisch, A. Savary & V. Vincze (eds), Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, Language Science Press, 2018, pp. 149–176
dc.identifier.isbn: 978-3-96110-124-5
dc.identifier.other: Y
dc.description: PUBLISHED
dc.description.abstract: The PARSEME Shared Task on the automatic identification of verbal multiword expressions (VMWEs) was the first collaborative study on the subject to cover a wide and diverse range of languages. One observation that emerged from the official results is that participating systems performed similarly on each language but differently across languages. That is, intra-language evaluation scores are relatively similar, whereas inter-language scores differ considerably. We hypothesise that this pattern cannot be attributed solely to the intrinsic linguistic properties of each language corpus, but also to more practical aspects such as the evaluation framework, the characteristics of the test and training sets, and the metrics used for measuring performance. This chapter takes a close look at the shared task dataset and the systems' output to explain this pattern. In the process, we produce evaluation results for the systems on VMWEs that appear only in the test set and contrast them with the official evaluation results, which include VMWEs that also occur in the training set. Additionally, we conduct an analysis aimed at estimating the relative difficulty of VMWE detection for each language. This analysis consists of a) assessing the impact on performance of the systems' ability, or lack thereof, to handle discontinuous and overlapping VMWEs, b) measuring the relative sparsity of sentences with at least one VMWE, and c) interpreting the performance of each system with respect to two baseline systems: a system that simply tags every verb as a VMWE, and a dictionary-lookup system. Based on our data analysis, we assess the suitability of the official evaluation methods, specifically the token-based method, and propose to use Cohen's kappa score as an additional evaluation method.
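
As a rough illustration of the metric proposed in the abstract (this is not code from the chapter), the following Python sketch computes Cohen's kappa over token-level binary VMWE tags, comparing a system's labeling against the gold standard; the token labels below are invented for illustration.

from collections import Counter

def cohens_kappa(gold, pred):
    """Cohen's kappa for two equal-length sequences of labels."""
    assert len(gold) == len(pred) and gold
    n = len(gold)
    # Observed agreement: fraction of tokens where the two labelings match.
    p_o = sum(g == p for g, p in zip(gold, pred)) / n
    # Chance agreement: computed from each labeling's marginal label distribution.
    gold_counts, pred_counts = Counter(gold), Counter(pred)
    p_e = sum(gold_counts[label] * pred_counts[label] for label in gold_counts) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Hypothetical token-level tags: 1 = token belongs to a VMWE, 0 = it does not.
gold = [0, 1, 1, 0, 0, 1, 0, 0]
pred = [0, 1, 0, 0, 0, 1, 0, 0]
print(round(cohens_kappa(gold, pred), 2))  # 0.71 on this toy example

Unlike raw token-level agreement, kappa discounts the agreement expected by chance, so a degenerate labeling (for example, one driven mostly by the majority "not a VMWE" tag) gains comparatively little credit.
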
dc.format.extent: 149
dc.format.extent: 176
dc.language.iso: en
dc.publisher: Language Science Press
dc.relation.ispartof: Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop
dc.relation.uri: http://langsci-press.org/catalog/book/204
dc.rights: Y
dc.subject: Verbal multiword expressions
dc.subject: PARSEME
dc.subject: Languages
dc.title: Analysis and Insights from the PARSEME Shared Task dataset
dc.title.alternative: Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop
dc.type: Book Chapter
dc.type.supercollection: scholarly_publications
dc.type.supercollection: refereed_publications
dc.identifier.peoplefinderurl: http://people.tcd.ie/maldona
dc.identifier.rssinternalid: 193089
dc.identifier.doi: 10.5281/zenodo.1469557
dc.rights.ecaccessrights: openAccess
dc.relation.doi: 10.5281/zenodo.1469527
dc.subject.TCDTag: Data Analysis
dc.subject.TCDTag: Natural Language Processing
dc.identifier.rssuri: http://langsci-press.org/catalog/view/204/1345/1301-1
dc.identifier.orcid_id: 0000-0001-8426-5249
dc.status.accessible: N
dc.contributor.sponsor: Science Foundation Ireland (SFI)
dc.contributor.sponsorGrantNumber: 13/RC/2106
dc.identifier.uri: http://hdl.handle.net/2262/91209

