dc.contributor.author: Maldonado Guerra, Alfredo
dc.contributor.author: Moreau, Erwan
dc.contributor.author: Vogel, Carl
dc.contributor.author: Alsulaimani, Ashjan
dc.contributor.author: Han, Lifeng
dc.contributor.author: Chowdhury, Koel Dutta
dc.contributor.editor: S. Markantonatou, C. Ramisch, A. Savary, V. Vincze [en]
dc.coverage.temporal: 978-3-96110-123-8 [en]
dc.date.accessioned: 2019-12-19T15:13:17Z
dc.date.available: 2019-12-19T15:13:17Z
dc.date.issued: 2018
dc.date.submitted: 2018 [en]
dc.identifier.citation: Moreau, E., Alsulaimani, A., Maldonado, A.G., Han, L., Vogel, C. & Chowdhury, K.D., Semantic reranking of CRF label sequences for verbal multiword expression identification, S. Markantonatou, C. Ramisch, A. Savary, V. Vincze, Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, Language Science Press, 2018, 177 - 207 [en]
dc.identifier.issn: 978-3-96110-124-5
dc.identifier.other: Y
dc.description: PUBLISHED [en]
dc.description.abstract: Verbal multiword expression (VMWE) identification can be addressed successfully as a sequence labelling problem via conditional random fields (CRFs) by returning the one label sequence with maximal probability. This work describes a system that reranks the top 10 most likely CRF candidate VMWE sequences using a decision tree regression model. The reranker aims to operationalise the intuition that a non-compositional MWE can have a different distributional behaviour than that of its constituent words. This is why it uses semantic features based on comparing the context vector of a candidate expression against those of its constituent words. However, not all VMWEs are non-compositional, and analysis shows that non-semantic features also play an important role in the behaviour of the reranker. In fact, the analysis shows that the combination of the sequential approach of the CRF component with the context-based approach of the reranker is the main factor of improvement: our reranker achieves a 12% macro-average F1-score improvement over the basic CRF method, as measured using data from the PARSEME shared task on VMWE identification. [en]
dc.format.extent: 177 [en]
dc.format.extent: 207 [en]
dc.language.iso: en [en]
dc.publisher: Language Science Press [en]
dc.relation.ispartof: IsPartOf [en]
dc.relation.ispartof: IsPartOf [en]
dc.relation.uri: http://langsci-press.org/catalog/book/204 [en]
dc.rights: Y [en]
dc.subject: Verbal multiword Expressions [en]
dc.subject: Conditional random fields [en]
dc.subject: Natural language processing [en]
dc.title: Semantic reranking of CRF label sequences for verbal multiword expression identification [en]
dc.title.alternative: Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop [en]
dc.type: Book Chapter [en]
dc.type.supercollection: scholarly_publications [en]
dc.type.supercollection: refereed_publications [en]
dc.identifier.peoplefinderurl: http://people.tcd.ie/maldona
dc.identifier.peoplefinderurl: http://people.tcd.ie/vogel
dc.identifier.peoplefinderurl: http://people.tcd.ie/moreaue
dc.identifier.rssinternalid: 193091
dc.identifier.doi: http://dx.doi.org/10.5281/zenodo.1469559
dc.rights.ecaccessrights: openAccess
dc.relation.doi: 10.5281/zenodo.1469527 [en]
dc.subject.TCDTheme: Digital Engagement [en]
dc.subject.TCDTheme: Digital Humanities [en]
dc.subject.TCDTag: ARTIFICIAL INTELLIGENCE [en]
dc.subject.TCDTag: Computational Linguistics [en]
dc.subject.TCDTag: DATA ANALYSIS [en]
dc.subject.TCDTag: MACHINE LEARNING [en]
dc.subject.TCDTag: Natural Language Processing [en]
dc.subject.TCDTag: multi-word expressions [en]
dc.subject.TCDTag: text analytics [en]
dc.identifier.rssuri: http://langsci-press.org/catalog/view/204/1647/1302-1
dc.identifier.orcid_id: 0000-0001-8426-5249
dc.status.accessible: N [en]
dc.contributor.sponsor: Science Foundation Ireland (SFI) [en]
dc.contributor.sponsorGrantNumber: 13/RC/2106 [en]
dc.identifier.uri: http://hdl.handle.net/2262/91208
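
Illustrative note on the abstract: the reranking idea it describes (score each of the top CRF candidates using features that compare the context vector of a candidate expression with those of its constituent words, then keep the best-scoring candidate) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The sparse-dictionary vector format, the min/mean/max feature summary, the candidate dictionary keys (`crf_prob`, `expr_vec`, `word_vecs`) and the `score` callable are all hypothetical. In particular, `score` stands in for the trained decision tree regression model mentioned in the abstract, which is not reproduced here.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence

# Assumption: a context vector is a sparse mapping from context word to weight.
Vector = Dict[str, float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either has zero norm)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_features(expr_vec: Vector, word_vecs: Sequence[Vector]) -> List[float]:
    """Min, mean and max similarity between an expression and its constituent words."""
    sims = [cosine(expr_vec, w) for w in word_vecs] or [0.0]
    return [min(sims), sum(sims) / len(sims), max(sims)]


def rerank(candidates: Sequence[dict],
           score: Callable[[List[float]], float]) -> dict:
    """Return the candidate whose feature list receives the highest score.

    Features are [crf_prob, min_sim, mean_sim, max_sim]; `score` is a
    hypothetical stand-in for a trained regression model.
    """
    def features(c: dict) -> List[float]:
        return [c["crf_prob"]] + semantic_features(c["expr_vec"], c["word_vecs"])

    return max(candidates, key=lambda c: score(features(c)))
```

A toy scorer such as `lambda f: f[0] + (1.0 - f[2])` would favour candidates that the CRF finds likely and whose expression vector differs from its constituent words, which mirrors the non-compositionality intuition stated in the abstract. A real system would learn this function from labelled data.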

