Detecting restriction class correspondences in Linked Open Data

Walshe, Brian

dc.contributor.advisor	O'Sullivan, Declan
dc.contributor.author	Walshe, Brian
dc.date.accessioned	2016-11-07T16:30:20Z
dc.date.available	2016-11-07T16:30:20Z
dc.date.issued	2014
dc.identifier.citation	Brian Walshe, 'Detecting restriction class correspondences in Linked Open Data', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2014, pp 154
dc.identifier.other	THESIS 10532
dc.description.abstract	The Linked Open Data (LOD) project has made a broad range of knowledge available on the World Wide Web as open datasets, using a common format and linked in such a manner that it is a simple task to explore and integrate data from different sources. The links between the datasets are one-to-one relationships between named URIs, but these simple links are not always sufficient to describe the relationships between data in the sets. Sometimes it is more appropriate to use a complex correspondence to describe the relationship: a correspondence which involves one or more entities in a logical formulation. This thesis discusses an approach to detecting complex correspondences between ontologies associated with LOD. There are several challenges associated with this task. First, the datasets can be very large and the number of potential complex correspondences is a combinatorial function of their size. Second, many of the datasets in the LOD cloud may have large amounts of missing information or invalid entries. Our approach focuses on detecting correspondences between named classes and logically constructed restriction classes. An extensional approach (that is, one which uses instance data shared by the datasets) is employed to discover appropriate restriction classes for the correspondences. Unlike other extensional approaches, ours focuses on using methods which will perform well with only a small number of example instances, as finding matched instances is not usually a trivial task. To evaluate the approach, we first demonstrate that it can be used to detect complex correspondences between the DBpedia and YAGO2 datasets. We then show that a small sample of 15 matched instances can be as effective as using a sample in the order of 10^4 instances.
Further to this, we show that as the level of missing data increases, a selection metric which takes the open world assumption into account can consistently outperform the Information Gain metric, which is popular in the field of machine learning. Finally, we compare our approach to a leading extensional approach to detecting complex correspondences, and show that our approach can produce more accurate correspondences while using significantly less input data. The primary contribution of this thesis is a robust, scalable approach to detecting complex correspondences between classes in ontologies. LOD ontologies were the primary motivation behind this work, but the approach could be applied to any ontologies which contain individuals that can be mapped directly to one another with equivalence relationships.
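To make the idea of extensional selection concrete, the following is a minimal sketch, not the method from the thesis itself. It assumes a small set of matched instances, each described by (property, value) facts from one dataset and labelled by whether its counterpart belongs to a named class in the other dataset, and it ranks candidate value restrictions by Information Gain, the baseline metric named in the abstract. The function names, the property names, and the example facts are all hypothetical. A missing fact is treated here as a negative, which is the closed-world reading that an open-world-aware metric would avoid.

```python
from math import log2


def entropy(labels):
    """Binary entropy (in bits) of a list of True/False class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    positives = sum(1 for label in labels if label)
    h = 0.0
    for count in (positives, total - positives):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def information_gain(instances, restriction):
    """Gain from splitting instances on whether they satisfy the restriction.

    instances: list of (facts, in_class) pairs, where facts is a set of
    (property, value) tuples and in_class is a bool.
    An absent fact counts as "does not satisfy" (closed-world assumption).
    """
    all_labels = [in_class for _, in_class in instances]
    inside = [in_class for facts, in_class in instances if restriction in facts]
    outside = [in_class for facts, in_class in instances if restriction not in facts]
    n = len(instances)
    remainder = (len(inside) / n) * entropy(inside) + (len(outside) / n) * entropy(outside)
    return entropy(all_labels) - remainder


def best_restriction(instances):
    """Return the (restriction, gain) pair with the highest information gain."""
    candidates = sorted({fact for facts, _ in instances for fact in facts})
    scored = [(c, information_gain(instances, c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical matched instances: facts from one dataset, membership of a
# named class (say, "Astronaut") in the other.
matched = [
    ({("occupation", "Astronaut"), ("nationality", "US")}, True),
    ({("occupation", "Astronaut"), ("nationality", "RU")}, True),
    ({("occupation", "Painter"), ("nationality", "US")}, False),
    ({("occupation", "Writer"), ("nationality", "RU")}, False),
]

restriction, gain = best_restriction(matched)
```

In this toy sample the restriction on `occupation` separates members from non-members perfectly, while the `nationality` restrictions carry no information. With real LOD data, many instances would simply lack the relevant fact, which is where a closed-world metric of this kind can be misled.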
dc.format	1 volume
dc.language.iso	en
dc.publisher	Trinity College (Dublin, Ireland). School of Computer Science & Statistics
dc.relation.isversionof	http://stella.catalogue.tcd.ie/iii/encore/record/C__Rb16100712
dc.subject	Computer Science, Ph.D.
dc.subject	Ph.D. Trinity College Dublin
dc.title	Detecting restriction class correspondences in Linked Open Data
dc.type	thesis
dc.type.supercollection	refereed_publications
dc.type.supercollection	thesis_dissertations
dc.type.qualificationlevel	Doctoral
dc.type.qualificationname	Doctor of Philosophy (Ph.D.)
dc.rights.ecaccessrights	openAccess
dc.format.extentpagination	pp 154
dc.identifier.uri	http://hdl.handle.net/2262/77672