Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems

BAYOMI, MOSTAFA

dc.contributor.advisor	Lawless, Seamus	en
dc.contributor.author	BAYOMI, MOSTAFA	en
dc.date.accessioned	2019-03-15T16:25:50Z
dc.date.available	2019-03-15T16:25:50Z
dc.date.issued	2019	en
dc.date.submitted	2019	en
dc.identifier.citation	BAYOMI, MOSTAFA MOHAMED, Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems, Trinity College Dublin. School of Computer Science & Statistics, 2019	en
dc.identifier.other	Y	en
dc.description	APPROVED	en
dc.description.abstract	The volume of digital content resources written as text documents is growing every day, at an unprecedented rate. Because this content is generally not structured as easy-to-handle units, it can be very difficult for users to find the information they are interested in or that they need to accomplish their tasks. This in turn has increased the need for producing tailored content that can be adapted to the needs of individual users. A key challenge for producing such tailored content lies in the ability to understand how this content is structured. Hence, the efficient analysis and understanding of unstructured text content has become increasingly important. This has led to the increasing use of Natural Language Processing (NLP) techniques to help with processing unstructured text documents. Amongst the different NLP techniques, Text Segmentation is specifically used to understand the structure of textual documents. Current approaches to text segmentation, however, typically rely on lexical and/or syntactic representations to build a structure from unstructured text documents, whereas the relationship between segments may be semantic rather than lexical or syntactic. Furthermore, text segmentation research has primarily focused on techniques that can be used to process text documents, not on how these techniques can be utilised to produce tailored content that can be adapted to the needs of individual users. In contrast, the field of Adaptive Systems has inherently focused on the challenges associated with dynamically adapting and delivering content to individual users. Adaptive systems, however, have primarily focused upon the techniques of adapting content, not on how to understand and structure this content. Even systems that have focused on structuring content rely upon the original structure of the content resource, which reflects the perspective of its author.
Therefore, these systems are limited in that they do not deeply "understand" the structure of the content, which in turn limits their capability to discover and supply appropriate content for use in defined contexts, and limits the content's amenability for reuse within various independent adaptive systems. To harness the strength of NLP techniques in overcoming the challenges of understanding unstructured text content, this thesis investigates how these techniques can be utilised to enhance the supply of content to adaptive systems. Specifically, the contribution of this thesis is concerned with addressing the challenges associated with hierarchical text segmentation techniques, and with content discoverability and reusability for adaptive systems. Firstly, this research proposes a novel hierarchical text segmentation approach, named C-HTS, that builds a structure from text documents based on the semantic representation of text. Semantic representation is a method that replaces keyword-based text representation with concept-based features, where the meaning of a piece of text is represented as a vector of knowledge concepts automatically extracted from massive human knowledge repositories such as Wikipedia. Using this approach, C-HTS represents the content of a document as a tree-like hierarchy. This way of structuring the document can be regarded as a hierarchically coherent tree that is useful for supporting a variety of search methods, as it provides different levels of granularity for the underlying content. Secondly, this research proposes a novel content-supply service named CROCC. The aim of CROCC is to utilise the structure produced by C-HTS in order to overcome the limitations of state-of-the-art content-supply approaches. Finally, this research conducts an evaluation of the extent to which the CROCC service enhances content discoverability and reusability for adaptive systems.	en
dc.publisher	Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science	en
dc.rights	Y	en
dc.subject	Natural Language Processing, Text Segmentation, Semantic Analysis, Adaptive Systems	en
dc.title	Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems	en
dc.type	Thesis	en
dc.type.supercollection	thesis_dissertations	en
dc.type.supercollection	refereed_publications	en
dc.type.qualificationlevel	Doctoral	en
dc.identifier.peoplefinderurl	https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:BAYOMIM	en
dc.identifier.rssinternalid	199686	en
dc.rights.ecaccessrights	openAccess
dc.contributor.sponsor	SFI stipend	en
dc.identifier.uri	http://hdl.handle.net/2262/86075

Files in this item

Name: Mostafa_Bayomi_Thesis_2019.pdf
Size: 3.881Mb
Format: PDF

Name: license.txt
Size: 3.499Kb
Format: Text file

This item appears in the following Collection(s)

Computer Science (Theses and Dissertations)
Trinity College Dublin Theses & Dissertations
