Active learning query selection with historical information

Davy, Michael

dc.contributor.advisor	Luz, Saturnino
dc.contributor.author	Davy, Michael
dc.date.accessioned	2016-11-07T14:19:55Z
dc.date.available	2016-11-07T14:19:55Z
dc.date.issued	2009
dc.identifier.citation	Michael Davy, 'Active learning query selection with historical information', [thesis], Trinity College (Dublin, Ireland). School of Computer Science & Statistics, 2009, pp 193
dc.identifier.other	THESIS 8741
dc.description.abstract	This work describes novel methods and techniques to decrease the cost of employing active learning in text categorisation problems. The cost of performing active learning is a combination of labelling effort and computational overhead. Reducing the cost of active learning allows accurate classifiers to be constructed inexpensively, increasing the number of real-world problems where machine learning solutions can be successfully applied. In this thesis we investigate strategies and techniques to reduce both computational expense and labelling effort in active learning. Critical to the success of active learning is the query selection strategy, which is responsible for identifying informative unlabelled examples. Selecting only the most informative examples reduces labelling effort, as redundant and uninformative examples are ignored. The majority of query selection strategies select queries based on the labelling predictions of the current classifier. This thesis suggests that information from prior iterations of active learning can help select more informative queries in the current iteration. We propose History-based query selection strategies, which incorporate predictions from prior iterations of active learning into the selection of the current query. These strategies have been shown to increase the accuracy of classifiers produced using active learning, thereby reducing labelling effort. In addition, History-based query selection strategies are very efficient, since information is reused from previous iterations of active learning. Another contributing factor to the cost of active learning is computational expense. Query selection strategies can require considerable computation to identify the most informative examples. We investigate pre-filtering optimisation for the computationally inefficient error reduction sampling (ERS) query selection strategy. Pre-filtering restricts the number of unlabelled examples considered to a small subset of the pool, constructed using a query selection strategy. Optimising ERS using pre-filtering was found to simultaneously reduce computational overhead and labelling effort.
dc.format	1 volume
dc.language.iso	en
dc.publisher	Trinity College (Dublin, Ireland). School of Computer Science & Statistics
dc.relation.isversionof	http://stella.catalogue.tcd.ie/iii/encore/record/C__Rb13908290
dc.subject	Computer Science, Ph.D.
dc.subject	Ph.D. Trinity College Dublin
dc.title	Active learning query selection with historical information
dc.type	thesis
dc.type.supercollection	refereed_publications
dc.type.supercollection	thesis_dissertations
dc.type.qualificationlevel	Doctoral
dc.type.qualificationname	Doctor of Philosophy (Ph.D.)
dc.rights.ecaccessrights	openAccess
dc.format.extentpagination	pp 193
dc.description.note	TARA (Trinity's Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie
dc.identifier.uri	http://hdl.handle.net/2262/77610
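
Illustrative sketch (not taken from the thesis): the abstract above describes two ideas, reusing predictions from earlier active learning iterations when choosing the next query, and pre-filtering the pool with a cheap strategy before running an expensive one such as ERS. The Python below is one possible reading of those ideas for a binary classifier. Averaging uncertainty over recent iterations is an assumption made here for illustration; the abstract does not say how the history is combined. The names HistoryUncertaintySampler, history_depth, prefilter and k are all hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Hashable, List


def uncertainty(p_positive: float) -> float:
    """Binary uncertainty: highest at p = 0.5, lowest at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_positive - 1.0)


class HistoryUncertaintySampler:
    """Ranks unlabelled examples by uncertainty averaged over recent iterations.

    Assumption: the mean of past uncertainty scores stands in for "historical
    information"; the thesis may combine past predictions differently.
    """

    def __init__(self, history_depth: int = 3) -> None:
        # history_depth (hypothetical): number of past iterations remembered.
        self.history: Dict[Hashable, Deque[float]] = defaultdict(
            lambda: deque(maxlen=history_depth)
        )

    def record(self, predictions: Dict[Hashable, float]) -> None:
        """Store this iteration's uncertainty for each example's P(positive)."""
        for example_id, p_positive in predictions.items():
            self.history[example_id].append(uncertainty(p_positive))

    def select(self, pool: List[Hashable]) -> Hashable:
        """Return the pool example with the highest mean recorded uncertainty."""
        def mean_uncertainty(example_id: Hashable) -> float:
            scores = self.history[example_id]
            return sum(scores) / len(scores) if scores else 0.0

        return max(pool, key=mean_uncertainty)


def prefilter(
    pool: List[Hashable], predictions: Dict[Hashable, float], k: int
) -> List[Hashable]:
    """Keep only the k most uncertain examples, most uncertain first.

    An expensive strategy (ERS in the abstract) would then score this
    subset instead of the whole pool.
    """
    ranked = sorted(pool, key=lambda ex: uncertainty(predictions[ex]), reverse=True)
    return ranked[:k]
```

For example, after sampler.record({"a": 0.5, "b": 0.9}) followed by sampler.record({"a": 0.9, "b": 0.6}), the current classifier alone is more uncertain about "b", but sampler.select(["a", "b"]) picks "a" because its averaged uncertainty over both iterations is higher. Recording scores costs no extra classifier calls, which matches the abstract's point that reusing earlier predictions is cheap.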

Files in this item

Name: Davy, Michael_TCD-SCSS-PHD-200 ...
Size: 1.921Mb
Format: PDF

Name: license.txt
Size: 3.419Kb
Format: Text file

This item appears in the following Collection(s)

Computer Science (PhD Theses)
Computer Science (Theses and Dissertations)
Trinity College Dublin Theses & Dissertations
