
dc.contributor.author: Greene, Derek
dc.contributor.author: Cunningham, Padraig
dc.date.accessioned: 2006-10-26T16:04:55Z
dc.date.available: 2006-10-26T16:04:55Z
dc.date.issued: 2006-08-18
dc.identifier.citation: Efficient Ensemble Methods for Document Clustering, Derek Greene and Padraig Cunningham, 18 August 2006, TCD-CS-2006-48 [en]
dc.description.abstract: Recent ensemble clustering techniques have been shown to be effective in improving the accuracy and stability of standard clustering algorithms. However, an inherent drawback of these techniques is the computational cost of generating and combining multiple clusterings of the data. In this paper, we present an efficient kernel-based ensemble clustering method suitable for application to large, high-dimensional datasets such as text corpora. To decrease the time required to generate the ensemble members, we employ a prototype reduction scheme that makes use of a density-biased selection strategy to construct a smaller kernel matrix that represents a good proxy for the original data. Evaluations performed on text data demonstrate that this process leads to a significant decrease in running time, while maintaining high clustering accuracy. [en]
dc.format.extent: 198433 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en [en]
dc.publisher: Department of Computer Science, Trinity College Dublin [en]
dc.relation.ispartofseries: Computer Science Department, Technical Report [en]
dc.relation.ispartofseries: TCD-CS-2006-48 [en]
dc.subject: Document clustering [en]
dc.title: Efficient Ensemble Methods for Document Clustering [en]
dc.type: Technical Report [en]
dc.identifier.rssuri: https://www.cs.tcd.ie/publications/tech-reports/reports.06/TCD-CS-2006-48.pdf
dc.identifier.uri: http://hdl.handle.net/2262/2418



