Consistent Mode-Finding for Parametric and Non-Parametric Clustering

Tobin, Joshua

dc.contributor.advisor	Zhang, Mimi	en
dc.contributor.author	Tobin, Joshua	en
dc.date.accessioned	2022-11-25T16:09:14Z
dc.date.available	2022-11-25T16:09:14Z
dc.date.issued	2022	en
dc.date.submitted	2022	en
dc.identifier.citation	Tobin, Joshua, Consistent Mode-Finding for Parametric and Non-Parametric Clustering, Trinity College Dublin, School of Computer Science & Statistics, Statistics, 2022	en
dc.identifier.other	Y	en
dc.description	APPROVED	en
dc.description.abstract	Density peaks clustering detects modes as points with high density and large distance to points of higher density. To cluster the observed samples, each point is assigned to the same cluster as its nearest neighbor of higher density. This efficient and intuitive approach has grown in popularity in applications in recent years. Despite its widespread use, little work has examined the theoretical properties of the density peaks method or its strengths and limitations as a clustering procedure. Here, we provide a detailed analysis of the density peaks clustering algorithm. We demonstrate that it recovers consistent estimates of the modes of the underlying density and correctly clusters the data with high probability. However, deficiencies of the density peaks clustering methodology are also highlighted. Noise in the density estimates can lead to errors when estimating modes and to incoherent cluster assignments. Two adaptations of the density peaks clustering approach are proposed to remedy these issues. The first method seeks to detect modal sets rather than point modes in the data. This reduces the sensitivity of the clusterings to fluctuations in the density estimate. The second approach partitions the data into regions mutually separated by areas of low density before applying the density peaks clustering algorithm. Doing so ensures that the resulting cluster assignment matches the conceptual understanding of a correct clustering. Both approaches are analyzed theoretically and their superior performance is demonstrated on simulated and real-world datasets. Moreover, they are shown to be suitable for modern clustering applications in computer vision. Model-based clustering methods, where clusters are taken to be unimodal components in a finite mixture model, are then considered.
Motivated by the consistent estimates of the modes provided by the density peaks clustering algorithm, a novel model-based clustering method is proposed. This approach uses a set of high-density points as initial mean parameters and iteratively prunes them to return a sequence of nested clusterings. The method outperforms popular model-based clustering methods. To conclude, the contributions of the thesis are used to motivate suggestions for future research.	en
dc.publisher	Trinity College Dublin. School of Computer Science & Statistics. Discipline of Statistics	en
dc.rights	Y	en
dc.subject	Density-Based Clustering	en
dc.subject	Face Recognition	en
dc.subject	Multi-Image Matching	en
dc.subject	Model-Based Clustering	en
dc.subject	Density Peaks Clustering	en
dc.subject	Clustering	en
dc.title	Consistent Mode-Finding for Parametric and Non-Parametric Clustering	en
dc.type	Thesis	en
dc.type.supercollection	thesis_dissertations	en
dc.type.supercollection	refereed_publications	en
dc.type.qualificationlevel	Doctoral	en
dc.identifier.peoplefinderurl	https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:TOBINJO	en
dc.identifier.rssinternalid	248368	en
dc.rights.ecaccessrights	openAccess
dc.contributor.sponsor	Government of Ireland	en
dc.identifier.uri	http://hdl.handle.net/2262/101725
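
The assignment rule summarized in the abstract (modes are points with high density and large distance to any point of higher density; every other point joins the cluster of its nearest neighbor of higher density) can be illustrated with a small sketch. This is not the thesis's implementation: the k-nearest-neighbor density proxy, the density-times-distance score used to pick modes, and the names `density_peaks`, `k`, and `n_modes` are all assumptions made for illustration.

```python
import numpy as np


def density_peaks(X, k=5, n_modes=2):
    """Toy density peaks clustering (illustrative sketch, not the thesis code).

    X       : (n, d) array of points, with n > k
    k       : neighbor rank used for the density proxy (assumed choice)
    n_modes : number of modes to keep (assumed to be supplied by the user)
    """
    n = len(X)
    # Pairwise Euclidean distances.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Density proxy: inverse distance to the k-th nearest neighbor.
    # Column 0 of the sorted row is the point itself (distance 0).
    kth = np.sort(dist, axis=1)[:, k]
    density = 1.0 / (kth + 1e-12)

    # Visit points from highest to lowest density; ties are broken by position.
    order = np.argsort(-density, kind="stable")

    delta = np.empty(n)               # distance to nearest point of higher density
    parent = np.full(n, -1)           # index of that nearest higher-density point
    delta[order[0]] = np.inf          # the densest point has no higher-density neighbor
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]         # all points ranked denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Modes: points that are both dense and far from any denser point.
    score = density * delta
    modes = np.argsort(-score)[:n_modes]

    # Each remaining point inherits the label of its parent. Parents are
    # always visited first because they come earlier in `order`.
    labels = np.full(n, -1)
    labels[modes] = np.arange(len(modes))
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return modes, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(5.0, 0.3, (30, 2))])
    modes, labels = density_peaks(X, k=5, n_modes=2)
    print("mode indices:", modes)
    print("cluster sizes:", np.bincount(labels))
```

Because every label is inherited along a chain of higher-density neighbors, a noisy density estimate can redirect a chain across a low-density gap, which is the kind of incoherent assignment the abstract says the two proposed adaptations are designed to avoid.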

Files in this item

Name: Joshua Tobin PhD Thesis.pdf
Size: 15.49Mb
Format: PDF

Name: license.txt
Size: 3.530Kb
Format: Text file

This item appears in the following Collection(s)

Statistics (Theses and Dissertations)
Trinity College Dublin Theses & Dissertations

Related items

Forward-Stagewise Clustering: An Algorithm for Convex Clustering

Creative Clusters : Economic Analysis of the Current Status and Future Clustering Potential for the Crafts Industry in Ireland

Clusters in Ireland : the Irish dairy processing industry: an application of Porter's cluster analysis
