dc.contributor.advisor | Zhang, Mimi | en |
dc.contributor.author | Tobin, Joshua | en |
dc.date.accessioned | 2022-11-25T16:09:14Z | |
dc.date.available | 2022-11-25T16:09:14Z | |
dc.date.issued | 2022 | en |
dc.date.submitted | 2022 | en |
dc.identifier.citation | Tobin, Joshua, Consistent Mode-Finding for Parametric and Non-Parametric Clustering, Trinity College Dublin, School of Computer Science & Statistics, Statistics, 2022 | en |
dc.identifier.other | Y | en |
dc.description | APPROVED | en |
dc.description.abstract | Density peaks clustering detects modes as points with high density and large distance to points of higher density. To cluster the observed samples, points are assigned to the same cluster as their nearest neighbor of higher density. This efficient and intuitive approach has, in recent years, grown in popularity in applications. Despite its widespread use, little work has examined the theoretical properties of the density peaks method, or its strengths and limitations when clustering. Here, we provide a detailed analysis of the density peaks clustering algorithm. We demonstrate that it recovers consistent estimates of the modes of the underlying density and correctly clusters the data with high probability. However, deficiencies of the density peaks clustering methodology are also highlighted. Noise in the density estimates can lead to errors when estimating modes and to incoherent cluster assignments. Two adaptations of the density peaks clustering approach are proposed to remedy these issues. The first method seeks to detect modal sets rather than point modes in the data. This reduces the sensitivity of the clusterings to fluctuations in the density estimate. The second approach partitions the data into regions mutually separated by areas of low density, before applying the density peaks clustering algorithm. Doing so ensures that the result of the cluster assignment method meets the conceptual understanding of a correct clustering. Both approaches are analyzed theoretically and their superior performance is demonstrated on simulated and real-world datasets. Moreover, they are shown to be suitable for modern clustering applications in computer vision. Model-based clustering methods, where clusters are taken to be unimodal components in a finite mixture model, are then considered. Motivated by the consistent estimates of the modes provided by the density peaks clustering algorithm, a novel model-based clustering method is proposed. This approach uses a set of high density points as initial mean parameters, and iteratively prunes them to return a sequence of nested clusterings. The method outperforms popular model-based clustering methods. To conclude, the contributions of the thesis are used to motivate suggestions for future research. | en |
dc.publisher | Trinity College Dublin. School of Computer Science & Statistics. Discipline of Statistics | en |
dc.rights | Y | en |
dc.subject | Density-Based Clustering | en |
dc.subject | Face Recognition | en |
dc.subject | Multi-Image Matching | en |
dc.subject | Model-Based Clustering | en |
dc.subject | Density Peaks Clustering | en |
dc.subject | Clustering | en |
dc.title | Consistent Mode-Finding for Parametric and Non-Parametric Clustering | en |
dc.type | Thesis | en |
dc.type.supercollection | thesis_dissertations | en |
dc.type.supercollection | refereed_publications | en |
dc.type.qualificationlevel | Doctoral | en |
dc.identifier.peoplefinderurl | https://tcdlocalportal.tcd.ie/pls/EnterApex/f?p=800:71:0::::P71_USERNAME:TOBINJO | en |
dc.identifier.rssinternalid | 248368 | en |
dc.rights.ecaccessrights | openAccess | |
dc.contributor.sponsor | Government of Ireland | en |
dc.identifier.uri | http://hdl.handle.net/2262/101725 | |