Tools for analysing the voice : developments in glottal source and quality analysis

Kane, John

dc.contributor.advisor	Gobl, Christer
dc.contributor.author	Kane, John
dc.date.accessioned	2016-12-01T09:56:59Z
dc.date.available	2016-12-01T09:56:59Z
dc.date.issued	2012
dc.identifier.citation	John Kane, 'Tools for analysing the voice : developments in glottal source and quality analysis', [thesis], Trinity College (Dublin, Ireland). Centre for Language and Communication Studies, 2012, pp 257
dc.identifier.other	THESIS 10336
dc.description.abstract	This thesis documents a range of research carried out on the topic of glottal source and voice quality analysis. Initially, a review is given of the physiological and acoustic correlates of different vocal settings. This is followed by a discussion of the importance of glottal source and voice quality variation in spoken communication, and the impact of modelling these aspects on speech technology. Despite the potential benefit of acoustic characterisation of the glottal source for speech technology, existing algorithms often suffer from a lack of robustness. To address this, the present thesis describes and evaluates a set of novel algorithms aimed at improving robustness. The algorithms come under two headings: fine-grained, glottal synchronous methods and coarse-grained, voice quality detection methods. In terms of fine-grained methods, a new algorithm, SE-VQ, has been developed which is optimised for analysis of a range of voice qualities. While maintaining the precision of the state-of-the-art on neutral speech, the new algorithm is shown to significantly improve performance on creaky voice regions. SE-VQ is then utilised as part of a novel LF model based parameterisation method (DyProg-LF) of estimated glottal source signals. The dynamic programming algorithm used in DyProg-LF is shown to avoid the common problem of inconsistencies in parameter trajectories and is shown to provide better parameterisation than the state-of-the-art on both a carefully controlled dataset with manually obtained reference values and a larger speech dataset. For coarse-grained methods, a new parameter, the Maxima Dispersion Quotient (MDQ), is proposed for discriminating breathy to tense voice. MDQ was shown to outperform existing parameters for discriminating the voice qualities, particularly for continuous speech, and also in terms of robustness to additive noise.
A new method for detecting creaky voice is also described which utilises two parameters derived from the Linear Prediction-residual signal. These parameters are used as input features to a decision tree classifier which is shown to significantly outperform the state-of-the-art on a range of speech data varying in terms of speaker, gender, language, recording condition and speaking style. Finally, a software package, the Voice analysis toolkit, which contains the algorithms developed as part of this thesis, has been made publicly available. This has been done to encourage usage of the newly developed algorithms in applied work and future algorithm evaluations.
dc.format	1 volume
dc.language.iso	en
dc.publisher	Trinity College (Dublin, Ireland). Centre for Language and Communication Studies
dc.relation.isversionof	http://stella.catalogue.tcd.ie/iii/encore/record/C__Rb15661945
dc.subject	Linguistic, Speech and Communication Sciences, Ph.D.
dc.subject	Ph.D. Trinity College Dublin
dc.title	Tools for analysing the voice : developments in glottal source and quality analysis
dc.type	thesis
dc.type.supercollection	thesis_dissertations
dc.type.supercollection	refereed_publications
dc.type.qualificationlevel	Doctoral
dc.type.qualificationname	Doctor of Philosophy (Ph.D.)
dc.rights.ecaccessrights	openAccess
dc.format.extentpagination	pp 257
dc.description.note	TARA (Trinity’s Access to Research Archive) has a robust takedown policy. Please contact us if you have any concerns: rssadmin@tcd.ie
dc.identifier.uri	http://hdl.handle.net/2262/78032

Files in this item

Name:: Kane TCD THESIS 10336 Tools for.pdf
Size:: 144.7Mb
Format:: PDF

Name:: license.txt
Size:: 3.419Kb
Format:: Text file

This item appears in the following Collection(s)

Centre for Language and Communication Studies (Theses and Dissertations)
CLCS (Theses and Dissertations)
Trinity College Dublin Theses & Dissertations
